官方参考:https://docs.oracle.com/cd/E11882_01/server.112/e40771/toc.htm
http://www.oracle.com/technetwork/cn/tutorials/smiley-fsfo-088047-zhs.html
环境概述:搭好Dataguard,搭建参考:https://blog.51cto.com/snowhill/1923591
源:db_name:db rac
备:db_name:sbdb1 单机

1 启用dgmgrl
相关参数:
dg_broker_start
dg_broker_config_file1

配置:
alter system set dg_broker_start=true scope=both;
此时数据库会增加dmon进程

[oracle@12crac2 ~]$ ps -ef|grep -i _dmon|grep -v grep
oracle     2699      1  0 06:57 ?        00:00:01 ora_dmon_db1

对应log:$ORACLE_BASE/diag/rdbms/xxx/xxxx/drc$ORACLE_SID.log
但此时使用是用不了的,如下所示:

DGMGRL> show configuration
ORA-16532: Data Guard broker configuration does not exist

需手动配置添加:

DGMGRL> help create 
DGMGRL> create configuration 'db' as primary database is db connect identifier is db;
DGMGRL>help add
DGMGRL>add database "SBDB1" as connect identifier is "sbdb1"  maintained as physical;
DGMGRL>enable configuration

注:add database 'SBDB1' ,这里的dg是指database的dbuniquename,而as connect identifier is sbdb1这里的sbdb1是指tnsname.ora连接到standby database的net service name.注意区分大小,默认为小写;
同时检查alert_$ORACLE_SID.log日志可以看到:

RSM0 started with pid=48, OS id=34551 
ALTER SYSTEM SET log_archive_dest_1='location=USE_DB_RECOVERY_FILE_DEST','valid_for=(ALL_LOGFILES, ALL_ROLES)' SCOPE=BOTH;
ALTER SYSTEM SET log_archive_trace=0 SCOPE=BOTH SID='db1';
ALTER SYSTEM SET log_archive_format='%t_%s_%r.dbf' SCOPE=SPFILE SID='db1';
ALTER SYSTEM SET standby_file_management='AUTO' SCOPE=BOTH SID='*';
ALTER SYSTEM SET archive_lag_target=0 SCOPE=BOTH SID='*';
ALTER SYSTEM SET log_archive_max_processes=4 SCOPE=BOTH SID='*';
ALTER SYSTEM SET log_archive_min_succeed_dest=1 SCOPE=BOTH SID='*';

检查drc$ORACLE_SID.log:

Creating Data Guard Broker Monitor Process (DMON)
04/03/2018 23:26:33
>> Starting Data Guard Broker bootstrap <<
Broker Configuration File Locations:
      dg_broker_config_file1 = "/u01/app/oracle/product/11.2.0/dbhome_1/dbs/dr1SBDB1.dat"
      dg_broker_config_file2 = "/u01/app/oracle/product/11.2.0/dbhome_1/dbs/dr2SBDB1.dat"
04/03/2018 23:26:38
Broker Configuration:       "db"
      Protection Mode:            Maximum Performance
      Fast-Start Failover (FSFO): Disabled, flags=0x0, version=0
      Primary Database:           db (0x01010000)
      Standby Database:           SBDB1, Enabled Physical Standby (0x02010000)

所有配置完成,这玩意只能用在企业版
二 检查配置:

DGMGRL> show database sbdb1
Object "sbdb1" was not found
DGMGRL> show database SBDB1  
Object "sbdb1" was not found

注意大小写

DGMGRL> show database 'SBDB1' 
Database - SBDB1 
  Role:            PHYSICAL STANDBY
  Intended State:  APPLY-ON
  Transport Lag:   0 seconds (computed 0 seconds ago)
  Apply Lag:       0 seconds (computed 0 seconds ago)
  Apply Rate:      0 Byte/s
  Real Time Query: ON
  Instance(s):
    SBDB1 
Database Status:
SUCCESS

三 相关测试:snapshot standby,switchover
3.1不开database flashback on 测试
不用dgmrl的方法参见:https://blog.51cto.com/snowhill/2047857

DGMGRL> convert database 'SBDB1' to snapshot standby;
Converting database "SBDB1" to a Snapshot Standby database, please wait...
Database "SBDB1" converted successfully

这里从alert_sbdb1.log里看,数据库并没有发生重启,只是将会话杀掉了;

SQL> select open_mode, database_role, protection_mode,flashback_on from v$database;
 OPEN_MODE            DATABASE_ROLE    PROTECTION_MODE      FLASHBACK_ON
-------------------- ---------------- -------------------- ------------------
READ WRITE           SNAPSHOT STANDBY MAXIMUM PERFORMANCE  **RESTORE POINT ONLY**
SQL> truncate table system.test; 
Table truncated. 
SQL> select count(*) from system.test; 
  COUNT(*)
----------
         0               
DGMGRL>convet database 'SBDB1' to physical standby;
Unable to connect to database
ORA-12545: Connect failed because target host or object does not exist 
Failed.
Warning: You are no longer connected to ORACLE. 
Please complete the following steps and reissue the CONVERT command:
        shut down instance "SBDB1" of database "SBDB1"
        start up and mount instance "SBDB1" of database "SBDB1"

手动启动备库到mount状态,后面的dgmgrl自动恢复到sbdb1,但在open过程中经常会死在最后一步,相关日志如下:

ALTER DATABASE CONVERT TO PHYSICAL STANDBY (SBDB1)
Killing 3 processes with pids 3013,3017,3019 (all RFS) in order to disallow current and future RFS connections. Requested by OS process 3025
Flashback Restore Start
Flashback Restore Complete
Drop guaranteed restore point 
Guaranteed restore point  dropped
Clearing standby activation ID 1736364983 (0x677ed3b7)
The primary database controlfile was created using the
'MAXLOGFILES 192' clause.
There is space for up to 188 standby redo logfiles
Use the following SQL commands on the standby database to create
standby redo logfiles that match the primary database:
ALTER DATABASE ADD STANDBY LOGFILE 'srl1.f' SIZE 52428800;
ALTER DATABASE ADD STANDBY LOGFILE 'srl2.f' SIZE 52428800;
ALTER DATABASE ADD STANDBY LOGFILE 'srl3.f' SIZE 52428800;
ALTER DATABASE ADD STANDBY LOGFILE 'srl4.f' SIZE 52428800;
ALTER DATABASE ADD STANDBY LOGFILE 'srl5.f' SIZE 52428800;
Waiting for all non-current ORLs to be archived...
All non-current ORLs have been archived.
Clearing online redo logfile 1 /u01/app/oracle/oradata/SBDB1/SBDB1/onlinelog/o1_mf_1_fd4wmho5_.log
Clearing online log 1 of thread 1 sequence number 3
Clearing online redo logfile 1 complete
Clearing online redo logfile 2 /u01/app/oracle/oradata/SBDB1/SBDB1/onlinelog/o1_mf_2_fd4wmjq3_.log
Clearing online log 2 of thread 1 sequence number 4
Clearing online redo logfile 2 complete
Completed: alter database convert to physical standby
Wed Apr 04 01:15:50 2018
Primary database is in MAXIMUM PERFORMANCE mode
RFS[3]: Assigned to RFS process 3029
RFS[3]: Selected log 5 for thread 1 sequence 70 dbid 1729483220 branch 965667412
Wed Apr 04 01:16:11 2018
RFS[4]: Assigned to RFS process 3031
RFS[4]: Selected log 6 for thread 1 sequence 69 dbid 1729483220 branch 965667412
Wed Apr 04 01:16:11 2018
Expanded controlfile section 11 from 28 to 203 records
Requested to grow by 175 records; added 7 blocks of records
Archived Log entry 29 added for thread 1 sequence 69 ID 0x6715d4d4 dest 1:
Wed Apr 04 01:16:13 2018
ARC2: Becoming the active heartbeat ARCH
Wed Apr 04 01:17:13 2018
alter database open
Data Guard Broker initializing...

最后一步会假死一下,没关系,直接shutdown abort,再启动,就可以了;由于dgmgrl的存在,他会自动补上命令,如下图所示:
oracle 11g dataguard之dgmgrl

SQL> select open_mode, database_role, protection_mode,flashback_on from v$database; 
OPEN_MODE            DATABASE_ROLE    PROTECTION_MODE      FLASHBACK_ON
-------------------- ---------------- -------------------- ------------------
READ ONLY WITH APPLY PHYSICAL STANDBY MAXIMUM PERFORMANCE  NO
SQL> select count(*) from system.test; 
  COUNT(*)
----------
     15386

为啥会自动,检查数据库配置:

DGMGRL> show database verbose 'SBDB1'; 
Database - SBDB1 
  Role:            PHYSICAL STANDBY
  Intended State:  APPLY-ON
  Transport Lag:   0 seconds (computed 1 second ago)
  Apply Lag:       0 seconds (computed 1 second ago)
  Apply Rate:      0 Byte/s
  Real Time Query: ON

3.2 启用Fast start failover

DGMGRL> enable fast_start failover
Error: ORA-16651: requirements not met for enabling fast-start failover

Failed.
DGMGRL> exit
[oracle@12crac2 ~]$ oerr ora 16651
16651, 0000, "requirements not met for enabling fast-start failover"
// *Cause:  The attempt to enable fast-start failover could not be completed
//          because one or more requirements were not met:
//          - The Data Guard configuration must be in either MaxAvailability
//            or MaxPerformance protection mode.
//          - The LogXptMode property for both the primary database and
//            the fast-start failover target standby database must be
//            set to SYNC if the configuration protection mode is set to
//            MaxAvailability mode.
//          - The LogXptMode property for both the primary database and
//            the fast-start failover target standby database must be
//            set to ASYNC if the configuration protection mode is set to
//            MaxPerformance mode.
//          - The primary database and the fast-start failover target standby
//            database must both have flashback enabled.
//          - No valid target standby database was specified in the primary
//            database FastStartFailoverTarget property prior to the attempt
//            to enable fast-start failover, and more than one standby
//            database exists in the Data Guard configuration.

简单点来说就是:
1确保broker配置为运行在Max Availability模式。
2在primary和standby机器上都启用flashback database,这个在reinstate failed的数据库的时候要用。
3启动observer

启用database flashback 再次enable:

DGMGRL> enable fast_start failover
Enabled.
DGMGRL> show configuration 
Configuration - db 
  Protection Mode: MaxAvailability
  Databases:
    db    - Primary database
      Warning: ORA-16819: fast-start failover observer not started 
    SBDB1 - (*) Physical standby database
      Warning: ORA-16819: fast-start failover observer not started 
Fast-Start Failover: ENABLED 
Configuration Status:
WARNING
[oracle@12crac2 ~]$ oerr ora 16819
16819, 0000, "fast-start failover observer not started"
// *Cause:  The observer for fast-start failover was not started. As a
//          result, fast-start failover could not happen in the case of
//          a primary database failure.
// *Action: Start the fast-start failover observer by using, for example, the
//          DGMGRL START OBSERVER command.

启动observer:

DGMGRL> start observer
Observer started

再次查看配置:

DGMGRL> show configuration 
Configuration - db 
  Protection Mode: MaxAvailability
  Databases:
    db    - Primary database
    SBDB1 - (*) Physical standby database 
Fast-Start Failover: ENABLED 
Configuration Status:
SUCCESS

3.3 switchover
手工switchover 步骤:https://blog.51cto.com/snowhill/1951592
现在更简化下,用dgmgrl只需一个命令:SWITCHOVER TO <standby database name>;

DGMGRL> switchover to 'SBDB1';
Performing switchover NOW, please wait...
Operation requires a connection to instance "SBDB1" on database "SBDB1"
Connecting to instance "SBDB1"...
Connected.
New primary database "SBDB1" is opening...
Operation requires startup of instance "db1" on database "db"
Starting instance "db1"...
Unable to connect to database
ORA-12514: TNS:listener does not currently know of service requested in connect descriptor

Failed.
Warning: You are no longer connected to ORACLE.

Please complete the following steps to finish switchover:
        start up instance "db1" of database "db"

这个failed不用担心,手动启动一下原主库就可以完成切换了;
切换完了,检查下:

DGMGRL> show configuration 
Configuration - db 
  Protection Mode: MaxAvailability
  Databases:
    SBDB1 - Primary database
    db    - (*) Physical standby database 
Fast-Start Failover: ENABLED 
Configuration Status:
SUCCESS

再切回来:

DGMGRL> switchover to db
Performing switchover NOW, please wait...
Operation requires a connection to instance "db1" on database "db"
Connecting to instance "db1"...
Connected.
New primary database "db" is opening...
Operation requires startup of instance "SBDB1" on database "SBDB1"
Starting instance "SBDB1"...

如还有啥疑问看官方参考:https://docs.oracle.com/cd/E11882_01/server.112/e40771/sofo.htm#DGBKR385

四 dgmgrl 调整参数:
4.1调整应用延迟
edit database 'SBDB1' set property DelayMins=1 ;
其实就是如下命令:

ALTER SYSTEM SET log_archive_dest_2='service="sbdb1"','LGWR ASYNC NOAFFIRM delay=1 optional compression=disable max_failure=0 max_connections=1 reopen=300 db_unique_name="SBDB1" net_timeout=30','valid_for=(all_logfiles,primary_role)' SCOPE=BOTH;
ALTER SYSTEM SWITCH ALL LOGFILE start (db1)
ALTER SYSTEM SWITCH ALL LOGFILE complete (db1)

4.2调整异步同步模式
edit database 'SBDB1' set property LogXptMode=sync;
对应的命令如下:
ALTER SYSTEM SET log_archive_dest_2='service="sbdb1"','LGWR SYNC AFFIRM delay=1 optional compression=disable max_failure=0 max_connections=1 reopen=300 db_unique_name="SBDB1" net_timeout=30','valid_for=(all_logfiles,primary_role)' SCOPE=BOTH;
当然所有能用broker配置的参数,不要用sqlplus来干;
五 相关报错处理

broker错误一般为大小写不一致,数据库配置和broker配置不一致引起,启用broker后,dataguard相关的配置都最好用broker管理;
5.1 ora-16541

 DGMGRL> show configuration
 ORA-16541: database is not enabled 

此错误出现一个节点,一般是由于配置不一致引起

[oracle@12crac1 dbs]$ oerr ora 16541
16541, 00000, "database is not enabled"
// *Cause:  The database specified in the request was not enabled.
// *Action: Select an enabled database and reissue the request.

再另一个节点上查看,一般报的错不一样

DGMGRL> show configuration
Configuration - db 
  Protection Mode: MaxAvailability
  Databases:
    db    - Primary database
      Warning: ORA-16792: configurable property value is inconsistent with database setting 
    SBDB1 - Physical standby database
      Warning: ORA-16792: configurable property value is inconsistent with database setting 
Fast-Start Failover: DISABLED

此错误为参数与dgmgrl 里的不一致引起,检查启用dg broker之后的配置,发现更改改了log_file_name_convert,db_file_name_convert引起,在另一个节点做如下配置就可以了

DGMGRL> edit database db set property LogFileNameConvert='/u01/app/oracle/oradata/SBDB1/,+data';
Property "logfilenameconvert" updated
DGMGRL> edit database db set property DbFileNameConvert='/u01/app/oracle/oradata/SBDB1/,+data';
Property "dbfilenameconvert" updated
DGMGRL> edit database 'SBDB1' set property DbFileNameConvert='+data,/u01/app/oracle/oradata/SBDB1/';
Property "dbfilenameconvert" updated
DGMGRL> edit database 'SBDB1' set property LogFileNameConvert='+data,/u01/app/oracle/oradata/SBDB1/';
Property "logfilenameconvert" updated
DGMGRL> enable configuration
DGMGRL> show configuration 
Configuration - db 
  Protection Mode: MaxAvailability
  Databases:
    db    - Primary database
    SBDB1 - Physical standby database 
Fast-Start Failover: DISABLED 
Configuration Status:
SUCCESS

5.2 ora-16820

DGMGRL> show configuration
Configuration - db
Protection Mode: MaxAvailability
Databases:
SBDB1 - Primary database
Error: ORA-16820: fast-start failover observer is no longer observing this database
db - (*) Physical standby database
Error: ORA-16820: fast-start failover observer is no longer observing this database
Fast-Start Failover: ENABLED
Configuration Status:
ERROR

此错误比较扯淡,stop observer,start observer就可以了;

5.3 ora-16661

启用fast failover后,非正常关闭两个节点,再次启动数据库,一般会出现两个节点都是primary 的情况,这时启动dgmgrl会报:

DGMGRL> show configuration
Configuration - db
Protection Mode: MaxAvailability
Databases:
SBDB1 - Primary database
Error: ORA-16820: fast-start failover observer is no longer observing this database
db - (*) Physical standby database (disabled)
ORA-16661: the standby database needs to be reinstated
Fast-Start Failover: ENABLED

Configuration Status:
ERROR
而另外一节点报错如下:
SQL> alter database open;
alter database open
*
ERROR at line 1:
ORA-16649: possible failover to another database prevents this database from
being opened
检查数据库状态也不怎么对:
SQL> select database_role,FLASHBACK_ON from v$database;

DATABASE_ROLE FLASHBACK_ON


PRIMARY YES

DGMGRL> show configuration
ORA-16795: the standby database needs to be re-created
看到此错误也别真去recreated standby,在确认你的配置上没有问题后,直接在另外一个节点:
DGMGRL> reinstate database db
Reinstating database "db", please wait...
Reinstatement of database "db" succeeded
这时后启动的节点会变成physical standby,再次stop observer,start observer就可以了;

5.4 Error: ORA-16525: the Data Guard broker is not yet available
主库或者备的库的db_broker_start参数设置不对;
5.5 ORA-16797: database is not using a server parameter file
没用spfile启动数据库

5.6 ora-16610 主库或者备库的dg_broker_config_file_1/2参数设置不对

 DGMGRL> show configuration

Configuration - 1

  Protection Mode: MaxPerformance
  Databases:
    prod2   - Primary database
    standby - Physical standby database

Fast-Start Failover: DISABLED

Configuration Status:
ORA-16610: command "Broker operation" in progress
DGM-17017: unable to determine configuration status

查看drcstandby.log

drcx: cannot open configuration file "/home/oracle/broker/standby2.data"
  ORA-27037: unable to obtain file status
  Linux-x86_64 Error: 2: No such file or directory
  Additional information: 3
08/02/2018 10:52:38
drcx: cannot open configuration file "/home/oracle/broker/standby1.dat"
  ORA-27037: unable to obtain file status
  Linux-x86_64 Error: 2: No such file or directory
  Additional information: 3
DMON Registering service standby_DGB with listener(s)
08/02/2018 10:52:41

解决办法:

dgmgrl>disable configuration;
sql>alter system set dg_broker_start=false;
sys@standby>alter system set dg_broker_config_file1='/home/oracle/broker/standby1.dat';  
sys@standby>alter system set dg_broker_config_file2='/home/oracle/broker/standby2.dat'; 
DGMGRL> enable configuration
DGMGRL> show configuration 
Configuration - 1 
  Protection Mode: MaxPerformance
  Databases:
    prod2   - Primary database
    standby - Physical standby database 
Fast-Start Failover: DISABLED 
Configuration Status:
SUCCESS