DG目前正常运行在最大可用模式

注:Failover 是失败切换。 这种情况下切换对redo 的处理,就显的很重要。如果处理好,就不会有数据丢失。 否则就会有数据丢失。

在Oracle 11g里,Data Guard 切换多了一个新的功能:flush redo。---我这里未使用。

只要主库能启动到mount 状态,那么Flush 就可以把没有发送的归档和current online redo 发送到备库。

Flush语法:SQL> ALTER SYSTEM FLUSH REDO TO target_db_name;


 这里的target_db_name 是我们在主库的db_unique_name 名称。 也就是在tnsnames.ora 文件配置的。 Flush 会将未发送的redo 从主库传到备库,并且等待redo 在standby 库上apply 之后返回成功。 所以只要Flush成功,那么Failover 就没有数据丢失。

 如果说我们的Primary 已经不能启动到mount 状态,那么就只能按照下面的方法来手动复制相应归档及REDO文件。



环境:

DG1-主库


[oracle@dg1 ~]$ sqlplus / as sysdba


SYS@dg1>select open_mode from v$database;


OPEN_MODE


--------------------


READ WRITE


SYS@dg1>select protection_mode,database_role from v$database;


PROTECTION_MODE      DATABASE_ROLE


-------------------- ----------------


MAXIMUM AVAILABILITY PRIMARY


DG2-备库


SYS@dg2>select protection_mode,database_role from v$database;


PROTECTION_MODE      DATABASE_ROLE


-------------------- ----------------


MAXIMUM AVAILABILITY PHYSICAL STANDBY


SYS@dg2>select open_mode from v$database;


OPEN_MODE


--------------------


MOUNTED



1.在主库切换日志,备库可以正常应用归档。

---模拟failover的话这一步可以不做的,只是证明下DG工作正常。


DG1


SYS@dg1>alter system switch logfile;


System altered.


SYS@dg1>col name for a40


SYS@dg1>select * from (select name,recid from v$archived_log  order by recid desc) where rownum<5;


NAME                                          RECID


---------------------------------------- ----------


/u01/archivelog/arc_1_159_821829622.arc         310


dg2                                             309


                                                308


dg2                                             307


SYS@dg1>select max(sequence#) from v$archived_log;


MAX(SEQUENCE#)


--------------


           159


SYS@dg2>select max(sequence#) from v$archived_log;


MAX(SEQUENCE#)


--------------


           159


SYS@dg2>col name for a40


SYS@dg2>select * from (select name,recid from v$archived_log  order by recid desc) where rownum<5;


NAME                                          RECID


---------------------------------------- ----------


/u01/archivelog/arc_1_159_821829622.arc         166


                                                165


                                                164


                                                163


日志:


[oracle@dg1 ~]$ cat alert_dg.log


Tue Aug 06 09:12:50 2013


LGWR: Standby redo logfile selected to archive thread 1 sequence 160


LGWR: Standby redo logfile selected for thread 1 sequence 160 for destination LOG_ARCHIVE_DEST_2


Thread 1 advanced to log sequence 160 (LGWR switch)


  Current log# 1 seq# 160 mem# 0: /u01/oradata/dg/redo01.log


Tue Aug 06 09:12:51 2013


Archived Log entry 310 added for thread 1 sequence 159 ID 0x6776473b dest 1:


[oracle@dg2 ~]$ cat alert_dg.log


Tue Aug 06 09:12:50 2013


Standby controlfile consistent with primary


Tue Aug 06 09:12:50 2013


Media Recovery Waiting for thread 1 sequence 160 (in transit)


RFS[1]: Selected log 5 for thread 1 sequence 160 dbid 1735160627 branch 821829622


Tue Aug 06 09:12:51 2013


Archived Log entry 166 added for thread 1 sequence 159 ID 0x6776473b dest 1:


Recovery of Online Redo Log: Thread 1 Group 5 Seq 160 Reading mem 0


  Mem# 0: /u01/oradata/dg/standbyredo05.log



2.断开备库DG2的网络,同时在DG1上切换日志来触发归档,然后shutdown abort。

DG1:


SYS@dg1>alter system switch logfile;


System altered.


SYS@dg1>alter system switch logfile;


System altered.


SYS@dg1>select * from (select name,recid from v$archived_log  order by recid desc) where rownum<5;


NAME                                          RECID


---------------------------------------- ----------


/u01/archivelog/arc_1_161_821829622.arc         312


/u01/archivelog/arc_1_160_821829622.arc         311


/u01/archivelog/arc_1_159_821829622.arc         310


dg2                                             309


在表里插入一条数据,测试REDO能否恢复。


SYS@dg1>conn bys/bys


Connected.


BYS@dg1>select * from test;


         A


----------


       999


BYS@dg1>insert into test values(888);


1 row created.


BYS@dg1>commit;


Commit complete.


BYS@dg1>select * from test;


         A


----------


       999


       888


BYS@dg1>conn / as sysdba


Connected.


SYS@dg1>shutdown abort;


ORACLE instance shut down.


日志:

[oracle@dg1 ~]$ cat alert_dg.log


Tue Aug 06 09:12:51 2013


Archived Log entry 310 added for thread 1 sequence 159 ID 0x6776473b dest 1:


Tue Aug 06 09:19:40 2013


ORA-16198: LGWR received timedout error from KSR


LGWR: Attempting destination LOG_ARCHIVE_DEST_2 network reconnect (16198)


LGWR: Destination LOG_ARCHIVE_DEST_2 network reconnect abandoned----从这一句也能判断出是DG2网络有问题


Errors in file /u01/diag/rdbms/dg1/dg/trace/dg_lgwr_6182.trc:


ORA-16198: Timeout incurred on internal channel during remote archival


Error 16198 for archive log file 1 to 'dg2'


Destination LOG_ARCHIVE_DEST_2 is UNSYNCHRONIZED


LGWR: Failed to archive log 1 thread 1 sequence 160 (16198)


Thread 1 advanced to log sequence 161 (LGWR switch)


  Current log# 2 seq# 161 mem# 0: /u01/oradata/dg/redo02.log


Tue Aug 06 09:19:45 2013


Archived Log entry 311 added for thread 1 sequence 160 ID 0x6776473b dest 1:


Tue Aug 06 09:22:27 2013------------------------从这次日志可以判断,161号归档文件已经归档完成。当前使用的是redo03.log,对应的归档日志162号。

Thread 1 advanced to log sequence 162 (LGWR switch)

  Current log# 3 seq# 162 mem# 0: /u01/oradata/dg/redo03.log

Tue Aug 06 09:22:27 2013

Archived Log entry 312 added for thread 1 sequence 161 ID 0x6776473b dest 1:


Tue Aug 06 09:23:43 2013


Shutting down instance (abort)


License high water mark = 11


USER (ospid: 7015): terminating the instance


Instance terminated by USER, pid = 7015


Tue Aug 06 09:23:44 2013

Instance shutdown complete



3.打开备库dg2的网络并查询相关状态

[oracle@dg2 ~]$ sqlplus / as sysdba


SYS@dg2>select max(sequence#) from v$archived_log;


MAX(SEQUENCE#)


--------------


           

159


SYS@dg2>col name for a40


SYS@dg2>select * from (select name,recid from v$archived_log  order by recid desc) where rownum<5;


NAME                                          RECID


---------------------------------------- ----------


/u01/archivelog/arc_1_159_821829622.arc         166


/u01/archivelog/arc_1_158_821829622.arc         165


/u01/archivelog/arc_1_157_821829622.arc         164


/u01/archivelog/arc_1_156_821829622.arc         163



4.从主库将未传到备库的归档及REDO发送过去

[oracle@dg1 ~]$ cd /u01/archivelog/


[oracle@dg1 archivelog]$ ll -t


total 3472


-rw-r----- 1 oracle oinstall   91648 Aug  6 09:22 arc_1_161_821829622.arc


-rw-r----- 1 oracle oinstall 2673152 Aug  6 09:19 arc_1_160_821829622.arc


-rw-r----- 1 oracle oinstall  772608 Aug  6 09:12 arc_1_159_821829622.arc


[oracle@dg1 archivelog]$ cd /u01/oradata/dg


[oracle@dg1 dg]$ ls


control01.ctl  standbyredo04.log  sysaux01.dbf


example01.dbf  standbyredo05.log  system01.dbf


redo01.log     standbyredo06.log  temp01.dbf


redo02.log     standbyredo07.log  undotbs01.dbf


redo03.log     standctl01.ctl     users01.dbf


[oracle@dg1 dg]$

scp redo0* dg2:/u01/archivelog/


oracle@dg2's password:


redo01.log            100%   50MB  10.0MB/s   00:05   


redo02.log            100%   50MB   7.1MB/s   00:07   


redo03.log            100%   50MB   6.3MB/s   00:08   


[oracle@dg1 dg]$ cd /u01/archivelog/


[oracle@dg1 archivelog]$

scp arc_1_160_821829622.arc arc_1_161_821829622.arc  dg2:/u01/archivelog


oracle@dg2's password:


arc_1_160_821829622.a 100% 2611KB   2.6MB/s   00:00   


arc_1_161_821829622.a 100%   90KB  89.5KB/s   00:00  



5.备库上注册复制来的归档日志--解决GAP问题。


注意注意注意:

此时备库打开应用方式如果是alter database recover managed standby database disconnect from session;的话,会自动应用归档日志。

如果是alter database recover managed standby database using current logfile disconnect from session;不会自动应用归档日志。



SYS@dg2>alter database register logfile '/u01/archivelog/arc_1_160_821829622.arc';


Database altered.


SYS@dg2>alter database recover managed standby database cancel;


Database altered.


SYS@dg2>alter database recover managed standby database disconnect from session;


Database altered.


SYS@dg2>alter database register logfile '/u01/archivelog/arc_1_161_821829622.arc';


Database altered.


SYS@dg2>alter database recover managed standby database cancel;


Database altered.


SYS@dg2>select max(sequence#) from v$archived_log;


MAX(SEQUENCE#)


--------------


           

161               -------------------------已经 恢复到162号。


对应的日志:


Tue Aug 06 09:19:43 2013


RFS[1]: Possible network disconnect with primary database


Tue Aug 06 09:31:22 2013--------------------  这里就是成功注册了归档日志,但是没有自动应用。

alter database register logfile '/u01/archivelog/arc_1_160_821829622.arc'

There are 1 logfiles specified.

ALTER DATABASE REGISTER [PHYSICAL] LOGFILE

Completed: alter database register logfile '/u01/archivelog/arc_1_160_821829622.arc'


Tue Aug 06 09:32:19 2013


alter database recover managed standby database cancel


Tue Aug 06 09:32:19 2013


MRP0: Background Media Recovery cancelled with status 16037


Errors in file /u01/diag/rdbms/dg2/dg/trace/dg_mrp0_6370.trc:


ORA-16037: user requested cancel of managed recovery operation

Managed Standby Recovery not using Real Time Apply-----------------------------------

没有自动应用。


Recovery interrupted!


Recovered data files to a consistent state at change 1639340


Errors in file /u01/diag/rdbms/dg2/dg/trace/dg_mrp0_6370.trc:


ORA-16037: user requested cancel of managed recovery operation


MRP0: Background Media Recovery process shutdown (dg)


Waiting for MRP0 pid 6370 to terminate


Managed Standby Recovery Canceled (dg)


Completed: alter database recover managed standby database cancel


T

ue Aug 06 09:32:41 2013

alter database recover managed standby database disconnect from session   -----使用这种方式开始恢复


Attempt to start background Managed Standby Recovery process (dg)


Tue Aug 06 09:32:41 2013


MRP0 started with pid=26, OS id=7093


MRP0: Background Managed Standby Recovery process started (dg)


Serial Media Recovery started


Managed Standby Recovery not using Real Time Apply


Waiting for all non-current ORLs to be archived...


All non-current ORLs have been archived.


Media Recovery Log /u01/archivelog/arc_1_160_821829622.arc


Media Recovery Waiting for thread 1 sequence 161


Completed: alter database recover managed standby database disconnect from session


Tue Aug 06 09:33:14 2013-                                                                                                         -----------可以看到新注册的归档日志自动进行了恢复应用。

alter database register logfile '/u01/archivelog/arc_1_161_821829622.arc'

There are 1 logfiles specified.

ALTER DATABASE REGISTER [PHYSICAL] LOGFILE

Resynchronizing thread 1 from sequence 160 to 161

Completed: alter database register logfile '/u01/archivelog/arc_1_161_821829622.arc'

Tue Aug 06 09:33:18 2013

Media Recovery Log /u01/archivelog/arc_1_161_821829622.arc

Media Recovery Waiting for thread 1 sequence 162



6.使用主库的当前REDO恢复,failover并打开数据库。验证数据是否完整。

SYS@dg2>recover standby database until cancel;


ORA-00279: change 1639530 generated at 08/06/2013 09:22:27 needed for thread 1


ORA-00289: suggestion : /u01/archivelog/arc_1_162_821829622.arc


ORA-00280: change 1639530 for thread 1 is in sequence #162



Specify log: {<RET>=suggested | filename | AUTO | CANCEL}


/u01/archivelog/redo03.log                                                                                       --------------输入日志名(从主库传送来的主库当前REDO),前面DG1 的日志中有提到。


Log applied.


Media recovery complete.


SYS@dg2>alter database

activate standby database;


Database altered.


SYS@dg2>alter database open;


Database altered.


SYS@dg2>select protection_mode,database_role,open_mode from v$database;


PROTECTION_MODE      DATABASE_ROLE    OPEN_MODE


-------------------- ---------------- --------------------


MAXIMUM AVAILABILITY PRIMARY          READ WRITE


查看DG1崩溃前的DML操作,数据无丢失。


SYS@dg2>conn bys/bys


Connected.


BYS@dg2>select * from test;


         A


----------


       999


       888


7.相关日志:


Tue Aug 06 09:36:28 2013

ALTER DATABASE RECOVER  standby database until cancel


Media Recovery Start


Serial Media Recovery started


Managed Standby Recovery not using Real Time Apply


ORA-279 signalled during: ALTER DATABASE RECOVER  standby database until cancel  ...


Tue Aug 06 09:36:55 2013


ALTER DATABASE RECOVER CANCEL


Media Recovery Canceled


Completed: ALTER DATABASE RECOVER CANCEL


Tue Aug 06 09:37:31 2013


ALTER DATABASE RECOVER  standby database until cancel  


Media Recovery Start


Serial Media Recovery started


Managed Standby Recovery not using Real Time Apply


ORA-279 signalled during: ALTER DATABASE RECOVER  standby database until cancel  ...

Tue Aug 06 09:37:51 2013

ALTER DATABASE RECOVER    LOGFILE '/u01/archivelog/redo03.log'  


Media Recovery Log /u01/archivelog/redo03.log


Incomplete recovery applied all redo ever generated.


Recovery completed through change 1639636 time 08/06/2013 09:23:42


Media Recovery Complete (dg)


Completed: ALTER DATABASE RECOVER    LOGFILE '/u01/archivelog/redo03.log'



Tue Aug 06 09:43:40 2013

alter database activate standby database


ALTER DATABASE ACTIVATE [PHYSICAL] STANDBY DATABASE (dg)


krsv_proc_kill: Killing 2 processes (all RFS)


Begin: Standby Redo Logfile archival


End: Standby Redo Logfile archival


RESETLOGS after complete recovery through change 1639636


Resetting resetlogs activation ID 1735804731 (0x6776473b)


Online log /u01/oradata/dg/redo01.log: Thread 1 Group 1 was previously cleared


Online log /u01/oradata/dg/redo02.log: Thread 1 Group 2 was previously cleared


Online log /u01/oradata/dg/redo03.log: Thread 1 Group 3 was previously cleared


Standby became primary SCN: 1639634


Tue Aug 06 09:43:40 2013


Setting recovery target incarnation to 4


ACTIVATE STANDBY: Complete - Database mounted as primary


Completed: alter database activate standby database