不完全数据库恢复

到目前为止,前面讨论的都是完全恢复数据库,这是recover database\recover tablespace\recover datafile默认的行为特征。

所谓完全恢复指在介质恢复阶段(执行recover)必须应用所有的应该应用的重做日志:它们包括从数据文件头中的检查点RBA所指向的重做记录开始,

一致到在线的当前的CURRENT redo log结束,不论在归档文件还是redo中,recover将尽可能寻找这些记录,最后返回

media recovery complete

前面章节讨论没有丢失任何一个归档跟redo日志,所以都是完全恢复总是成功的。

不完全恢复在介质恢复阶段不需要应用所有的重做记录,而只需要一部分:介质恢复的起点必须是数据文件头中的检查点RBA所指向的重做记录,终点可以

是之后的任意一条重做记录,这些重做记录可以在归档日志或是在线日志中。不完全恢复的结果是将数据库恢复到“过去”的一个时间点,而不像完全恢复那样将db恢复到当前时间点。

完全恢复不会丢失任何一个已经提交的事务,而不完全恢复可能丢失已提交的事务。

12.1 工作原理

不完全恢复的结果是一样的,2个特点:

--1 不完全恢复极可能造成已经提交的事务丢失,即将db回退到过去的一个时间点,导致某些变更就像根本没有发生过一样。因此,不完全恢复也被称为指定时间目标(point in time)恢复,PIT恢复

--2 不完全恢复的影响范围一定是数据库范围的,即只有recover database命令支持不完全恢复。

由于极可能造成事务丢失,并且影响整个数据库,绝非必要,否则不应该使用不完全恢复解决问题。

主动与被动不完全恢复

--被动恢复:原计划完全恢复,但是在恢复过程中发现recover需要的一个归档日志或处于active或current的redo log丢失或损坏,结果被迫以不完全恢复,否则db无法打开。

--主动恢复:由于人为错误破坏了数据库内的大量信息(比如update,truncate重要的表),为了找回正确的信息使数据库回到破坏操作之前采用的不完全恢复。

PIT的4种表达形式

--1 精确到重做日志的基于日志序列号recover database until sequence,仅适用RMAN

比如,RECOVER DATABASE UNTIL SEQUENCE 100视图将db恢复至99号日志的最后一条重做记录,不包括100号日志。

--2 精确到重做日志的基于交互式手动喊停的recover database until cancel,仅适用sqlplus

--3 精确到重做记录的基于时间的recover database until time,适用于RMAN和sqlplus。

比如,recover database until time ‘sysdate-1’ 试图将db恢复至大于24小时之前

--4 精确到重做记录的基于SCN的recover database until scn和recover database until restore point,适用于RMAN;recover database until change 和recover database until restore point适用于sqlplus。

比如recover database until scn 10000试图将db恢复至PIT点SCN10000

为了使不完全恢复能够成功,在recover执行之前的restore命令有一个非常重要的注意事项,那就是必须还原足够“旧”的备份中的数据文件。因为recover命令使用的是重做记录,重做记录只能将数据向“前”推,增大其头部的检查点SCN。

假设不完全恢复PIT点的SCN是70000,但是restore命令还原的数据文件头部中的SCN是80000,这会导致recover database until scn 70000抛出RMAN-06556异常,意思是数据文件不够“旧”,

即restore命令还原的数据文件头SCN不够小,其值应该小于或等于70000,recover命令才能成功。

12.2 被动恢复

场景1数据库有备份,运行时数据文件yhqt01.dbf丢失,并且某些在备份后产生的未备份的归档日志丢失,应用报错ORA-01116无法打开10号文件

丢失的是普通数据文件,所以



RMAN> run {
sql'alter database datafile 10 offline';
restore datafile 10;
recover datafile 10;
sql'alter database datafile 10 online';}
Starting recover at 25-JUL-19
using channel ORA_DISK_1
starting media recovery
archived log for thread 1 with sequence 84 is already on disk as file /u01/app/oracle/fast_recovery_area/ORCL/archivelog/2019_07_25/o1_mf_1_84_gml7nj0y_.arc
archived log for thread 1 with sequence 85 is already on disk as file /u01/app/oracle/fast_recovery_area/ORCL/archivelog/2019_07_25/o1_mf_1_85_gml7nk2o_.arc
archived log for thread 1 with sequence 88 is already on disk as file /u01/app/oracle/fast_recovery_area/ORCL/archivelog/2019_07_25/o1_mf_1_88_gml7o84w_.arc
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of recover command at 07/25/2019 11:26:00
RMAN-06053: unable to perform media recovery because of missing log
RMAN-06025: no backup of archived log for thread 1 with sequence 87 and starting SCN of 1013521 found to restore
RMAN-06025: no backup of archived log for thread 1 with sequence 86 and starting SCN of 1013510 found to restore



--由于手工删除了86.87号归档日志

查看当前在线日志的情况



SQL> select group#,sequence#,status from v$log;
    GROUP#  SEQUENCE# STATUS
---------- ---------- ----------------
     1       88 INACTIVE
     2       89 CURRENT
     3       87 INACTIVE



查看被还原的10号文件yhqt01.dbf头部的SCN号,该数据文件头部的SCN即使recover命令恢复的起点



SQL> select file#,change# from v$recover_file;
     FILE#    CHANGE#
---------- ----------
    10    1006178



使用SCN号是1006178去咨询一下v$log_history视图,得到recover datafile 10的起点在哪个日志



SQL> select sequence# from v$log_history where 1006178 between first_change# and next_change#-1;
 SEQUENCE#
----------
    82



结果是82号日志

根据上面3个查询,刚才使用的recover datafile 10,需要从82号日志恢复到89号日志,其中88,89为在线日志,RMAN报告86,87号归档日志找不到,既然不能成功地从82号一直恢复到89号,recover命令异常。

接下来去归档路径下找一找,发现确实找不到,这里人为的rm掉了



[oracle@DSI 2019_07_25]$ ll
total 5148
-rw-r----- 1 oracle oinstall 5262336 Jul 25 11:14 o1_mf_1_84_gml7nj0y_.arc
-rw-r----- 1 oracle oinstall    2048 Jul 25 11:14 o1_mf_1_85_gml7nk2o_.arc
-rw-r----- 1 oracle oinstall    2560 Jul 25 11:14 o1_mf_1_88_gml7o84w_.arc



再使用RMAN查看归档备份的情况



RMAN> list backup of archivelog all;



那么oracle认为86 87号归档日志在哪里呢



RMAN> list archivelog low sequence 86 high sequence 87;
List of Archived Log Copies for database with db_unique_name ORCL
=====================================================================
Key     Thrd Seq     S Low Time 
------- ---- ------- - ---------
11      1    86      A 25-JUL-19
        Name: /u01/app/oracle/fast_recovery_area/ORCL/archivelog/2019_07_25/o1_mf_1_86_gml7o58q_.arc
12      1    87      A 25-JUL-19
        Name: /u01/app/oracle/fast_recovery_area/ORCL/archivelog/2019_07_25/o1_mf_1_87_gml7o64p_.arc



这种情况下,似乎只能将yhqt01.dbf文件恢复到85号日志的最后一条重做记录,



RMAN> recover datafile 10 until sequence 86;



命令执行成功。但是不要误认为oracle可以支持单独的一个数据文件进行不完全恢复

上线10号文件报错



RMAN> sql'alter database datafile 10 online';
sql statement: alter database datafile 10 online
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03009: failure of sql command on default channel at 07/25/2019 11:36:40
RMAN-11003: failure during parse/execution of SQL statement: alter database datafile 10 online
ORA-01113: file 10 needs media recovery
ORA-01110: data file 10: '/u01/app/oracle/oradata/orcl/yhqt01.dbf'



错误RMAN-11003表示yhqt01.dbf文件和其他数据文件无法同步,还需要更多恢复,需要86号日志,但是86号日志已经丢失找不到了。这个时候,

有两种办法,要么在也不上线yhqt,不访问yhqt01.dbf数据文件。要么在数据库范围上执行一次真正的不完全恢复,使所有数据文件倒退至85号文件的最后一条重做日志,保证所有数据文件之间的一致性。



RMAN> recover database until sequence 86;
Starting recover at 25-JUL-19
using channel ORA_DISK_1
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of recover command at 07/25/2019 11:40:03
RMAN-06556: datafile 1 must be restored from backup older than SCN 1013510



直接执行会报错,因为其他数据文件不够旧

关闭数据库



SQL> shutdown abort;
ORACLE instance shut down.



RMAN> run {
startup force mount;
restore database;
}2> 3> 4> 
Oracle instance started
database mounted
Total System Global Area     784998400 bytes
Fixed Size                     2257352 bytes
Variable Size                499125816 bytes
Database Buffers             276824064 bytes
Redo Buffers                   6791168 bytes
Starting restore at 25-JUL-19
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=133 device type=DISK
channel ORA_DISK_1: starting datafile backup set restore
channel ORA_DISK_1: specifying datafile(s) to restore from backup set
channel ORA_DISK_1: restoring datafile 00001 to /u01/app/oracle/oradata/orcl/system01.dbf
channel ORA_DISK_1: restoring datafile 00002 to /u01/app/oracle/oradata/orcl/sysaux01.dbf
channel ORA_DISK_1: restoring datafile 00004 to /u01/app/oracle/oradata/orcl/users01.dbf
channel ORA_DISK_1: restoring datafile 00005 to /u01/app/oracle/oradata/orcl/test01.dbf
channel ORA_DISK_1: restoring datafile 00008 to /u01/app/oracle/oradata/orcl/rc_data01.dbf
channel ORA_DISK_1: reading from backup piece /home/oracle/backup/bol_fullbak01u7hht8_1_1_20190725
channel ORA_DISK_1: piece handle=/home/oracle/backup/bol_fullbak01u7hht8_1_1_20190725 tag=BOL_FULLBAK
channel ORA_DISK_1: restored backup piece 1
channel ORA_DISK_1: restore complete, elapsed time: 00:00:07
channel ORA_DISK_1: starting datafile backup set restore
channel ORA_DISK_1: specifying datafile(s) to restore from backup set
channel ORA_DISK_1: restoring datafile 00003 to /u01/app/oracle/oradata/orcl/undotbs01.dbf
channel ORA_DISK_1: restoring datafile 00006 to /u01/app/oracle/oradata/orcl/assm01.dbf
channel ORA_DISK_1: restoring datafile 00007 to /u01/app/oracle/oradata/orcl/mssm01.dbf
channel ORA_DISK_1: restoring datafile 00009 to /u01/app/oracle/oradata/orcl/ogg01.dbf
channel ORA_DISK_1: restoring datafile 00010 to /u01/app/oracle/oradata/orcl/yhqt01.dbf
channel ORA_DISK_1: reading from backup piece /home/oracle/backup/bol_fullbak02u7hhtf_1_1_20190725
channel ORA_DISK_1: piece handle=/home/oracle/backup/bol_fullbak02u7hhtf_1_1_20190725 tag=BOL_FULLBAK
channel ORA_DISK_1: restored backup piece 1
channel ORA_DISK_1: restore complete, elapsed time: 00:00:07
Finished restore at 25-JUL-19



利用动态视图检查所还原的数据文件是否足够“旧”但又不至于“太旧”



SQL> select 'restore check ok' "restore_check" 
from 
(select max(next_change#-1) tail,min(first_change#) head 
from v$log_history
where 
sequence# between 82 and 86) lh
where 
(select max(change#) from v$recover_file) <= lh.tail  and
(select min(change#) from   2    3    4    5    6    7    8    9  v$recover_file) >= lh.head;
restore_check
----------------
restore check ok



restore database默认从最新的备份中还原,在82号日志之后是备份过数据库,并且没有在85号日志之后备份过数据库。

实际情况应该在restore中给出from关键字指定哪个备份集还原



RMAN> run {
2> set until sequence 86;
3> restore database;
4> }
executing command: SET until clause
Starting restore at 25-JUL-19
using channel ORA_DISK_1
skipping datafile 1; already restored to file /u01/app/oracle/oradata/orcl/system01.dbf
skipping datafile 2; already restored to file /u01/app/oracle/oradata/orcl/sysaux01.dbf
skipping datafile 4; already restored to file /u01/app/oracle/oradata/orcl/users01.dbf
skipping datafile 5; already restored to file /u01/app/oracle/oradata/orcl/test01.dbf
skipping datafile 8; already restored to file /u01/app/oracle/oradata/orcl/rc_data01.dbf
skipping datafile 3; already restored to file /u01/app/oracle/oradata/orcl/undotbs01.dbf
skipping datafile 6; already restored to file /u01/app/oracle/oradata/orcl/assm01.dbf
skipping datafile 7; already restored to file /u01/app/oracle/oradata/orcl/mssm01.dbf
skipping datafile 9; already restored to file /u01/app/oracle/oradata/orcl/ogg01.dbf
skipping datafile 10; already restored to file /u01/app/oracle/oradata/orcl/yhqt01.dbf
restore not done; all files read only, offline, or already restored
Finished restore at 25-JUL-19



在restore前使用set until可以有效防止还原不够“旧”的备份。这里还原的数据文件头部的SCN必须小于86号日志的第一条重做记录的SCN



RMAN> recover database until sequence 86;
Starting recover at 25-JUL-19
using channel ORA_DISK_1
starting media recovery
archived log for thread 1 with sequence 84 is already on disk as file /u01/app/oracle/fast_recovery_area/ORCL/archivelog/2019_07_25/o1_mf_1_84_gml7nj0y_.arc
archived log for thread 1 with sequence 85 is already on disk as file /u01/app/oracle/fast_recovery_area/ORCL/archivelog/2019_07_25/o1_mf_1_85_gml7nk2o_.arc
channel ORA_DISK_1: starting archived log restore to default destination
channel ORA_DISK_1: restoring archived log
archived log thread=1 sequence=82
channel ORA_DISK_1: restoring archived log
archived log thread=1 sequence=83
channel ORA_DISK_1: reading from backup piece /home/oracle/backup/arch_06u7hhtm_1_1_20190725
channel ORA_DISK_1: piece handle=/home/oracle/backup/arch_06u7hhtm_1_1_20190725 tag=TAG20190725T110004
channel ORA_DISK_1: restored backup piece 1
channel ORA_DISK_1: restore complete, elapsed time: 00:00:01
archived log file name=/u01/app/oracle/fast_recovery_area/ORCL/archivelog/2019_07_25/o1_mf_1_82_gml9gk3o_.arc thread=1 sequence=82
channel default: deleting archived log(s)
archived log file name=/u01/app/oracle/fast_recovery_area/ORCL/archivelog/2019_07_25/o1_mf_1_82_gml9gk3o_.arc RECID=16 STAMP=1014551121
archived log file name=/u01/app/oracle/fast_recovery_area/ORCL/archivelog/2019_07_25/o1_mf_1_83_gml9gk46_.arc thread=1 sequence=83
channel default: deleting archived log(s)
archived log file name=/u01/app/oracle/fast_recovery_area/ORCL/archivelog/2019_07_25/o1_mf_1_83_gml9gk46_.arc RECID=17 STAMP=1014551121
archived log file name=/u01/app/oracle/fast_recovery_area/ORCL/archivelog/2019_07_25/o1_mf_1_84_gml7nj0y_.arc thread=1 sequence=84
archived log file name=/u01/app/oracle/fast_recovery_area/ORCL/archivelog/2019_07_25/o1_mf_1_85_gml7nk2o_.arc thread=1 sequence=85
media recovery complete, elapsed time: 00:00:00
Finished recover at 25-JUL-19



以上的命令也能这样执行

RMAN> run{

set until sequence 86;

recover database;

}

至此,从86号日志第一条重做开始其后的所有重做记录描述的变更均未生效,数据文件中没有这些结果的变更。



SQL> alter database open resetlogs;
Database altered.
SQL> select sequence#,status from v$log;
 SEQUENCE# STATUS
---------- ----------------
     1 CURRENT
     0 UNUSED
     0 UNUSED



整个事故到此尚未结束,对业务来说需要将85号日志以后提交的事务重新录入数据库。

上面的步骤总体可以总结为:



RMAN> run {
startup force mount;
set until sequence 86;
restore database;
recover database;
alter database open resetlogs;
}



另外,若在RAC环境下,为了避免PIT点产生歧义,在set until时指定线程号



RMAN> run {
startup force mount;
set until sequence 86 thread 1;
restore database;
recover database;
alter database open resetlogs;
}



 

话外篇

此时查询yhqt依然报错



SQL> conn yhqt/****
Connected.
SQL> select count(*) from yhqtest_3;
select count(*) from yhqtest_3
                    *
ERROR at line 1:
ORA-00376: file 10 cannot be read at this time
ORA-01110: data file 10: '/u01/app/oracle/oradata/orcl/yhqt01.dbf'
SQL> select file#||chr(9)||status||chr(9)||name||chr(9)||bytes from v$datafile where file#=10;

FILE#||CHR(9)||STATUS||CHR(9)||NAME||CHR(9)||BYTES
--------------------------------------------------------------------------------
10    OFFLINE    /u01/app/oracle/oradata/orcl/yhqt01.dbf    52428800
SQL> alter database datafile 10 online;
alter database datafile 10 online
*
ERROR at line 1:
ORA-01190: control file or data file 10 is from before the last RESETLOGS
ORA-01110: data file 10: '/u01/app/oracle/oradata/orcl/yhqt01.dbf'
SQL> set linesize 999



查看相关的检查点



--数据文件检查点
SQL> select file#,CREATION_CHANGE#,CHECKPOINT_CHANGE# ,UNRECOVERABLE_CHANGE#,LAST_CHANGE#,OFFLINE_CHANGE#,STATUS from v$datafile order by 1; 
     FILE# CREATION_CHANGE# CHECKPOINT_CHANGE# UNRECOVERABLE_CHANGE# LAST_CHANGE# OFFLINE_CHANGE# STATUS
---------- ---------------- ------------------ --------------------- ------------ --------------- -------
     1          7           1013532               0              1013510 SYSTEM
     2           1786           1013532               0              1013510 ONLINE
     3           2807           1013532               0              1013510 ONLINE
     4          15626           1013532               0              1013510 ONLINE
     5         923171           1013532               0              1013510 ONLINE
     6         923455           1013532               0              1013510 ONLINE
     7         923728           1013532               0              1013510 ONLINE
     8         924001           1013532               0              1013510 ONLINE
     9         924272           1013532               0              1013510 ONLINE
    10         986669           1013531               0      1013532      1013510 OFFLINE
--数据文件头检查点    
SQL> select checkpoint_change# from v$datafile_header;
CHECKPOINT_CHANGE#
------------------
       1013532
       1013532
       1013532
       1013532
       1013532
       1013532
       1013532
       1013532
       1013532
       1013510
--系统检查点       
SQL> select checkpoint_change# from v$database;
CHECKPOINT_CHANGE#
------------------
       1013532



查看incrarnation



RMAN> list incarnation;
List of Database Incarnations
DB Key  Inc Key DB Name  DB ID            STATUS  Reset SCN  Reset Time
------- ------- -------- ---------------- --- ---------- ----------
1       1       ORCL     1542322764       PARENT  1          25-JUL-19
2       2       ORCL     1542322764       CURRENT 1013511    25-JUL-19



10号数据文件的头部检查点还在reset scn的前面

恢复10号文件



SQL> recover datafile 10;
Media recovery complete.
SQL> select checkpoint_change# from v$datafile_header;
CHECKPOINT_CHANGE#
------------------
       1013532
       1013532
       1013532
       1013532
       1013532
       1013532
       1013532
       1013532
       1013532
       1013532
10 rows selected.
SQL> alter database datafile 10 online;
Database altered.
SQL> select count(*) from yhqtest_3;
  COUNT(*)
----------
      5001