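A recent power outage in the server room caused the database host to shut down abnormally. After the host was brought back up, starting the database failed with: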

ORA-00600: internal error code, arguments: [kcratr_nab_less_than_odr], [1], [8120], [21090], [21205], [], [], [], [], [], [], []


The alert log shows the following errors:

...omitted...

ALTER DATABASE OPEN  #the error is raised during the open phase

Beginning crash recovery of 1 threads  #crash recovery is in progress

 parallel recovery started with 2 processes

Started redo scan

Completed redo scan

 read 2095 KB redo, 98 data blocks need recovery

Errors in file /opt/oracle/diag/rdbms/stgorcl/stgorcl/trace/stgorcl_ora_22775.trc  (incident=89339):

ORA-00600: internal error code, arguments: [kcratr_nab_less_than_odr], [1], [8120], [21090], [21205], [], [], [], [], [], [], []

Incident details in: /opt/oracle/diag/rdbms/stgorcl/stgorcl/incident/incdir_89339/stgorcl_ora_22775_i89339.trc

...omitted...


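Given the abnormal host shutdown, the database files were probably damaged when the host went down. Judging from the error message, the controlfile is the prime suspect.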
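Before changing anything, a few read-only queries can give a first impression of the state of the control file and datafiles. This is only a sketch and not part of the original troubleshooting session: it assumes the instance is still mounted (the alert log shows the failure happening at ALTER DATABASE OPEN), and a mismatch between the two checkpoint columns is a hint to investigate further, not proof of corruption.

```sql
-- Assumption: the instance is in MOUNT state; these queries only read metadata.

-- Control file copies known to the instance
select name, status from v$controlfile;

-- Checkpoint SCN recorded in the control file for each datafile
select file#, checkpoint_change# from v$datafile order by file#;

-- Checkpoint SCN recorded in each datafile header, plus the fuzzy flag
select file#, status, fuzzy, checkpoint_change#
  from v$datafile_header
 order by file#;
```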


Searching MOS for the keyword kcratr_nab_less_than_odr turns up a note that matches this case: "Alter database open fails with ORA-00600 kcratr_nab_less_than_odr" (Doc ID 1296264.1). Comparing the note with the "Call Stack Trace" in /opt/oracle/diag/rdbms/stgorcl/stgorcl/incident/incdir_89339/stgorcl_ora_22775_i89339.trc largely confirms that this incident is the one the note describes.


----- Call Stack Trace -----

ksedst1 <- ksedst <- dbkedDefDump <- ksedmp <- dbgexPhaseII <- dbgexProcessError <- dbgePostErrorKGE  <- kgeasnmierr <- kcratr_odr_check  <- kcratr <- kctrec <- kcvcrv <- kcfopd <- adbdrv <- opiexe <- opiosq0 <- kpoal8 <- opiodr <- ttcpip <- opitsk <- opiino <- opiodr <- opidrv <- sou2o <- opimai_real <- ssthrdmain <- main <- start


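According to the note, the failure is caused by logical corruption of the control file following the power outage. The note gives two solutions:

Option a: Do cancel based recovery, and apply 'current online redolog' manually #i.e. recover using backup controlfile

Option b: Recreate the controlfile using the Controlfile recreation script  #i.e. rebuild the control file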
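For reference, Option b starts from a controlfile recreation script. A minimal sketch of how such a script is commonly generated is shown here; it assumes the database is mounted, the output path is a hypothetical example, and the exact procedure in the MOS note is not reproduced.

```sql
-- Assumption: the database is in MOUNT state.
-- '/tmp/recreate_controlfile.sql' is a hypothetical path chosen for illustration.
alter database backup controlfile to trace as '/tmp/recreate_controlfile.sql';
```

The generated text file contains CREATE CONTROLFILE statements that should be reviewed and edited before they are run.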


Option a is simpler, so that is the one used here:


First, locate the CURRENT redo log; the recovery needs this file.


sys@STGORCL> select a.member, a.group#, b.status from v$logfile a ,v$log b where a.group#=b.group#;


MEMBER                                       GROUP# STATUS

---------------------------------------- ---------- ------------------------------------------------

/opt/oracle/oradata/stgorcl/redo03.log            3 INACTIVE

/opt/oracle/oradata/stgorcl/redo02.log            2 CURRENT

/opt/oracle/oradata/stgorcl/redo01.log            1 INACTIVE
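Both the query above and the recovery that follows need the instance to be at least mounted. The original session does not show this check, so the sketch here is an assumption about the surrounding steps rather than a record of what was run.

```sql
-- STATUS should read MOUNTED before the recovery is started
select instance_name, status from v$instance;

-- If the instance is down, start it in mount mode first (SQL*Plus command):
-- startup mount
```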



SQL> recover database using backup controlfile until cancel ;

ORA-00279: change 140911009 generated at 04/24/2016 16:12:46 needed for thread

1

ORA-00289: suggestion :

/opt/oracle/product/11.2.0/db_1/dbs/arch1_8120_857011473.dbf  #this redo was never actually archived; it corresponds to /opt/oracle/oradata/stgorcl/redo02.log

ORA-00280: change 140911009 for thread 1 is in sequence #8120



Specify log: {<RET>=suggested | filename | AUTO | CANCEL}

/opt/oracle/oradata/stgorcl/redo02.log

Log applied.

Media recovery complete.



Finally, open the database with resetlogs:


SQL> alter database open resetlogs;


Database altered.

 


Note: after resetlogs, take a fresh backup and rebuild the Data Guard (DG) standby.
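A quick way to confirm the result of the resetlogs open is to look at the database and incarnation views. This is a sketch added for illustration and not output from the original session.

```sql
-- Confirm the database is open and see when the logs were reset
select name, open_mode, resetlogs_change#, resetlogs_time from v$database;

-- The newest incarnation should be marked CURRENT
select incarnation#, resetlogs_change#, resetlogs_time, status
  from v$database_incarnation
 order by incarnation#;

-- Online redo log sequence numbers restart after resetlogs
select group#, sequence#, status from v$log;
```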