The controlfile header block returned by the OS has a sequence number that is too old
Original article by Liujun_Deng on the 51CTO blog. © Copyright belongs to the author; contact the author for permission before reposting.
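Problem description: A customer reported that one of their databases had recently started restarting on its own, and that it worked normally again after each restart. The trace log contained the following alert, as shown in the screenshot:
Database: Oracle 19.3
Environment: a server running in a virtual machine.
The message is clear: the controlfile header block returned by the operating system carries a sequence number that is too old, which may mean the controlfile is corrupted.
MOS Doc ID 1589355.1 records the following: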
SYMPTOMS
Database instance went down with following error message in alert log:
---
Wed Sep 11 23:26:39 2013
********************* ATTENTION: ********************
The controlfile header block returned by the OS
has a sequence number that is too old.
The controlfile might be corrupted.
PLEASE DO NOT ATTEMPT TO START UP THE INSTANCE
without following the steps below.
RE-STARTING THE INSTANCE CAN CAUSE SERIOUS DAMAGE
TO THE DATABASE, if the controlfile is truly corrupted.
In order to re-start the instance safely,
please do the following:
(1) Save all copies of the controlfile for later
analysis and contact your OS vendor and Oracle support.
(2) Mount the instance and issue:
ALTER DATABASE BACKUP CONTROLFILE TO TRACE;
(3) Unmount the instance.
(4) Use the script in the trace file to
RE-CREATE THE CONTROLFILE and open the database.
*****************************************************
USER (ospid: 24051722): terminating the instance
---
CAUSE
BUG 14281768 - CONTROL FILE GETS CORRUPTED
Which was closed as Vendor OS/Software/Framework Problem
SOLUTION
The error is typically raised when the controlfile is overwritten by an older copy of the controlfile. Most likely this happened because of a storage or I/O error.
All copies of the control file must have the same internal sequence number for Oracle to start up the database or shut it down in normal or immediate mode.
The solution is actually given in the accompanying message:
(1) Save all copies of the controlfile for later
analysis and contact your OS vendor and Oracle support.
(2) Mount the instance and issue:
ALTER DATABASE BACKUP CONTROLFILE TO TRACE;
(3) Unmount the instance.
(4) Use the script in the trace file to
RE-CREATE THE CONTROLFILE and open the database.
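In SQL*Plus, the four steps above might look roughly like the sketch below. It assumes a SYSDBA connection, and the trace script path `/tmp/recreate_controlfile.sql` is a hypothetical example. Step (1) is done at the OS level first, by copying every file listed in `control_files` to a safe location before anything else is run.

```sql
-- Step (1) happens outside SQL*Plus: copy all controlfile copies aside first.

-- (2) Mount the instance and generate a CREATE CONTROLFILE script.
STARTUP MOUNT
SELECT name FROM v$controlfile;
-- The AS clause is optional; the path here is a hypothetical example.
ALTER DATABASE BACKUP CONTROLFILE TO TRACE AS '/tmp/recreate_controlfile.sql';

-- (3) Unmount the instance.
SHUTDOWN IMMEDIATE

-- (4) Edit the generated script before running it: it contains both a
-- NORESETLOGS and a RESETLOGS variant, and only one of them should be kept.
@/tmp/recreate_controlfile.sql
```

The generated script already contains the `STARTUP NOMOUNT`, `CREATE CONTROLFILE`, and open commands, so the manual work is mainly choosing the right variant and checking the datafile list.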
To add a sanity check for the future, set the following parameter and then bounce the database:
SQL> alter system set "_controlfile_update_check"='HIGH' scope=spfile;
Please check with your OS System/Storage admin regarding the issue.
The precaution is to relocate the control file to a fast disk with direct I/O enabled; the main goal is to keep the OS from writing an old (cached) copy of the controlfile back to it.
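Before relocating anything, it may help to check where the controlfile copies currently live and how the instance is configured for file system I/O. The sketch below uses only standard views and parameters. The target paths are hypothetical, and whether `filesystemio_options` is the relevant setting depends on the platform and storage layer, which is an assumption here.

```sql
-- Current controlfile copies and the I/O mode used for file system files.
SELECT name FROM v$controlfile;
SHOW PARAMETER control_files
SHOW PARAMETER filesystemio_options

-- Relocation sketch (hypothetical paths). The new location takes effect
-- only after the files have been copied there while the instance is down.
ALTER SYSTEM SET control_files =
  '/u02/fastdisk/control01.ctl',
  '/u03/fastdisk/control02.ctl'
  SCOPE=SPFILE;
SHUTDOWN IMMEDIATE
-- Copy the existing controlfiles to the new paths at the OS level, then:
STARTUP
```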
To reverse the parameter setting, run the following and then bounce the database:
SQL> alter system set "_controlfile_update_check"='OFF' scope=spfile;
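An alternative to setting the value explicitly is to remove the entry from the spfile, so that the instance falls back to its built-in default at the next restart. This uses the general `ALTER SYSTEM RESET` syntax and does not come from the MOS note, so treat it as a sketch.

```sql
-- Remove the hidden parameter from the spfile for all instances.
ALTER SYSTEM RESET "_controlfile_update_check" SCOPE=SPFILE SID='*';

-- Check what the spfile currently holds for this parameter.
SELECT sid, name, value
FROM   v$spparameter
WHERE  name = '_controlfile_update_check';
```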
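Summary: The root cause of this error is poor I/O performance. Controlfile writes fall out of sync, the internal sequence numbers of the copies diverge, and the database then concludes that the controlfile is corrupted. Oracle suggests adjusting a hidden parameter, but the real fix is to improve I/O performance; otherwise genuine controlfile corruption cannot be ruled out. Because the storage-side problem could not be fixed for the time being, the hidden parameter _controlfile_update_check was changed here. Once the slow I/O is resolved, the parameter can be set back to its default value.
Note: _controlfile_update_check controls whether the controlfile is checked for corruption. If the check finds that a controlfile copy is abnormal, the database crashes.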
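Hidden parameters do not appear in `SHOW PARAMETER` unless they have been set explicitly, so the value in effect is usually read from the underlying `x$` tables. The query below is a commonly used sketch for this. It assumes a connection as SYS, and the column aliases are chosen here for readability.

```sql
-- Value of the hidden parameter in the running instance (run as SYS).
SELECT i.ksppinm  AS name,
       v.ksppstvl AS value,
       v.ksppstdf AS is_default,
       i.ksppdesc AS description
FROM   x$ksppi  i
JOIN   x$ksppcv v ON v.indx = i.indx
WHERE  i.ksppinm = '_controlfile_update_check';
```

Running it before and after the restart confirms whether the change made with `scope=spfile` has actually taken effect.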