问题概述一

9月13日接到用户反馈,查询dba_segments视图报错。

ORA-01578  坏块_ORA-01578

问题原因

1.查看alert日志,发现有ORA-00600和ORA-01578报错。

ORA-01578  坏块_ORA-01578_02

2.分析ORA-00600

根据ora-00600报错信息的函数kqld,查找资料这是内核查询库缓存回调相关的信息,如下截图:

ORA-01578  坏块_ORA-01578_03


再查看相关trace信息,发现错误信息指向的是library cache,需要重启数据库,刷新内存。

ORA-01578  坏块_ORA-01578_04


得到用户授权后,重启数据库,不再报ORA-00600错误

ORA-01578  坏块_ORA-00600_05

3.分析ORA-01578

查询坏块对象,是一个索引

ORA-01578  坏块_ORA-00600_06


得到用户授权后,重建索引

ORA-01578  坏块_ORA-01578_07


查询文件sysaux01.dbf是否还有坏块,没有坏块,至此所有故障恢复。

ORA-01578  坏块_ORA-01578_08

解决方案

1.ORA-00600 [kqldstcpy:in], [0x2B7DB8930], [35]报错:重启数据库刷新内存。
2.ORA-01758坏块报错:重建索引。


问题描述二

一天接到客户通知数据库表数据异常,导致业务影响。

登上服务器查看了alert日志,发现如下报错:

Tue Oct 16 15:30:15 2018
Hex dump of (file 7, block 559965) in trace file /oracle/app/diag/rdbms/smp11g/smp11g2/trace/smp11g2_ora_22687.trc
Corrupt block relative dba: 0x00088b5d (file 7, block 559965)
Fractured block found during buffer read
Data in bad block:
 type: 6 format: 2 rdba: 0x00088b5d
 last change scn: 0x0002.ef5a9dc0 seq: 0x1 flg: 0x06
 spare1: 0x0 spare2: 0x0 spare3: 0x0
 consistency value in tail: 0x50750601
 check value in block header: 0xbe0b
 computed block checksum: 0xccb4
Reread of rdba: 0x00088b5d (file 7, block 559965) found same corrupted data
Tue Oct 16 15:30:15 2018
Corrupt Block Found
         TSN = 7, TSNAME = OCSTS
         RFN = 1024, BLK = 559965, RDBA = 559965
         OBJN = 213402, OBJD = 213402, OBJECT = IDX_TB_LXSLOG_1, SUBOBJECT = 
         SEGMENT OWNER = BMP_SYS, SEGMENT TYPE = Index Segment
Errors in file /oracle/app/diag/rdbms/smp11g/smp11g2/trace/smp11g2_ora_22687.trc  (incident=7322702):
ORA-01578: ORACLE data block corrupted (file # 7, block # 559965)
ORA-01110: data file 7: '+DG_BMP_DATA/smp11g/datafile/ocsts.256.668614343'
Incident details in: /oracle/app/diag/rdbms/smp11g/smp11g2/incident/incdir_7322702/smp11g2_ora_22687_i7322702.trc

问题分析

由以上日志可以判断为坏块,SEGMENT OWNER = LXS_SYS, SEGMENT TYPE = Index Segment

1、确认坏块对象

SQL> Select tablespace_name,segment_type,owner,segment_name From dba_extents Where file_id=&fid and &blk between block_id and block_id+blocks-1;
Enter value for fid: 7
Enter value for blk: 559965
old   1: Select tablespace_name,segment_type,owner,segment_name From dba_extents Where file_id=&fid and &blk between block_id and block_id+blocks-1
new   1: Select tablespace_name,segment_type,owner,segment_name From dba_extents Where file_id=7 and 559965 between block_id and block_id+blocks-1

TABLESPACE_NAME                SEGMENT_TYPE       OWNER                          SEGMENT_NAME
------------------------------ ------------------ ------------------------------ ---------------------------------------------------------------------------------
OCSTS                          INDEX              BMP_SYS                        IDX_TB_LXSLOG_1

2、重建此索引

SQL> alter index BMP_SYS.IDX_TB_LXSLOG_1 rebuild online;
alter index BMP_SYS.IDX_TB_LXSLOG_1 rebuild online
*
ERROR at line 1:
ORA-01578: ORACLE data block corrupted (file # 7, block # 14724)
ORA-01110: data file 7: '+DG_BMP_DATA/smp11g/datafile/ocsts.256.668614343'

重建过程中提示7号数据文件,14724块也存在坏块,查看此时的alert.log有如下报错:

Wed Oct 17 00:07:02 2018
Hex dump of (file 7, block 14724) in trace file /oracle/app/diag/rdbms/smp11g/smp11g1/trace/smp11g1_ora_20808.trc
Corrupt block relative dba: 0x00003984 (file 7, block 14724)
Fractured block found during buffer read
Data in bad block:
 type: 6 format: 2 rdba: 0x00003984
 last change scn: 0x0002.dbeccdf0 seq: 0x1 flg: 0x04
 spare1: 0x0 spare2: 0x0 spare3: 0x0
 consistency value in tail: 0xf4550601
 check value in block header: 0x6d85
 computed block checksum: 0x38b5
Reread of rdba: 0x00003984 (file 7, block 14724) found same corrupted data
Wed Oct 17 00:07:02 2018
Corrupt Block Found
         TSN = 7, TSNAME = OCSTS
         RFN = 1024, BLK = 14724, RDBA = 14724
         OBJN = 213400, OBJD = 213400, OBJECT = TB_LXSLOG, SUBOBJECT = 
         SEGMENT OWNER = BMP_SYS, SEGMENT TYPE = Table Segment
Errors in file /oracle/app/diag/rdbms/smp11g/smp11g1/trace/smp11g1_ora_20808.trc  (incident=815693):
ORA-01578: ORACLE data block corrupted (file # 7, block # 14724)
ORA-01110: data file 7: '+DG_BMP_DATA/smp11g/datafile/ocsts.256.668614343'
Incident details in: /oracle/app/diag/rdbms/smp11g/smp11g1/incident/incdir_815693/smp11g1_ora_20808_i815693.trc

3、查看对象文件号

SQL> Select tablespace_name,segment_type,owner,segment_name From dba_extents Where file_id=&fid and &blk between block_id and block_id+blocks-1;
Enter value for fid: 7
Enter value for blk: 14724
old   1: Select tablespace_name,segment_type,owner,segment_name From dba_extents Where file_id=&fid and &blk between block_id and block_id+blocks-1
new   1: Select tablespace_name,segment_type,owner,segment_name From dba_extents Where file_id=7 and 14724 between block_id and block_id+blocks-1

TABLESPACE_NAME  SEGMENT_TYPE       OWNER             SEGMENT_NAME
---------------- ------------------ -------------- -------------------
OCSTS            TABLE              BMP_SYS             TB_LXSLOG


SQL> select header_file,header_block from dba_segments where segment_name='TB_LXSLOG' and owner='BMP_SYS';

HEADER_FILE HEADER_BLOCK
----------- ------------
          7         6339

4、rman检查坏块

RMAN> backup check logical validate datafile 7;

Starting backup at 10/17/2018 10:37:52
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=760 instance=smp11g1 device type=DISK
channel ORA_DISK_1: starting full datafile backup set
channel ORA_DISK_1: specifying datafile(s) in backup set
input datafile file number=00007 name=+DG_BMP_DATA/smp11g/datafile/ocsts.256.668614343
channel ORA_DISK_1: backup set complete, elapsed time: 00:10:05
List of Datafiles
=================
File Status Marked Corrupt Empty Blocks Blocks Examined High SCN
---- ------ -------------- ------------ --------------- ----------
7    FAILED 0              10716575     11520000        12635664718
  File Name: +DG_BMP_DATA/smp11g/datafile/ocsts.256.668614343
  Block Type Blocks Failing Blocks Processed
  ---------- -------------- ----------------
  Data       6              518712          
  Index      4              273477          
  Other      0              11236           

validate found one or more corrupt blocks
See trace file /oracle/app/diag/rdbms/smp11g/smp11g1/trace/smp11g1_ora_4259.trc for details
Finished backup at 10/17/2018 10:47:58

5、查看坏块

SQL> select * from v$database_block_corruption;

     FILE#     BLOCK#     BLOCKS CORRUPTION_CHANGE# CORRUPTIO
---------- ---------- ---------- ------------------ ---------
         7     644649          1                  0 FRACTURED
         7     520540          1                  0 CHECKSUM
         7     469918          1                  0 FRACTURED
         7     467261          1                  0 FRACTURED
         7     449293          1                  0 CHECKSUM
         7     445260          1                  0 FRACTURED
         7      20593          1                  0 CHECKSUM
         7      14724          1                  0 FRACTURED
         7     559965          1                  0 FRACTURED
         7      17853          1                  0 CHECKSUM

10 rows selected.

解决方案

1、表上索引确认并备份索引创建语句

SQL> select owner,index_name,table_owner,table_name from dba_indexes where owner='BMP_SYS' and table_name='TB_LXSLOG';

OWNER                          INDEX_NAME                     TABLE_OWNER                    TABLE_NAME
------------------------------ ------------------------------ ------------------------------ ------------------------------
BMP_SYS                        PK_TB_LXSLOG              BMP_SYS                        TB_LXSLOG
BMP_SYS                        IDX_TB_LXSLOG_1           BMP_SYS                        TB_LXSLOG
BMP_SYS                        IDX_TB_LXSLOG_2           BMP_SYS                        TB_LXSLOG
BMP_SYS                        IDX_TB_LXSLOG_3           BMP_SYS                        TB_LXSLOG
BMP_SYS                        IDX_TB_LXSLOG_4           BMP_SYS                        TB_LXSLOG


SQL> select dbms_metadata.get_ddl('INDEX','PK_TB_LXSLOG','BMP_SYS') from dual;
DBMS_METADATA.GET_DDL('INDEX','PK_TB_LXSLOG','BMP_SYS')
--------------------------------------------------------------------------------

  CREATE UNIQUE INDEX "BMP_SYS"."PK_TB_LXSLOG" ON "BMP_SYS"."TB_LXSLOG" ("LOGKEY")
  PCTFREE 10 INITRANS 2 MAXTRANS 255 COMPUTE STATISTICS
  STORAGE(INITIAL 65536 NEXT 1048576 MINEXTENTS 1 MAXEXTENTS 2147483645
  PCTINCREASE 0 FREELISTS 1 FREELIST GROUPS 1 BUFFER_POOL DEFAULT)
  TABLESPACE "OCSTS"

SQL> select dbms_metadata.get_ddl('INDEX','IDX_TB_LXSLOG_1','BMP_SYS') from dual;

DBMS_METADATA.GET_DDL('INDEX','IDX_TB_LXSLOG_1','BMP_SYS')
--------------------------------------------------------------------------------

  CREATE INDEX "BMP_SYS"."IDX_TB_LXSLOG_1" ON "BMP_SYS"."TB_LXSLOG" ("SERVICENUMBER")
  PCTFREE 10 INITRANS 2 MAXTRANS 255 COMPU
TE STATISTICS
  STORAGE(INITIAL 65536 NEXT 1048576 MINEXTEN
TS 1 MAXEXTENTS 2147483645
  PCTINCREASE 0 FREELISTS 1 FREELIST GROUPS 1 BUFFER_
POOL DEFAULT)
  TABLESPACE "OCSTS"
  
SQL> select dbms_metadata.get_ddl('INDEX','IDX_TB_LXSLOG_2','BMP_SYS') from dual;


DBMS_METADATA.GET_DDL('INDEX','IDX_TB_LXSLOG_2','BMP_SYS')
--------------------------------------------------------------------------------

  CREATE INDEX "BMP_SYS"."IDX_TB_LXSLOG_2" ON "BMP_SYS"."TB_LXSLOG" ("INORDERID", "COMMANDID", "PROCESSTAG")
  PCTFREE 10 INITRANS 2 MAXTRANS 255 COMPUTE STATISTICS
  STORAGE(INITIAL 65536 NEXT 1048576 MINEX
TENTS 1 MAXEXTENTS 2147483645
  PCTINCREASE 0 FREELISTS 1 FREELIST GROUPS 1 BUFF
ER_POOL DEFAULT)
  TABLESPACE "OCSTS"
  
SQL> select dbms_metadata.get_ddl('INDEX','IDX_TB_LXSLOG_3','BMP_SYS') from dual;


DBMS_METADATA.GET_DDL('INDEX','IDX_TB_LXSLOG_3','BMP_SYS')
--------------------------------------------------------------------------------

  CREATE INDEX "BMP_SYS"."IDX_TB_LXSLOG_3" ON "BMP_SYS"."TB_LXSLOG" ("SUBSCRIBERKEY", "TRADETYPE", "PROCESSTAG")
  PCTFREE 10 INITRANS 2 MAXTRANS 255 COMPUTE STATISTICS
  STORAGE(INITIAL 65536 NEXT 1048576 MINEXTENTS 1 MAXEXTENTS 2147483645
  PCTINCREASE 0 FREELISTS 1 FREELIST GROUPS 1
BUFFER_POOL DEFAULT)
  TABLESPACE "OCSTS"

SQL> select dbms_metadata.get_ddl('INDEX','IDX_TB_LXSLOG_4','BMP_SYS') from dual;


DBMS_METADATA.GET_DDL('INDEX','IDX_TB_LXSLOG_4','BMP_SYS')
--------------------------------------------------------------------------------

  CREATE INDEX "BMP_SYS"."IDX_TB_LXSLOG_4" ON "BMP_SYS"."TB_LXSLOG" ("ENDDATE", "PROCESSTAG")
  PCTFREE 10 INITRANS 2 MAXTRANS 255 COMPUTE STATISTICS
  STORAGE(INITIAL 65536 NEXT 1048576 MINEXTENTS 1 MAXEXTE
NTS 2147483645
  PCTINCREASE 0 FREELISTS 1 FREELIST GROUPS 1
 BUFFER_POOL DEFAULT)
  TABLESPACE "OCSTS"

2、表上触发器确认

SQL> select owner,trigger_name,table_owner,table_name from dba_triggers where table_owner='BMP_SYS' and table_name='TB_LXSLOG';

3、开启数据库10231事件,跳过坏块数据

SQL> ALTER SESSION SET EVENTS '10231 TRACE NAME CONTEXT FOREVER, LEVEL 10';

4、创建临时表存放数据

SQL> CREATE TABLE BMP_SYS.TB_LXSLOG_tmp AS SELECT * FROM BMP_SYS.TB_LXSLOG;

5、关闭数据库10231事件

SQL> ALTER SESSION SET EVENTS '10231 TRACE NAME CONTEXT OFF';

6、处理问题表

本次采用rename的方式处理

SQL> alter table LXS_SYS.TB_LXSLOG rename to TB_LXSLOG_20181017;

7、修改临时表为正式表

SQL> alter table BMP_SYS.TB_LXSLOG_tmp rename to TB_LXSLOG;

8、删除问题表上的索引

SQL> alter table BMP_SYS.TB_LXSLOG_20181017 drop constraint PK_TB_LXSLOG cascade drop index;
SQL> drop index BMP_SYS.IDX_TB_LXSLOG_1;
SQL> drop index BMP_SYS.IDX_TB_LXSLOG_2;
SQL> drop index BMP_SYS.IDX_TB_LXSLOG_3;
SQL> drop index BMP_SYS.IDX_TB_LXSLOG_4;

9、新表创建索引

SQL> CREATE UNIQUE INDEX BMP_SYS.PK_TB_LXSLOG ON BMP_SYS.TB_LXSLOG (LOGKEY) TABLESPACE OCSTS;
SQL> CREATE INDEX "BMP_SYS"."IDX_TB_LXSLOG_1" ON "BMP_SYS"."TB_LXSLOG" ("SERVICENUMBER") TABLESPACE OCSTS;
SQL> CREATE INDEX "BMP_SYS"."IDX_TB_LXSLOG_2" ON "BMP_SYS"."TB_LXSLOG" ("INORDERID", "COMMANDID", "PROCESSTAG") TABLESPACE OCSTS;
SQL> CREATE INDEX "BMP_SYS"."IDX_TB_LXSLOG_3" ON "BMP_SYS"."TB_LXSLOG" ("SUBSCRIBERKEY", "TRADETYPE", "PROCESSTAG") TABLESPACE OCSTS;
SQL> CREATE INDEX "BMP_SYS"."IDX_TB_LXSLOG_4" ON "BMP_SYS"."TB_LXSLOG" ("ENDDATE", "PROCESSTAG") TABLESPACE OCSTS;

对比数据

SQL> select count(1) from BMP_SYS.TB_LXSLOG;
  COUNT(1)
----------
    898594

SQL> select count(1) from BMP_SYS.TB_LXSLOG_20181017;
  COUNT(1)
----------
    898623

10、删除问题表

应用正式恢复之后无影响,可以删除问题表

SQL> drop table BMP_SYS.TB_LXSLOG_20181017;