故障处理

一、故障现象
三张参数表数据丢失,一张一个字段变NULL
 
二、应急处理
1、 DG备库关闭并且启动到只读模式
2、 从DG库EXP相关表并且IMP到生产库(先EXP备份生产库相关表后DRPP掉相关表)
3、 确认数据已经恢复,处理过程5分钟
 
三、思考
其他处理方法:
1、10G的flashback功能
SQL> flashback table t to timestamp to_timestamp('2011-01-04 16:00:00','YYYY-MM-DD HH24:MI:SS');
                *
第 1 行出现错误:
ORA-08189: 因为未启用行移动功能, 不能闪回表
 
SQL> alter table t enable row movement;
表已更改。
SQL> flashback table t to timestamp to_timestamp('2011-01-04 16:00:00','YYYY-MM-DD HH24:MI:SS');
 
2、logmnr分析归档日志的UNDO事件,获取INSERT和UPDATE语句
 
后续分析
一、从数据库视图角度分析
1、查询v$open_cursor视图,初步定位删除的语句:
select * from v$open_cursor Where sql_text Like 'delete from manage_%'

  
SADDR
SID
USER_NAME
ADDRESS
HASH_VALUE
SQL_ID
SQL_TEXT
1
0700000309706FF0
1171
OPENDTEST
070000027DCDD4F0
3705303229
53u5x1zfdnw5x
delete from manage_role_resource where RESOURCE_ID=:1
2
0700000309706FF0
1171
OPENDTEST
070000027BD82370
425725875
182dasccq03xm
delete from manage_resource where ID=:1
3
0700000309706FF0
1171
OPENDTEST
070000027CBD8BE0
2636928772
dpmh3akfksps4
delete from manage_resourcetype where ID=:1
4
070000030A7438F8
1294
OPENDTEST
070000027CBD8BE0
2636928772
dpmh3akfksps4
delete from manage_resourcetype where ID=:1
5
070000030971F140
1243
OPENDTEST
070000027DCDD4F0
3705303229
53u5x1zfdnw5x
delete from manage_role_resource where RESOURCE_ID=:1
6
070000030971F140
1243
OPENDTEST
070000027BD82370
425725875
182dasccq03xm
delete from manage_resource where ID=:1
7
070000030971F140
1243
OPENDTEST
070000027CBD8BE0
2636928772
dpmh3akfksps4
delete from manage_resourcetype where ID=:1
8
070000030A7A28D0
1578
OPENDTEST
070000027DCDD4F0
3705303229
53u5x1zfdnw5x
delete from manage_role_resource where RESOURCE_ID=:1
9
070000030A7A28D0
1578
OPENDTEST
070000027BD82370
425725875
182dasccq03xm
delete from manage_resource where ID=:1
10
070000030A7A28D0
1578
OPENDTEST
070000027CBD8BE0
2636928772
dpmh3akfksps4
delete from manage_resourcetype where ID=:1
11
070000030F72FA90
1324
OPENDTEST
070000027DCDD4F0
3705303229
53u5x1zfdnw5x
delete from manage_role_resource where RESOURCE_ID=:1
12
070000030F72FA90
1324
OPENDTEST
070000027BD82370
425725875
182dasccq03xm
delete from manage_resource where ID=:1
13
070000030F72FA90
1324
OPENDTEST
070000027CBD8BE0
2636928772
dpmh3akfksps4
delete from manage_resourcetype where ID=:1
 
 
 
 
 
 
 
 

2、根据如下查询得到的SID,查询会话视图,可以看出SQL语句来源于的机器和操作系统用户。
 
 
二、从归档日志角度分析
1、分析一过程中我们已经获悉DELETE操作是在节点1上完成的,我们查询该时段的归档日志信息
$ cd /home/oracle/archlog/rac1
$ ls -ltra
total 657152
drwxr-xr-x   7 oracle   dba             256 Oct 09 17:46 ..
-rw-rw----   1 oracle   dba            1024 Oct 10 04:54 2_15888_614088933.arc
-rw-rw----   1 oracle   dba            1024 Oct 10 05:59 2_15890_614088933.arc
-rw-rw----   1 oracle   dba            1024 Oct 10 06:01 2_15892_614088933.arc
-rw-r-----   1 oracle   dba         3635200 Jan 03 23:42 1_24921_614088933.arc
-rw-r-----   1 oracle   dba       101102080 Jan 04 08:28 1_24922_614088933.arc
-rw-r-----   1 oracle   dba        51537408 Jan 04 10:40 1_24923_614088933.arc
-rw-r-----   1 oracle   dba        81930240 Jan 04 15:09 1_24924_614088933.arc
-rw-r-----   1 oracle   dba        98228736 Jan 04 18:16 1_24925_614088933.arc
分析:因为故障发现在16点-17点之间,我们要重点分析1_24925_614088933.arc
 
2、设定LINE和TIME
SET LIN 200
SET TIME ON
 
3、指定LogMiner要分析的重做日志文件
begin
dbms_logmnr.add_logfile(LogFileName=>'/home/oracle/archlog/rac1/1_24924_614088933.arc', options => dbms_logmnr.new);
dbms_logmnr.add_logfile(LogFileName=>'/home/oracle/archlog/rac1/1_24925_614088933.arc');
end;
/
 
4、启动LogMiner会话
BEGIN
 SYS.DBMS_LOGMNR.start_logmnr
 (options => SYS.DBMS_LOGMNR.dict_from_online_catalog);
END;
/
 
5、alter session set nlS_datE_format='YYYYMMDD HH24:MI:SS';
 
6、显示用于分析的日志列表信息
SELECT LOW_TIME,HIGH_TIME,LOW_SCN ,NEXT_SCN FROM v$logmnr_logs;
 
LOW_TIME          HIGH_TIME            LOW_SCN   NEXT_SCN
----------------- ----------------- ---------- ----------
20110104 10:40:19 20110104 15:09:04 2903836753 2904342053
20110104 15:09:04 20110104 18:16:10 2904342053 2904789833
 
7、分析信息提取转储到临时表
create table tmp_logmnr11
Tablespace RAC_DATA
as
SELECT thread#,data_obj#,timestamp,seg_name,table_name,seg_owner,OPENDTEST,sql_redo,sql_undo
FROM v$logmnr_contents;
 
8、因为在应急恢复过程中对数据库6张表进行了删除重导,所以在tmp_logmnr11中提取信息时要进行表明“翻译”
----------------
--查出回收站表名
----------------
Select * From Dba_Recyclebin N
Where N.ORIGINAL_NAME='MANAGE_MENU' Or N.ORIGINAL_NAME='MANAGE_ROLE_RESOURCE' Or N.ORIGINAL_NAME='MANAGE_RESOURCE' Or N.ORIGINAL_NAME='MANAGE_RESOURCETYPE'
----结果如下:
    BIN$mQKHlhIJcBbgQ6wQgAZwFg==$0   MANAGE_ROLE_RESOURCE
    BIN$mQKHlhH+cBbgQ6wQgAZwFg==$0   MANAGE_RESOURCETYPE
    BIN$mQKHlhH5cBbgQ6wQgAZwFg==$0   MANAGE_RESOURCE
    BIN$mQKHlhH0cBbgQ6wQgAZwFg==$0   MANAGE_MENU
------------------------------------
--根据回收站表名查出删除操作的SQL语句
------------------------------------
Select * From Sys.tmp_logmnr11 t
Where (T.OPENDTEST='DELETE' Or T.OPENDTEST='UPDATE') And SEG_OWNER='OPENDTEST'
And SEG_NAME In(
Select OBJECT_NAME From Dba_Recyclebin N
Where N.ORIGINAL_NAME='MANAGE_MENU' Or N.ORIGINAL_NAME='MANAGE_ROLE_RESOURCE' Or N.ORIGINAL_NAME='MANAGE_RESOURCE' Or N.ORIGINAL_NAME='MANAGE_RESOURCETYPE'
)
------------------------------------
--再来看看删除语句条数
------------------------------------
Select Count(seg_name),seg_name From Sys.tmp_logmnr11 t
Where (T.OPENDTEST='DELETE' Or T.OPENDTEST='UPDATE') And SEG_OWNER='OPENDTEST'
And SEG_NAME In(
Select OBJECT_NAME From Dba_Recyclebin N
Where N.ORIGINAL_NAME='MANAGE_MENU' Or N.ORIGINAL_NAME='MANAGE_ROLE_RESOURCE' Or N.ORIGINAL_NAME='MANAGE_RESOURCE' Or N.ORIGINAL_NAME='MANAGE_RESOURCETYPE'
)
Group By seg_name
----结果如下(与现有数据库四张表数量相同):
    COUNT(SEG_NAME)    SEG_NAME
   
692             BIN$mQKHlhH0cBbgQ6wQgAZwFg==$0                             
   
1591           BIN$mQKHlhIJcBbgQ6wQgAZwFg==$0                              
   
732             BIN$mQKHlhH5cBbgQ6wQgAZwFg==$0                               
   
34               BIN$mQKHlhH+cBbgQ6wQgAZwFg==$0
 
9、提取到最终的DELETE和UPDATE语句
------------------------------------
--各类型删除语句摘录一条
------------------------------------
--------------------
--MANAGE_MENU
--------------------
Select * From Sys.tmp_logmnr11 t
Where (T.OPENDTEST='DELETE' Or T.OPENDTEST='UPDATE') And SEG_OWNER='OPENDTEST'
And SEG_NAME='BIN$mQKHlhH0cBbgQ6wQgAZwFg==$0'
----结果如下:总共692条,2011-1-4 16:09:13---2011-1-4 16:09:19
    首条SQL_REDO:update "OPENDTEST"."BIN$mQKHlhH0cBbgQ6wQgAZwFg==$0" set "RESOURCE_ID" = NULL where "RESOURCE_ID" = '10800' and ROWID = 'AAARoIAAUAAABXmAAD';
    翻译后的SQL语句:update "OPENDTEST"."MANAGE_MENU" set "RESOURCE_ID" = NULL where "RESOURCE_ID" = '10800' and ROWID = 'AAARoIAAUAAABXmAAD';

------------------------
--MANAGE_ROLE_RESOURCE
------------------------
Select * From Sys.tmp_logmnr11 t
Where (T.OPENDTEST='DELETE' Or T.OPENDTEST='UPDATE') And SEG_OWNER='OPENDTEST'
And SEG_NAME='BIN$mQKHlhIJcBbgQ6wQgAZwFg==$0'
----结果如下:总共1591条,2011-1-4 16:09:13---2011-1-4 16:09:19
    首条SQL_REDO:delete from "OPENDTEST"."BIN$mQKHlhIJcBbgQ6wQgAZwFg==$0" where "RESOURCE_ID" = '10800' and "ROLE_ID" = '11600' and ROWID = 'AAARocAAYAAAA5XAAL';
    翻译后的SQL语句:delete from "OPENDTEST"."MANAGE_ROLE_RESOURCE" where "RESOURCE_ID" = '10800' and "ROLE_ID" = '11600' and ROWID = 'AAARocAAYAAAA5XAAL';


 
10、经过与架构组协同分析代码和数据及表结构,确认WEB页面存在可疑风险