AIX5.3 ORACLE10G RAC环境,新建LV时由于CHMOD、CHOWN未在两个节点同步,导致未设置权限的节点报ORA-01186\ORA-01157\ORA-01110错误,后对数据文件紧急下线过程中又出现数据文件问题,最后通过Media recovery恢复。该案例告诉我们务必注意两个节点权限的同步,即使不同步也不要轻易OFFLINE,可尝试重启之前未同步的节点。

--实际步骤--

新建data36_disk数据文件
通过smit lv
PP SIZE:        256 megabyte(s)
   PPs:            10

--权限设置
chown oracle:dba /dev/data36_disk
chown oracle:dba /dev/rdata11_disk
chmod 777 /dev/data36_disk
chmod 777 /dev/rdata11_disk
致命的问题就出在这一步,因为只在RAC的其中一个节点进行了操作,另外一个节点的权限仍然有问题,
虽然下面的add datafile没有报错,但是另外一个节点的日志出现了错误提示:

--向数据库表空间添加数据文件
# su - oracle
[YOU HAVE NEW MAIL]
$ sqlplus / as sysdba
SQL*Plus: Release 10.2.0.4.0 - Production on Sat Feb 12 21:26:42 2011
Copyright (c) 1982, 2007, Oracle.  All Rights Reserved.

Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bit Production
With the Partitioning, Real Application Clusters, OLAP, Data Mining
and Real Application Testing options

SQL> alter tablespace produc_DATA add datafile '/dev/rdata11_disk' size 2500M;
Tablespace altered.

--这个时候RAC2节点出现ERROR
Sat Feb 12 21:33:00 2011
Errors in file /home/oracle/admin/produc/bdump/produc2_dbw0_803734.trc:
ORA-01186: file 45 failed verification tests
ORA-01157: cannot identify/lock data file 45 - see DBWR trace file
ORA-01110: data file 45: '/dev/rdata11_disk'
File 45 not verified due to error ORA-01157

--修改节点2的相关文件权限
chown oracle:dba /dev/data36_disk
chown oracle:dba /dev/rdata11_disk
chmod 777 /dev/data36_disk
chmod 777 /dev/rdata11_disk

--问题依旧,于是紧急下线
SQL> alter database  datafile '/dev/rdata11_disk' offline;
Database altered.

--修改完成,再次上线时出现ORA-01113错误
SQL> alter database datafile '/dev/rdata11_disk' online;
alter database datafile '/dev/rdata11_disk' online
*
ERROR at line 1:
ORA-01113: file 48 needs media recovery
ORA-01110: data file 48: '/dev/rdata11_disk'

--根据上述提示进行recover操作
SQL> recover datafile recover datafile '/dev/data36_disk';
ORA-00279: change 2986334634 generated at 02/12/2011 21:40:35 needed for thread1
ORA-00289: suggestion : /home/oracle/archlog/produc1/1_25181_614088933.arc
ORA-00280: change 2986334634 for thread 1 is in sequence #25181

Specify log: {<RET>=suggested | filename | AUTO | CANCEL}

ORA-00279: change 2986334634 generated at 02/12/2011 19:35:48 needed for thread2
ORA-00289: suggestion : /home/oracle/archlog/produc1/2_17592_614088933.arc
ORA-00280: change 2986334634 for thread 2 is in sequence #17592

Specify log: {<RET>=suggested | filename | AUTO | CANCEL}

ORA-00308: cannot open archived log '/home/oracle/archlog/produc1/2_17592_614088933.arc'
ORA-27037: unable to obtain file status
IBM AIX RISC System/6000 Error: 2: No such file or directory
Additional information: 3

--上面提示cannot open archived log '/home/oracle/archlog/produc1/2_17592_614088933.arc'
--怀疑文件丢失,进入相关目录确认

SQL> exit
Disconnected from Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bit Production
With the Partitioning, Real Application Clusters, OLAP, Data Mining
and Real Application Testing options
$ cd /home/oracle/archlog/produc1/
$ ls -lt|grep 2_17592_614088933
--确认没有文件

--去DATAGUARD备机查归档文件
$ ftp 172.16.3.12
Connected to 172.16.3.12.
220 P560A FTP server (Version 4.2 Fri Feb 3 22:13:23 CST 2006) ready.
Name (172.16.3.12:oracle): root
331 Password required for root.
Password:
230-Last unsuccessful login: Tue Nov 30 07:21:21 BEIST 2010 on ftp from ::ffff:172.16.3.6
230-Last login: Sun Jan 30 10:58:29 BEIST 2011 on ftp from ::ffff:172.16.128.16
230 User root logged in.
ftp> cd /oradata/archlog
250 CWD command successful.
ftp> ls -lt grep|arch_2_17592_614088933.arc
200 PORT command successful.
150 Opening data connection for /bin/ls.
total 118712992
-rw-r-----   1 oracle   dba        95012352 Feb 12 22:04 arch_2_17592_614088933.arc
--查到了文件

--接下来从DATAGUARD机器取文件到主库所在服务器
226 Transfer complete.
ftp> get arch_2_17592_614088933.arc
200 PORT command successful.
150 Opening data connection for arch_2_17592_614088933.arc (95012352 bytes).
226 Transfer complete.
95917522 bytes received in 10.76 seconds (8706 Kbytes/s)
local: arch_2_17592_614088933.arc remote: arch_2_17592_614088933.arc
ftp> bye
221 Goodbye.

--主库上已经可以看到该文件了
$ ls -lt
total 10289816
-rw-r--r--   1 oracle   dba        95012352 Feb 12 22:55 arch_2_17592_614088933.arc
-rw-r-----   1 oracle   dba        97745408 Feb 12 22:14 1_25185_614088933.arc
-rw-r-----   1 oracle   dba        97745920 Feb 12 22:03 1_25184_614088933.arc
……

--登录SQLPLUS
$ sqlplus / as sysdba
SQL*Plus: Release 10.2.0.4.0 - Production on Sat Feb 12 22:56:31 2011
Copyright (c) 1982, 2007, Oracle.  All Rights Reserved.
Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bit Production
With the Partitioning, Real Application Clusters, OLAP, Data Mining
and Real Application Testing options

--再次尝试恢复
SQL> recover datafile '/dev/rdata39_disk';
ORA-00279: change 2986334634 generated at 02/12/2011 21:40:35 needed for thread1
ORA-00289: suggestion : /home/oracle/archlog/produc1/1_25181_614088933.arc
ORA-00280: change 2986334634 for thread 1 is in sequence #25181

Specify log: {<RET>=suggested | filename | AUTO | CANCEL}

ORA-00279: change 2986334634 generated at 02/12/2011 19:35:48 needed for thread2
ORA-00289: suggestion : /home/oracle/archlog/produc1/2_17592_614088933.arc
ORA-00280: change 2986334634 for thread 2 is in sequence #17592


Specify log: {<RET>=suggested | filename | AUTO | CANCEL}

--为什么仍然找不到呢
ORA-00308: cannot open archived log '/home/oracle/archlog/produc1/2_17592_614088933.arc'
ORA-27037: unable to obtain file status
IBM AIX RISC System/6000 Error: 2: No such file or directory
Additional information: 3


SQL> exit
Disconnected from Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bit Production
With the Partitioning, Real Application Clusters, OLAP, Data Mining
and Real Application Testing options

--猛然发现名字不一样的,原来DATAGUARD库中归档的命名方式和主库有区别,"arch_"在主库是没有的
$ ls -lt|grep arch_2_17592_614088933.arc
-rw-r--r--   1 oracle   dba        95012352 Feb 12 22:55 arch_2_17592_614088933.arc

--立即重命名
$ mv arch_2_17592_614088933.arc 2_17592_614088933.arc

--登录SQLPLUS,执行RECOVERY
$ sqlplus / as sysdba

SQL*Plus: Release 10.2.0.4.0 - Production on Sat Feb 12 23:08:06 2011

Copyright (c) 1982, 2007, Oracle.  All Rights Reserved.

Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bit Production
With the Partitioning, Real Application Clusters, OLAP, Data Mining
and Real Application Testing options

SQL> recover datafile '/dev/rdata11_disk';
ORA-00279: change 2986326858 generated at 02/12/2011 21:33:00 needed for thread1
ORA-00289: suggestion : /home/oracle/archlog/produc1/1_25181_614088933.arc
ORA-00280: change 2986326858 for thread 1 is in sequence #25181

Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
--直接回车

ORA-00279: change 2986326858 generated at 02/12/2011 19:35:48 needed for thread2
ORA-00289: suggestion : /home/oracle/archlog/produc1/2_17592_614088933.arc
ORA-00280: change 2986326858 for thread 2 is in sequence #17592

Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
--直接回车

Log applied.
Media recovery complete.
--已经恢复正常了


SQL>