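Environment: RHEL 6.4 + Oracle 11.2.0.4 RAC. Today I rebooted one node and found that neither CRS nor ASM would start. I googled a lot of material, but none of it matched my case. Here are the main symptoms first.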


[root@rac2 ~]# /oracle/app/grid/product/11.2.0/bin/crsctl start crs
CRS-4640: Oracle High Availability Services is already active
CRS-4000: Command Start failed, or completed with errors.

[root@rac2 ~]# /etc/init.d/init.crs start

CRS-0184: Cannot communicate with the CRS daemon.

[root@rac2 ~]# /oracle/app/grid/product/11.2.0/bin/crsctl start cluster
CRS-2672: Attempting to start 'ora.cssd' on 'rac2'
CRS-2672: Attempting to start 'ora.diskmon' on 'rac2'
CRS-2676: Start of 'ora.diskmon' on 'rac2' succeeded
CRS-2674: Start of 'ora.cssd' on 'rac2' failed
CRS-2679: Attempting to clean 'ora.cssd' on 'rac2'
CRS-2681: Clean of 'ora.cssd' on 'rac2' succeeded
CRS-4000: Command Start failed, or completed with errors.


In short, ora.cssd simply would not start, so I checked the logs.

Running tail -f /u01/app/11.2.0/grid/log/racdb1/alertracdb1.log shows the following:

2013-11-05 10:34:41.837:

[cssd(59006)]CRS-1714:Unable to discover any voting files, retrying discovery in 15 seconds; Details at (:CSSNM00070:) in /u01/app/11.2.0/grid/log/racdb1/cssd/ocssd.log

2013-11-05 10:34:56.849:

[cssd(59006)]CRS-1714:Unable to discover any voting files, retrying discovery in 15 seconds; Details at (:CSSNM00070:) in /u01/app/11.2.0/grid/log/racdb1/cssd/ocssd.log

2013-11-05 10:35:11.864:

[cssd(59006)]CRS-1714:Unable to discover any voting files, retrying discovery in 15 seconds; Details at (:CSSNM00070:) in /u01/app/11.2.0/grid/log/racdb1/cssd/ocssd.log

2013-11-05 10:35:26.877:

[cssd(59006)]CRS-1714:Unable to discover any voting files, retrying discovery in 15 seconds; Details at (:CSSNM00070:) in /u01/app/11.2.0/grid/log/racdb1/cssd/ocssd.log

2013-11-05 10:35:41.891:

[cssd(59006)]CRS-1714:Unable to discover any voting files, retrying discovery in 15 seconds; Details at (:CSSNM00070:) in /u01/app/11.2.0/grid/log/racdb1/cssd/ocssd.log

2013-11-05 10:35:56.906:

[cssd(59006)]CRS-1714:Unable to discover any voting files, retrying discovery in 15 seconds; Details at (:CSSNM00070:) in /u01/app/11.2.0/grid/log/racdb1/cssd/ocssd.log
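
To confirm that CSSD is stuck in the voting file discovery loop and not failing for some other reason, the CRS-1714 entries in the alert log can be counted. This is only a minimal sketch: the function name count_crs1714 is hypothetical, and the log path in the usage comment is the one shown above.

```shell
#!/bin/sh
# Hypothetical helper: count CRS-1714 entries in a clusterware alert log.
count_crs1714() {
    # $1: path to the alert log
    # grep -c prints the number of matching lines (0 if there are none).
    grep -c 'CRS-1714' "$1"
}

# Usage on the affected node:
# count_crs1714 /u01/app/11.2.0/grid/log/racdb1/alertracdb1.log
```

A count that keeps growing every 15 seconds matches the retry interval reported in the messages.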




The problem is that the voting files cannot be found. I read two articles by Maclean Liu, and the consequences looked serious:

Recovery methods for an 11.2 RAC that has lost the ASM diskgroup holding the OCR and votedisk

http://www.askmaclean.com/archives/11-2-lost-ocr-votedisk-group-recovery.html


Changing the ASM disk path in 11gR2 RAC

http://www.askmaclean.com/archives/howto-change-11gr2-asm-disk-path.html


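The second article worried me in particular. For multipathing I had originally used multipath as well, but on RHEL 6.4 binding the paths with multipath reported errors during the boot-time self-check, so I switched to IBM's RDAC. My ASM disk mappings, however, had all been created on top of multipath. Did that mean they were broken and had to be redone? That would be a lot of work. After plenty of worrying and googling, I found this thread: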

https://forums.oracle.com/message/10656401


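Change two parameters in /etc/sysconfig/oracleasm so that ASMLib scans the device-mapper (multipath) devices first and skips the underlying single-path sd devices: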

ORACLEASM_SCANORDER="dm"

ORACLEASM_SCANEXCLUDE="sd"

After a reboot, everything was OK.
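
The settings can be checked before rebooting, and the disks verified afterwards. What follows is a minimal sketch, not part of the original fix: show_scan_settings is a hypothetical helper name, and the commented commands assume that the ASMLib oracleasm tool is installed at /usr/sbin/oracleasm and that the grid home is the one used in the commands at the top of this post.

```shell
#!/bin/sh
# Hypothetical helper: print the ASMLib scan settings from a config file.
show_scan_settings() {
    # $1: config file, normally /etc/sysconfig/oracleasm
    grep -E '^ORACLEASM_SCAN(ORDER|EXCLUDE)=' "$1"
}

# On the affected node, as root (left commented out because these
# commands need ASMLib and Grid Infrastructure to be installed):
# show_scan_settings /etc/sysconfig/oracleasm
# /usr/sbin/oracleasm scandisks   # rescan devices using the new scan order
# /usr/sbin/oracleasm listdisks   # the ASMLib disk labels should be listed
# /oracle/app/grid/product/11.2.0/bin/crsctl query css votedisk   # once CSS is up
```

If listdisks shows the expected labels after the rescan, CSSD should be able to discover the voting files again on the next start.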


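My analysis: most likely some CRS process could no longer find the ASM disks, and without access to the voting files and the OCR stored on them, the stack certainly cannot start.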