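Problems encountered when starting GRID
Everything had been running fine, but after running crs_stop -all, the next startup hit the following problem.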

[root@rac2 crsd]# /u01/app/11.2.0/grid/bin/crsctl stat res -t -init
--------------------------------------------------------------------------------
NAME           TARGET  STATE        SERVER                   STATE_DETAILS      
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.asm
      1        ONLINE  ONLINE       rac2                     Started            
ora.cluster_interconnect.haip
      1        ONLINE  ONLINE       rac2                                        
ora.crf
      1        ONLINE  ONLINE       rac2                                        
ora.crsd
      1        ONLINE  OFFLINE
ora.cssd
      1        ONLINE  ONLINE       rac2                                        
ora.cssdmonitor
      1        ONLINE  ONLINE       rac2                                        
ora.ctssd
      1        ONLINE  ONLINE       rac2                     ACTIVE:0           
ora.diskmon
      1        OFFLINE OFFLINE                                                  
ora.drivers.acfs
      1        ONLINE  ONLINE       rac2                                        
ora.evmd
      1        ONLINE  ONLINE       rac2                                        
ora.gipcd
      1        ONLINE  ONLINE       rac2                                        
ora.gpnpd
      1        ONLINE  ONLINE       rac2                                        
ora.mdnsd
      1        ONLINE  ONLINE       rac2                                         
[root@rac2 crsd]# /u01/app/11.2.0/grid/bin/crsctl start res ora.crsd -init
Every start attempt failed, so I checked the logs.
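In a longer listing, the resource whose TARGET is ONLINE but whose STATE is not (ora.crsd above) can be picked out mechanically. The following is only a minimal sketch: it assumes the two-line layout printed by `crsctl stat res -t -init` as shown above, and the function name `offline_init_resources` is hypothetical.

```shell
# Print the names of resources whose TARGET is ONLINE but whose STATE is not.
# Reads `crsctl stat res -t -init` output on stdin (layout assumed from the listing above).
offline_init_resources() {
    awk '
        # A line starting with "ora." carries the resource name.
        /^ora\./ { name = $1; next }
        # The following line carries: instance number, TARGET, STATE, ...
        name != "" && $1 ~ /^[0-9]+$/ && $2 == "ONLINE" && $3 != "ONLINE" { print name }
    '
}
```

It could be used as `/u01/app/11.2.0/grid/bin/crsctl stat res -t -init | offline_init_resources`; a resource such as ora.diskmon, whose TARGET is OFFLINE, is deliberately not reported.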

CRSD.log reports the following errors:

------------- SERVER POOLS:
Free [min:0][max:-1][importance:0] NO SERVERS ASSIGNED
Generic [min:2147483647][max:-1][importance:0] [Server Names: rac1 rac2 ]NO SERVERS ASSIGNED
ora.glp [min:0][max:-1][importance:1] [Server Names: rac1 rac2 ][Parent Pools: Generic ]NO SERVERS ASSIGNED

2012-04-10 13:47:52.545: [   CRSPE][1189124416] {2:55003:2} Dumping ICE contents...:ICE operation count: 0
2012-04-10 13:47:52.545: [    CRSD][1189124416] {2:55003:2} Dump State Done.
 
According to How to Troubleshoot Grid Infrastructure Startup Issues [ID 1050908.1], the OLR file is most likely the problem.
The solution is to restore a good backup of OLR with "ocrconfig -local -restore ocr_backup_name". By default, OLR will be backed up to $GRID_HOME/cdata/$HOST/backup_$TIME_STAMP.olr once installation is complete.
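Since the backup file name embeds a timestamp, the most recent backup in that directory can be found by name. This is a rough sketch only: the helper name `latest_olr_backup` is hypothetical, and it assumes the `backup_YYYYMMDD_HHMMSS.olr` naming seen in this post, so that alphabetical order matches chronological order.

```shell
# Print the newest backup_*.olr file in the given directory (by name, not mtime).
# Returns non-zero and prints nothing if no backup is found.
latest_olr_backup() {
    latest=""
    # Globs expand in sorted order, so the last match has the latest timestamp.
    for f in "$1"/backup_*.olr; do
        [ -e "$f" ] && latest=$f
    done
    [ -n "$latest" ] && printf '%s\n' "$latest"
}
```

On this system it would be called as `latest_olr_backup /u01/app/11.2.0/grid/cdata/rac2`. The backups known to the OLR itself can also be listed with `ocrconfig -local -showbackup`.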

[root@rac2 cdata]# /u01/app/11.2.0/grid/bin/ocrconfig -local -restore rac2/backup_20120409_213603.olr
PROTL-19: Cannot proceed while the Oracle High Availability Service is running
PROTL-19: Cannot proceed while the Oracle High Availability Service is running
 
 
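How to restore OLR in 11.2 Grid Infrastructure [ID 1193643.1] 
I followed this Metalink note. According to it, I should not have hit this error, since it was fixed in 11.2.0.2, but I did, so the only option was to kill the process forcibly.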


Comment: the PROTL-19 error can be reported in 11.2.0.1 if OHASD is running. This is because the restore operation restores the OCR/OLR directly without checking whether OHASD exists. Bug 9789625 covers this issue, and it was fixed in 11.2.0.2.

[root@rac2 crsd]# ps -ef|grep ohasd
root     15648  8768  0 14:07 pts/1    00:00:00 grep ohasd
root     18743     1  0 09:49 ?        00:00:00 /bin/sh /etc/init.d/init.ohasd run
root     18771     1  0 09:49 ?        00:00:31 /u01/app/11.2.0/grid/bin/ohasd.bin reboot
[root@rac2 crsd]# ps -ef|grep ohasd|grep -v grep|awk '{print $2}'|xargs kill -9
[root@rac2 crsd]# ps -ef|grep ohasd
root     15694     1  0 14:07 ?        00:00:00 /bin/sh /etc/init.d/init.ohasd run
root     15719  8768  0 14:07 pts/1    00:00:00 grep ohasd
[root@rac2 cdata]# /u01/app/11.2.0/grid/bin/ocrconfig -local -restore rac2/backup_20120409_213603.olr
[root@rac2 cdata]# ls -lrth
total 6.6M
drwxrwxr-x 2 grid oinstall 4.0K Apr  9 21:05 rac-cluster
drwxr-xr-x 2 grid oinstall 4.0K Apr  9 21:05 localhost
drwxr-xr-x 2 grid oinstall 4.0K Apr  9 21:36 rac2
-rw------- 1 root oinstall 261M Apr 10 14:07 rac2.olr
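Killing ohasd.bin with kill -9 worked here, but a gentler sequence is to stop the stack on the node first and only then restore. The following is a sketch under assumptions, not a verified procedure: the helper names `run` and `restore_olr` and the `DRY_RUN` switch are hypothetical, and the default `GRID_HOME` is simply the path used in this post. By default it only prints the commands it would run.

```shell
GRID_HOME=${GRID_HOME:-/u01/app/11.2.0/grid}

# Echo the command in dry-run mode (the default); otherwise execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Stop the local stack, restore the OLR from the given backup, verify, restart.
restore_olr() {
    backup=$1
    run "$GRID_HOME/bin/crsctl" stop crs -f || return 1
    # In a real run, refuse to continue while ohasd.bin is still alive.
    if [ "${DRY_RUN:-1}" != "1" ] && pgrep -f ohasd.bin >/dev/null; then
        echo "ohasd.bin is still running; not restoring" >&2
        return 1
    fi
    run "$GRID_HOME/bin/ocrconfig" -local -restore "$backup" || return 1
    run "$GRID_HOME/bin/ocrcheck" -local
    run "$GRID_HOME/bin/crsctl" start crs
}
```

Running `restore_olr rac2/backup_20120409_213603.olr` prints the four commands; setting `DRY_RUN=0` as root would actually execute them.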
 
 
[root@rac1 ~]#  /u01/app/11.2.0/grid/bin/crsctl stop crs
[root@rac2 ~]#  /u01/app/11.2.0/grid/bin/crsctl stop crs
[root@rac1 ~]#  /u01/app/11.2.0/grid/bin/crsctl start crs
[root@rac2 ~]#  /u01/app/11.2.0/grid/bin/crsctl start crs
This time everything came up.
[root@rac2 ~]# /u01/app/11.2.0/grid/bin/crsctl stat res -t -init
--------------------------------------------------------------------------------
NAME           TARGET  STATE        SERVER                   STATE_DETAILS      
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.asm
      1        ONLINE  ONLINE       rac2                     Started            
ora.cluster_interconnect.haip
      1        ONLINE  ONLINE       rac2                                        
ora.crf
      1        ONLINE  ONLINE       rac2                                        
ora.crsd
      1        ONLINE  ONLINE       rac2                                        
ora.cssd
      1        ONLINE  ONLINE       rac2                                        
ora.cssdmonitor
      1        ONLINE  ONLINE       rac2                                        
ora.ctssd
      1        ONLINE  ONLINE       rac2                     ACTIVE:0           
ora.diskmon
      1        OFFLINE OFFLINE                                                  
ora.drivers.acfs
      1        ONLINE  ONLINE       rac2                                        
ora.evmd
      1        ONLINE  ONLINE       rac2                                        
ora.gipcd
      1        ONLINE  ONLINE       rac2                                        
ora.gpnpd
      1        ONLINE  ONLINE       rac2                                        
ora.mdnsd
      1        ONLINE  ONLINE       rac2                                         
 
 
 
Then check with crs_stat -t; if any resources did not start automatically, start them manually.