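I. Check the shared devices
The OCFS2 volumes or raw devices that hold the OCR and the voting disk are normally brought up automatically at boot. If they are not available, RAC cannot start.
1.1 If you use OCFS2, check its status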
/etc/init.d/o2cb status
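Before the file system is mounted, /etc/init.d/o2cb status reports "Checking O2CB heartbeat: Not active".
Before formatting and mounting the file system, verify that O2CB is online on both nodes. The heartbeat is not active at that point because no file system has been mounted yet; it becomes active once the volume is mounted: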
mount -t ocfs2 -o datavolume /dev/sdb1 /u02/oradata/orcl
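The heartbeat check described above can be scripted. The following is a minimal sketch, assuming that the status output contains a line of the form "Checking O2CB heartbeat: Active" once the volume is mounted (the text above only quotes the "Not active" form). The function name o2cb_heartbeat_active is hypothetical.

```shell
# Return 0 if the o2cb status text read from stdin reports an active heartbeat.
# Assumption: the active line reads "Checking O2CB heartbeat: Active".
# The match is case-insensitive and does not match "Not active".
o2cb_heartbeat_active() {
    grep -qi 'O2CB heartbeat: *active'
}

# Usage on a cluster node (as root):
#   /etc/init.d/o2cb status | o2cb_heartbeat_active && echo "heartbeat active"
```

Reading the status text from stdin keeps the check independent of where the output comes from, so it can be tried against saved output first.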
1.2 If you use raw devices, check that they are bound
[root@raw1 ~]# cd /dev/raw/
[root@raw1 raw]# ls
raw1 raw2
Or:
[root@raw1 init.d]# /etc/init.d/rawdevices status
/dev/raw/raw1: bound to major 8, minor 17
/dev/raw/raw2: bound to major 8, minor 18
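To confirm that specific raw devices are bound without reading the listing by eye, the status output can be checked in a script. This is a rough sketch, assuming the output format shown above; the function name raw_devices_bound is hypothetical.

```shell
# Return 0 if every device named in the arguments appears as bound
# in the rawdevices status text read from stdin.
raw_devices_bound() {
    status=$(cat)
    for dev in "$@"; do
        printf '%s\n' "$status" | grep -q "^${dev}:[[:space:]]*bound to major" || {
            echo "not bound: $dev" >&2
            return 1
        }
    done
}

# Usage (as root):
#   /etc/init.d/rawdevices status | raw_devices_bound /dev/raw/raw1 /dev/raw/raw2
```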
1.3 If you use ASM, list the ASM disks
/etc/init.d/oracleasm listdisks
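II. Automatic RAC startup and checking the related processes
When a node boots, CRS and its related daemons are started automatically by these init scripts: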
[root@rac1 init.d]# ls -l /etc/init.d/init.*
-r-xr-xr-x 1 root root 1951 Feb 26 22:38 /etc/init.d/init.crs
-r-xr-xr-x 1 root root 4714 Feb 26 22:38 /etc/init.d/init.crsd
-r-xr-xr-x 1 root root 35394 Feb 26 22:38 /etc/init.d/init.cssd
-r-xr-xr-x 1 root root 3190 Feb 26 22:38 /etc/init.d/init.evmd
Next, check the CRS status.
In a healthy cluster every resource is ONLINE:
[root@raw1 bin]# ./crs_stat -t
Name Type Target State Host
------------------------------------------------------------
ora.raw.db application ONLINE ONLINE raw1
ora.raw.raw.cs application ONLINE ONLINE raw1
ora....aw1.srv application ONLINE ONLINE raw1
ora....aw2.srv application ONLINE ONLINE raw2
ora....w1.inst application ONLINE ONLINE raw1
ora....w2.inst application ONLINE ONLINE raw2
ora....SM1.asm application ONLINE ONLINE raw1
ora....W1.lsnr application ONLINE ONLINE raw1
ora.raw1.gsd application ONLINE ONLINE raw1
ora.raw1.ons application ONLINE ONLINE raw1
ora.raw1.vip application ONLINE ONLINE raw1
ora....SM2.asm application ONLINE ONLINE raw2
ora....W2.lsnr application ONLINE ONLINE raw2
ora.raw2.gsd application ONLINE ONLINE raw2
ora.raw2.ons application ONLINE ONLINE raw2
ora.raw2.vip application ONLINE ONLINE raw2
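With many resources it is easy to miss one that is not ONLINE. The sketch below filters crs_stat -t output down to the rows that need attention. It assumes the five-column layout shown above (Name, Type, Target, State, Host); the function name crs_not_online is hypothetical.

```shell
# Print name, target and state for every row of `crs_stat -t` output (stdin)
# whose Target or State column is not ONLINE.
# The header row and the dashed separator line are skipped.
crs_not_online() {
    awk '$1 != "Name" && NF >= 4 && ($3 != "ONLINE" || $4 != "ONLINE") {
        print $1, $3, $4
    }'
}

# Usage:
#   ./crs_stat -t | crs_not_online
```

No output means that every resource reported both Target and State as ONLINE.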
If the output instead looks like this, with some resources UNKNOWN or OFFLINE:
[root@rac2 bin]# ./crs_stat -t
Name Type Target State Host
------------------------------------------------------------
ora.rac.db application ONLINE UNKNOWN rac1
ora....orcl.cs application ONLINE UNKNOWN rac1
ora....ac1.srv application OFFLINE OFFLINE
ora....ac2.srv application OFFLINE OFFLINE
ora....c1.inst application ONLINE UNKNOWN rac1
ora....c2.inst application ONLINE UNKNOWN rac2
ora....SM1.asm application ONLINE ONLINE rac1
ora....C1.lsnr application ONLINE UNKNOWN rac1
ora.rac1.gsd application ONLINE UNKNOWN rac1
ora.rac1.ons application ONLINE ONLINE rac1
ora.rac1.vip application ONLINE ONLINE rac1
ora....SM2.asm application ONLINE ONLINE rac2
ora....C2.lsnr application ONLINE UNKNOWN rac2
ora.rac2.gsd application ONLINE UNKNOWN rac2
ora.rac2.ons application ONLINE ONLINE rac2
ora.rac2.vip application ONLINE ONLINE rac2
Solution:
1. Run crs_stat without options to see the full details of each resource:
[root@rac2 bin]# ./crs_stat
NAME=ora.rac.db
TYPE=application
TARGET=ONLINE
STATE=ONLINE on rac2
NAME=ora.rac1.LISTENER_RAC1.lsnr
TYPE=application
TARGET=ONLINE
STATE=UNKNOWN on rac1
NAME=ora.rac1.gsd
TYPE=application
TARGET=ONLINE
STATE=UNKNOWN on rac1
NAME=ora.rac2.LISTENER_RAC2.lsnr
TYPE=application
TARGET=ONLINE
STATE=UNKNOWN on rac2
... ...
2. A resource that is OFFLINE can simply be started by hand:
[root@rac2 bin]# ./crs_start ora.rac.orcl.rac1.srv
Attempting to start `ora.rac.orcl.rac1.srv` on member `rac1`
Start of `ora.rac.orcl.rac1.srv` on member `rac1` succeeded.
3. For a resource in the UNKNOWN state, stop it first and then start it again:
[root@rac2 bin]# ./crs_stop ora.rac2.gsd
Attempting to stop `ora.rac2.gsd` on member `rac2`
Stop of `ora.rac2.gsd` on member `rac2` succeeded.
[root@rac2 bin]# ./crs_start ora.rac2.gsd
Attempting to start `ora.rac2.gsd` on member `rac2`
Start of `ora.rac2.gsd` on member `rac2` succeeded.
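4. If crs_stop cannot stop a resource, or crs_start cannot start it, there are two ways to resolve this:
4.1) Use crs_stop -f to force the resources whose state is UNKNOWN to stop, then use crs_start -f (again with the -f option) to start the services. Run this on both nodes: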
[oracle@rac2 ~]$ crs_start -f ora.ora9i.ora9i2.inst
Attempting to start `ora.ora9i.ora9i2.inst` on member `rac2`
Start of `ora.ora9i.ora9i2.inst` on member `rac2` succeeded.
[oracle@rac2 ~]$ crs_stop -f ora.ora9i.db
Attempting to stop `ora.ora9i.db` on member `rac2`
Stop of `ora.ora9i.db` on member `rac2` succeeded.
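4.2) Switch to the root user, run /etc/init.d/init.crs stop to shut CRS down, and then run /etc/init.d/init.crs start to bring it back up. Once CRS is running, it starts its resources automatically. This method must also be carried out on both nodes.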
5. All resources can also be stopped and started with a single command:
[root@rac2 bin]# ./crs_stop -all
[root@rac2 bin]# ./crs_start -all
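Steps 2 and 3 above can be turned into a dry-run helper that reads the full crs_stat output and prints the commands to try for each problem resource. This is only a sketch under the assumption that the output consists of NAME= and STATE= lines as shown in step 1. It prints commands and executes nothing; the function name crs_recovery_plan is hypothetical.

```shell
# Read full `crs_stat` output (NAME=/STATE= lines) from stdin and print the
# commands suggested in steps 2 and 3. Nothing is executed.
crs_recovery_plan() {
    awk -F= '
        $1 == "NAME"  { name = $2 }
        $1 == "STATE" {
            split($2, s, " ")      # first word of the value is the state
            if (s[1] == "OFFLINE") {
                print "crs_start " name
            } else if (s[1] == "UNKNOWN") {
                print "crs_stop " name
                print "crs_start " name
            }
        }'
}

# Usage:
#   ./crs_stat | crs_recovery_plan
```

Printing the plan instead of running it leaves room to review the list before touching the cluster, and to fall back to the -f variants from step 4 by hand where needed.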
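III. Starting RAC manually
Normally all services start automatically whenever a node boots. To shut down or start up the cluster by hand, use the commands below.
Stopping RAC: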
emctl stop dbconsole
srvctl stop instance -d raw -i raw1
srvctl stop instance -d raw -i raw2
srvctl stop asm -n raw1
srvctl stop asm -n raw2
srvctl stop nodeapps -n raw1
srvctl stop nodeapps -n raw2
Starting RAC:
This is the stop sequence in reverse order:
srvctl start nodeapps -n raw1
srvctl start nodeapps -n raw2
srvctl start asm -n raw1
srvctl start asm -n raw2
srvctl start instance -d raw -i raw2
srvctl start instance -d raw -i raw1
emctl start dbconsole
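The ordering matters more than the individual commands: instances go down before ASM, ASM before nodeapps, and startup walks the same layers in reverse. The sketch below prints both sequences for an arbitrary database and node list. It assumes that each instance is named after its node, as with raw1 and raw2 here; the function names rac_stop_plan and rac_start_plan are hypothetical, and the functions only print commands.

```shell
# Print the commands for a full stop, layer by layer.
# Arguments: database name, then node names.
# Assumption: each instance is named after its node (raw1, raw2 in this post).
rac_stop_plan() {
    db=$1; shift
    echo "emctl stop dbconsole"
    for n in "$@"; do echo "srvctl stop instance -d $db -i $n"; done
    for n in "$@"; do echo "srvctl stop asm -n $n"; done
    for n in "$@"; do echo "srvctl stop nodeapps -n $n"; done
}

# Print the start sequence: the same layers in reverse order.
rac_start_plan() {
    db=$1; shift
    for n in "$@"; do echo "srvctl start nodeapps -n $n"; done
    for n in "$@"; do echo "srvctl start asm -n $n"; done
    for n in "$@"; do echo "srvctl start instance -d $db -i $n"; done
    echo "emctl start dbconsole"
}

# Usage:
#   rac_stop_plan raw raw1 raw2
#   rac_start_plan raw raw1 raw2
```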
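To start or stop all instances of a database together with their enabled services, pass the database name to srvctl (orcl in this example):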
srvctl start database -d orcl
srvctl stop database -d orcl
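Note: CRS resources include GSD (Global Services Daemon), ONS (Oracle Notification Service), VIP, Listener, Database, Instance, and Service. They fall into two groups:
GSD, ONS, VIP, and Listener belong to the Nodeapps group.
Database, Instance, and Service belong to the Database-Related Resource group.
For background on Oracle RAC concepts, see this post on my blog:
"RAC 的一些概念性和原理性的知识" (Some concepts and principles of RAC)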
Example:
[root@raw1 bin]# ./crs_stat -t
Name Type Target State Host
------------------------------------------------------------
ora.raw.db application ONLINE ONLINE raw1
ora.raw.raw.cs application ONLINE ONLINE raw1
ora....aw1.srv application ONLINE ONLINE raw1
ora....aw2.srv application ONLINE ONLINE raw2
ora....w1.inst application ONLINE ONLINE raw1
ora....w2.inst application ONLINE ONLINE raw2
ora....SM1.asm application ONLINE ONLINE raw1
ora....W1.lsnr application ONLINE ONLINE raw1
ora.raw1.gsd application ONLINE ONLINE raw1
ora.raw1.ons application ONLINE ONLINE raw1
ora.raw1.vip application ONLINE ONLINE raw1
ora....SM2.asm application ONLINE ONLINE raw2
ora....W2.lsnr application ONLINE ONLINE raw2
ora.raw2.gsd application ONLINE ONLINE raw2
ora.raw2.ons application ONLINE ONLINE raw2
ora.raw2.vip application ONLINE ONLINE raw2
[oracle@raw1 ~]$ emctl stop dbconsole
TZ set to PRC
Oracle Enterprise Manager 10g Database Control Release 10.2.0.1.0
Copyright (c) 1996, 2005 Oracle Corporation. All rights reserved.
http://raw1:1158/em/console/aboutApplication
Stopping Oracle Enterprise Manager 10g Database Control ...
... Stopped.
[oracle@raw1 ~]$ srvctl stop instance -d raw -i raw1
[oracle@raw1 ~]$ srvctl stop instance -d raw -i raw2
[oracle@raw1 ~]$ srvctl stop asm -n raw1
[oracle@raw1 ~]$ srvctl stop asm -n raw2
[oracle@raw1 ~]$ srvctl stop nodeapps -n raw1
[oracle@raw1 ~]$ srvctl stop nodeapps -n raw2
[oracle@raw1 bin]$ crs_stat -t
Name Type Target State Host
------------------------------------------------------------
ora.raw.db application OFFLINE OFFLINE
ora.raw.raw.cs application OFFLINE OFFLINE
ora....aw1.srv application OFFLINE OFFLINE
ora....aw2.srv application OFFLINE OFFLINE
ora....w1.inst application OFFLINE OFFLINE
ora....w2.inst application OFFLINE OFFLINE
ora....SM1.asm application OFFLINE OFFLINE
ora....W1.lsnr application OFFLINE OFFLINE
ora.raw1.gsd application OFFLINE OFFLINE
ora.raw1.ons application OFFLINE OFFLINE
ora.raw1.vip application OFFLINE OFFLINE
ora....SM2.asm application OFFLINE OFFLINE
ora....W2.lsnr application OFFLINE OFFLINE
ora.raw2.gsd application OFFLINE OFFLINE
ora.raw2.ons application OFFLINE OFFLINE
ora.raw2.vip application OFFLINE OFFLINE
IV. During startup, it is best to monitor the CRS, ASM, and database logs
CRS log:
[oracle@rac1 ~]$ tail -f /u01/app/oracle/product/10.2.0/crs_1/log/rac1/alertrac1.log
[oracle@rac2 ~]$ tail -f /u01/app/oracle/product/10.2.0/crs_1/log/rac2/alertrac2.log
ASM log:
[oracle@rac1 ~]$ tail -f /u01/app/oracle/admin/+ASM/bdump/alert_+ASM1.log
[oracle@rac2 ~]$ tail -f /u01/app/oracle/admin/+ASM/bdump/alert_+ASM2.log
Database log:
[oracle@rac1 ~]$ tail -f /u01/app/oracle/admin/ora9i/bdump/alert_ora9i1.log
[oracle@rac2 ~]$ tail -f /u01/app/oracle/admin/ora9i/bdump/alert_ora9i2.log
Note: tail -f follows a file while another process appends to it, so new log entries appear as they are written.
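The log paths above follow a regular pattern, so they can be built from the CRS home, the host name and the instance SID instead of being typed out per node. This is a small sketch, assuming the directory layout of this particular installation; the function names crs_alert_log and bdump_alert_log are hypothetical.

```shell
# Build alert log paths following the directory layout shown above.

# Arguments: CRS home, host name.
crs_alert_log() {
    echo "$1/log/$2/alert$2.log"
}

# Arguments: admin base directory, database name (or +ASM), instance SID.
bdump_alert_log() {
    echo "$1/$2/bdump/alert_$3.log"
}

# Usage on rac1 (GNU tail can follow several files at once):
#   tail -f "$(crs_alert_log /u01/app/oracle/product/10.2.0/crs_1 rac1)" \
#           "$(bdump_alert_log /u01/app/oracle/admin +ASM +ASM1)" \
#           "$(bdump_alert_log /u01/app/oracle/admin ora9i ora9i1)"
```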