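OCR and the voting disk are critical to RAC. The OCR records the cluster configuration, that is, the configuration of CRS resources such as database, ASM, instance, listener, and VIP. The voting disk records node membership: which nodes are members, plus node additions and removals. Both should be backed up as part of routine maintenance; the OCR is also backed up automatically. When the OCR or the voting disk is damaged, restore from a backup if one exists. Without a backup, the only option is to recreate them.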

 

Official documentation on recreating the OCR and voting disk:

       How to Recreate OCR / Voting Disk Accidentally Deleted [ID 399482.1]

       http://www.cndba.cn/Dave/article/962

 

For backup and recovery of the OCR and voting disk, see:

       Backup and Recovery of OCR and Voting Disk in Oracle 10g RAC

       http://www.cndba.cn/Dave/article/1183


First, back up the voting disk and the OCR.

 

[root@rac1 bin]# ./crs_stat -t

Name           Type           Target    State     Host       

------------------------------------------------------------

ora.orcl.db    application    ONLINE    ONLINE    rac2       

ora....oltp.cs application    ONLINE    ONLINE    rac2       

ora....cl1.srv application    ONLINE    ONLINE    rac1       

ora....cl2.srv application    ONLINE    ONLINE    rac2       

ora....l1.inst application    ONLINE    ONLINE    rac1       

ora....l2.inst application    ONLINE    ONLINE    rac2       

ora....SM1.asm application    ONLINE    ONLINE    rac1       

ora....C1.lsnr application    ONLINE    ONLINE    rac1       

ora.rac1.gsd   application    ONLINE    ONLINE    rac1       

ora.rac1.ons   application    ONLINE    ONLINE    rac1       

ora.rac1.vip   application    ONLINE    ONLINE    rac1       

ora....SM2.asm application    ONLINE    ONLINE    rac2       

ora....C2.lsnr application    ONLINE    ONLINE    rac2       

ora.rac2.gsd   application    ONLINE    ONLINE    rac2       

ora.rac2.ons   application    ONLINE    ONLINE    rac2       

ora.rac2.vip   application    ONLINE    ONLINE    rac2       

[root@rac1 bin]# ./crsctl query css votedisk

 0.     0    /dev/raw/raw3

 1.     0    /dev/raw/raw4

 2.     0    /dev/raw/raw5

located 3 votedisk(s)

[root@rac1 bin]# dd if=/dev/raw/raw3 of=/u01/votingdisk.bak

401562+0 records in

401562+0 records out

205599744 bytes (206 MB) copied, 1685.53 seconds, 122 kB/s

[root@rac1 u01]# cd /u01/app/oracle/product/crs/bin/

[root@rac1 bin]# ./ocrconfig -export /u01/ocr.bak

[root@rac1 bin]# ll /u01

total 202132

drwxrwxrwx 3 oracle oinstall      4096 Nov 30 17:08 app

-rwxr-xr-x 1 oracle oinstall   1043097 Nov 30 18:59 clsfmt.bin

-rw-r--r-- 1 root   root        103141 Dec  2 08:38 ocr.bak

-rwxr-xr-x 1 oracle oinstall      5542 Nov 30 19:00 srvctl

-rw-r--r-- 1 root   root     205599744 Dec  2 08:45 votingdisk.bak
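
The dd command above copies only one of the three voting disks. As a minimal sketch, the helper below pulls every device path out of the `crsctl query css votedisk` output so that each disk can be copied in turn. The function names `list_votedisks` and `backup_name` are hypothetical, and the parsing assumes the 10g output format shown above.

```shell
# list_votedisks: read `crsctl query css votedisk` output on stdin and
# print one voting disk path per line. Hypothetical helper; assumes lines
# of the form " 0.     0    /dev/raw/raw3".
list_votedisks() {
    awk '$1 ~ /^[0-9]+\.$/ && $3 ~ /^\// { print $3 }'
}

# backup_name: turn a device path into a flat backup file name,
# for example /dev/raw/raw3 -> votedisk_dev_raw_raw3.bak
backup_name() {
    printf 'votedisk%s.bak\n' "$(printf '%s' "$1" | tr '/' '_')"
}

# Possible usage as root from <CRS_HOME>/bin (not executed here):
#   ./crsctl query css votedisk | list_votedisks | while read -r dev; do
#       dd if="$dev" of="/u01/$(backup_name "$dev")"
#   done
```

Keeping the parsing separate from dd makes it easy to review the device list before anything is read from the raw devices.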

 

 

The recreation procedure is as follows:

 

1. Stop CRS on all nodes

[root@rac1 bin]# ./crsctl stop crs

Stopping resources.

Successfully stopped CRS resources

Stopping CSSD.

Shutting down CSS daemon.

Shutdown request successfully issued.

 

2. Back up the Clusterware home on every node

[root@rac1 bin]# cd /u01/app/oracle/product/

[root@rac1 product]# ls

10.2.0  crs

[root@rac1 product]# cp -rp crs crs_back
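
An alternative to copying the directory is to archive it. The sketch below wraps tar in a small function; the name `backup_home` and the destination path in the usage comment are assumptions for illustration, not part of the original procedure.

```shell
# backup_home SRC DEST: archive directory SRC into the gzip tarball DEST.
# Paths are stored relative to SRC's parent, so the archive unpacks as
# "crs/...". tar records ownership and permission bits; extract as root
# with -p to restore them.
backup_home() {
    tar -czf "$2" -C "$(dirname "$1")" "$(basename "$1")"
}

# Possible usage on each node (not executed here):
#   backup_home /u01/app/oracle/product/crs /u01/crs_home_backup.tar.gz
```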

 

3. Run <CRS_HOME>/install/rootdelete.sh on all nodes

[root@rac1 install]# pwd

/u01/app/oracle/product/crs/install

[root@rac1 install]# ./rootdelete.sh

Shutting down Oracle Cluster Ready Services (CRS):

Stopping resources.

Error while stopping resources. Possible cause: CRSD is down.

Stopping CSSD.

Unable to communicate with the CSS daemon.

Shutdown has begun. The daemons should exit soon.

Checking to see if Oracle CRS stack is down...

Oracle CRS stack is not running.

Oracle CRS stack is down now.

Removing script for Oracle Cluster Ready services

Updating ocr file for downgrade

Cleaning up SCR settings in '/etc/oracle/scls_scr'

 

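4. Run <CRS_HOME>/install/rootdeinstall.sh on the node the installation was performed from

       Because I ran the installation from node rac1, I run this command on that node as well. It only needs to be run on that one node.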

 

[root@rac1 install]# sh /u01/app/oracle/product/crs/install/rootdeinstall.sh

Removing contents from OCR mirror device

2560+0 records in

2560+0 records out

10485760 bytes (10 MB) copied, 108.972 seconds, 96.2 kB/s

Removing contents from OCR device

2560+0 records in

2560+0 records out

10485760 bytes (10 MB) copied, 89.2502 seconds, 117 kB/s

 

5. Check for CRS processes; if the commands return nothing, continue with the next step

[root@rac1 install]# ps -e | grep -i 'ocs[s]d'

[root@rac1 install]# ps -e | grep -i 'cr[s]d.bin'

[root@rac1 install]# ps -e | grep -i 'ev[m]d.bin'
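
The three checks above can be folded into one filter. This is a minimal sketch; the function name `crs_daemons_in` is hypothetical, and it matches the same daemon names as the grep commands above.

```shell
# crs_daemons_in: read `ps -e` output on stdin and print any lines that
# belong to the CSS, CRS or EVM daemons. Exit status 0 means at least one
# daemon line was found; 1 means none were found.
crs_daemons_in() {
    grep -iE 'ocssd|crsd\.bin|evmd\.bin'
}

# Possible usage (not executed here):
#   if ps -e | crs_daemons_in; then
#       echo "CRS daemons are still running; do not continue yet"
#   fi
```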

 

6. Run <CRS_HOME>/root.sh on the installation node (the node from step 4)

[root@rac1 crs]# /u01/app/oracle/product/crs/root.sh -- note: run as the root user

WARNING: directory '/u01/app/oracle/product' is not owned by root

WARNING: directory '/u01/app/oracle' is not owned by root

WARNING: directory '/u01/app' is not owned by root

WARNING: directory '/u01' is not owned by root

Checking to see if Oracle CRS stack is already configured

 

Setting the permissions on OCR backup directory

Setting up NS directories

Oracle Cluster Registry configuration upgraded successfully

WARNING: directory '/u01/app/oracle/product' is not owned by root

WARNING: directory '/u01/app/oracle' is not owned by root

WARNING: directory '/u01/app' is not owned by root

WARNING: directory '/u01' is not owned by root

assigning default hostname rac1 for node 1.

assigning default hostname rac2 for node 2.

Successfully accumulated necessary OCR keys.

Using ports: CSS=49895 CRS=49896 EVMC=49898 and EVMR=49897.

node <nodenumber>: <nodename> <private interconnect name> <hostname>

node 1: rac1 rac1-priv rac1

node 2: rac2 rac2-priv rac2

Creating OCR keys for user 'root', privgrp 'root'..

Operation successful.

Now formatting voting device: /dev/raw/raw3

Now formatting voting device: /dev/raw/raw4

Now formatting voting device: /dev/raw/raw5

Format of 3 voting devices complete.

Startup will be queued to init within 90 seconds.

Adding daemons to inittab

Expecting the CRS daemons to be up within 600 seconds.

CSS is active on these nodes.

        rac1

CSS is inactive on these nodes.

        rac2

Local node checking complete.

Run root.sh on remaining nodes to start CRS daemons.

 

 

7. Run <CRS_HOME>/root.sh on the remaining nodes

 

[root@rac2 crs]# /u01/app/oracle/product/crs/root.sh

WARNING: directory '/u01/app/oracle/product' is not owned by root

WARNING: directory '/u01/app/oracle' is not owned by root

WARNING: directory '/u01/app' is not owned by root

WARNING: directory '/u01' is not owned by root

Checking to see if Oracle CRS stack is already configured

 

Setting the permissions on OCR backup directory

Setting up NS directories

Oracle Cluster Registry configuration upgraded successfully

WARNING: directory '/u01/app/oracle/product' is not owned by root

WARNING: directory '/u01/app/oracle' is not owned by root

WARNING: directory '/u01/app' is not owned by root

WARNING: directory '/u01' is not owned by root

clscfg: EXISTING configuration version 3 detected.

clscfg: version 3 is 10G Release 2.

assigning default hostname rac1 for node 1.

assigning default hostname rac2 for node 2.

Successfully accumulated necessary OCR keys.

Using ports: CSS=49895 CRS=49896 EVMC=49898 and EVMR=49897.

node <nodenumber>: <nodename> <private interconnect name> <hostname>

node 1: rac1 rac1-priv rac1

node 2: rac2 rac2-priv rac2

clscfg: Arguments check out successfully.

 

NO KEYS WERE WRITTEN. Supply -force parameter to override.

-force is destructive and will destroy any previous cluster

configuration.

Oracle Cluster Registry for cluster has already been initialized

Startup will be queued to init within 90 seconds.

Adding daemons to inittab

Expecting the CRS daemons to be up within 600 seconds.

CSS is active on these nodes.

        rac1

        rac2

CSS is active on all nodes.

Waiting for the Oracle CRSD and EVMD to start

Waiting for the Oracle CRSD and EVMD to start

Waiting for the Oracle CRSD and EVMD to start

Waiting for the Oracle CRSD and EVMD to start

Waiting for the Oracle CRSD and EVMD to start

Waiting for the Oracle CRSD and EVMD to start

Waiting for the Oracle CRSD and EVMD to start

Oracle CRS stack installed and running under init(1M)

Running vipca(silent) for configuring nodeapps

Error 0(Native: listNetInterfaces:[3])

  [Error 0(Native: listNetInterfaces:[3])]

 

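An error is reported here. When root.sh runs on the last node it invokes vipca, and vipca failed because the network interfaces had not been configured. Configure the interfaces, then run vipca manually as root from an X session (Xmanager here).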

 

[root@rac1 bin]# ./oifcfg getif  -- no interface information is returned

[root@rac1 bin]# ./oifcfg iflist

eth1  192.168.6.0

virbr0  192.168.122.0

eth0  192.168.6.0

[root@rac1 bin]# ./oifcfg setif -global eth0/192.168.6.0:public -- note: this is the subnet address, ending in 0

[root@rac1 bin]# ./oifcfg setif -global eth1/192.168.6.0:cluster_interconnect

[root@rac1 bin]# ./oifcfg getif   -- verify the configuration

eth0  192.168.6.0  global  public

eth1  192.168.6.0  global  cluster_interconnect

[root@rac1 bin]#
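
oifcfg setif expects the subnet address rather than a host address, which is why the value ends in 0 on this /24 network. As a sketch of how that value is derived, the function below ANDs an IP address with its netmask; the name `network_of` and the sample host address 192.168.6.21 are assumptions for illustration.

```shell
# network_of IP NETMASK: print the network (subnet) address by AND-ing
# each octet of the IP with the matching octet of the netmask.
network_of() {
    old_ifs=$IFS
    IFS=.
    set -- $1 $2          # split both dotted quads into $1..$8
    IFS=$old_ifs
    printf '%d.%d.%d.%d\n' \
        "$(( $1 & $5 ))" "$(( $2 & $6 ))" "$(( $3 & $7 ))" "$(( $4 & $8 ))"
}

# Possible usage (not executed here):
#   ./oifcfg setif -global eth0/$(network_of 192.168.6.21 255.255.255.0):public
```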

 

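       Once the interfaces are configured, run vipca as root on any one node. vipca is a GUI tool and needs X support, so I use Xmanager; any other X server works as well. When it finishes, the VIP, ONS, and GSD nodeapps have been created.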

 

[root@rac1 bin]# ./crs_stat -t

Name           Type           Target    State     Host       

------------------------------------------------------------

ora.rac1.gsd   application    ONLINE    ONLINE    rac1       

ora.rac1.ons   application    ONLINE    ONLINE    rac1       

ora.rac1.vip   application    ONLINE    ONLINE    rac1       

ora.rac2.gsd   application    ONLINE    ONLINE    rac2       

ora.rac2.ons   application    ONLINE    ONLINE    rac2       

ora.rac2.vip   application    ONLINE    ONLINE    rac2       

 

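8. Configure the listener (netca)

(Recreating the listener writes the listener information into the OCR.) First move the existing listener.ora out of the way on each node: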

 

[oracle@rac1

[oracle@rac2 ~]$ mv $TNS_ADMIN/listener.ora /tmp/listener.ora.original

 

Then, in Xmanager, run netca as the oracle user. This is also a GUI tool.

 

[root@rac1 bin]# ./crs_stat -t

Name           Type           Target    State     Host       

------------------------------------------------------------

ora....C1.lsnr application    ONLINE    ONLINE    rac1       

ora.rac1.gsd   application    ONLINE    ONLINE    rac1       

ora.rac1.ons   application    ONLINE    ONLINE    rac1       

ora.rac1.vip   application    ONLINE    ONLINE    rac1       

ora....C2.lsnr application    ONLINE    ONLINE    rac2       

ora.rac2.gsd   application    ONLINE    ONLINE    rac2       

ora.rac2.ons   application    ONLINE    ONLINE    rac2       

ora.rac2.vip   application    ONLINE    ONLINE    rac2       

 

 

9. Configure ONS (racgons)


For 10g, use:

<CRS_HOME>/bin/racgons add_config hostname1:port hostname2:port

 

[oracle@rac1 bin]$ pwd

/u01/app/oracle/product/crs/bin

[oracle@rac1 bin]$ racgons add_config rac1:6251 rac2:6251

 

For 11g, use:

<CRS_HOME>/install/onsconfig add_config hostname1:port hostname2:port

[oracle@rac1 bin]$ onsconfig add_config rac1:6251 rac2:6251

 

Verify the configuration:

[oracle@rac1 bin]$ onsctl ping

Number of onsconfiguration retrieved, numcfg = 2

onscfg[0]

   {node = rac1, port = 6251}

Adding remote host rac1:6251

onscfg[1]

   {node = rac2, port = 6251}

Adding remote host rac2:6251

ons is running ...

 

If ONS is not running, start it with onsctl start.
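
With more than two nodes, typing the host:port pairs by hand becomes error-prone. A minimal sketch that builds the argument list from a port and a list of node names; the function name `ons_args` is hypothetical.

```shell
# ons_args PORT NODE...: print "node:port" pairs separated by single
# spaces, in the form racgons add_config expects as arguments.
ons_args() {
    port=$1
    shift
    out=
    for node in "$@"; do
        out="${out:+$out }$node:$port"
    done
    printf '%s\n' "$out"
}

# Possible usage (not executed here):
#   racgons add_config $(ons_args 6251 rac1 rac2)
```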

 

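10. Add the other resources to the OCR

Note: the names used for registration must match those used in the original installation, and they are case-sensitive.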

 

ASM

Syntax: srvctl add asm -n <node_name> -i <asm_instance_name> -o <oracle_home>

 

[oracle@rac1 bin]$ echo $ORACLE_HOME

/u01/app/oracle/product/10.2.0/db_1

[oracle@rac1 bin]$ srvctl add asm -n rac1 -i +ASM1 -o $ORACLE_HOME

[oracle@rac1 bin]$ srvctl add asm -n rac2 -i +ASM2 -o /u01/app/oracle/product/10.2.0/db_1

 

DATABASE

Syntax: srvctl add database -d <db_unique_name> -o <oracle_home>

[oracle@rac1 bin]$ srvctl add database -d orcl -o /u01/app/oracle/product/10.2.0/db_1

 

INSTANCE

Syntax: srvctl add instance -d <db_unique_name> -i <instance_name> -n <node_name>

[oracle@rac1 bin]$ srvctl add instance -d orcl -i orcl1 -n rac1

[oracle@rac1 bin]$ srvctl add instance -d orcl -i orcl2 -n rac2

 

 

SERVICE

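Syntax: srvctl add service -d <db_unique_name> -s <service_name> -r <preferred_list> -P <TAF_policy>

       -r preferred_list is the list of instances the service uses first; -a can also be given to list the available (standby) instances

       TAF_policy can be set to NONE, BASIC, or PRECONNECT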

[oracle@rac1 bin]$ srvctl add service -d orcl -s oltp -r orcl1,orcl2 -P BASIC
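
Because the registered names must match the original installation exactly, it can help to generate the commands and review them before running anything. The sketch below only prints the srvctl commands. The function name `gen_srvctl_adds` is hypothetical, and it assumes instances are named <db><n> and +ASM<n> in node order, as in this cluster.

```shell
# gen_srvctl_adds DB ORACLE_HOME NODE...: print (do not run) the srvctl
# commands that register ASM, the database, and one instance per node.
# Instance names are assumed to be <db><n> and +ASM<n>, numbered in the
# order the nodes are given.
gen_srvctl_adds() {
    db=$1
    home=$2
    shift 2
    n=0
    for node in "$@"; do
        n=$((n + 1))
        echo "srvctl add asm -n $node -i +ASM$n -o $home"
    done
    echo "srvctl add database -d $db -o $home"
    n=0
    for node in "$@"; do
        n=$((n + 1))
        echo "srvctl add instance -d $db -i $db$n -n $node"
    done
}

# Possible usage (not executed here):
#   gen_srvctl_adds orcl "$ORACLE_HOME" rac1 rac2
```

Services are left out on purpose, since their preferred and available instance lists are specific to each service.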

After adding the resources, check the result:

[oracle@rac1 bin]$ crs_stat -t
Name           Type           Target    State     Host
------------------------------------------------------------
ora.orcl.db    application    OFFLINE   OFFLINE
ora....oltp.cs application    OFFLINE   OFFLINE
ora....cl1.srv application    OFFLINE   OFFLINE
ora....cl2.srv application    OFFLINE   OFFLINE
ora....l1.inst application    OFFLINE   OFFLINE
ora....l2.inst application    OFFLINE   OFFLINE
ora....SM1.asm application    OFFLINE   OFFLINE
ora....C1.lsnr application    ONLINE    ONLINE    rac1
ora.rac1.gsd   application    ONLINE    ONLINE    rac1
ora.rac1.ons   application    ONLINE    ONLINE    rac1
ora.rac1.vip   application    ONLINE    ONLINE    rac1
ora....SM2.asm application    OFFLINE   OFFLINE
ora....C2.lsnr application    ONLINE    ONLINE    rac2
ora.rac2.gsd   application    ONLINE    ONLINE    rac2
ora.rac2.ons   application    ONLINE    ONLINE    rac2
ora.rac2.vip   application    ONLINE    ONLINE    rac2

11. Start the resources and verify

[oracle@rac1 bin]$ srvctl start asm -n rac1

[oracle@rac1 bin]$ srvctl start asm -n rac2

[oracle@rac1 bin]$ srvctl start database -d orcl

[oracle@rac1 bin]$ srvctl start service -d orcl

[root@rac1 bin]# ./crs_stat -t
Name           Type           Target    State     Host
------------------------------------------------------------
ora.orcl.db    application    ONLINE    ONLINE    rac1
ora....oltp.cs application    ONLINE    ONLINE    rac2
ora....cl1.srv application    ONLINE    ONLINE    rac1
ora....cl2.srv application    ONLINE    ONLINE    rac2
ora....l1.inst application    ONLINE    ONLINE    rac1
ora....l2.inst application    ONLINE    ONLINE    rac2
ora....SM1.asm application    ONLINE    ONLINE    rac1
ora....C1.lsnr application    ONLINE    ONLINE    rac1
ora.rac1.gsd   application    ONLINE    ONLINE    rac1
ora.rac1.ons   application    ONLINE    ONLINE    rac1
ora.rac1.vip   application    ONLINE    ONLINE    rac1
ora....SM2.asm application    ONLINE    ONLINE    rac2
ora....C2.lsnr application    ONLINE    ONLINE    rac2
ora.rac2.gsd   application    ONLINE    ONLINE    rac2
ora.rac2.ons   application    ONLINE    ONLINE    rac2
ora.rac2.vip   application    ONLINE    ONLINE    rac2

[oracle@rac1 bin]$ cluvfy stage -post crsinst -n rac1,rac2
Performing post-checks for cluster services setup
Checking node reachability...
Node reachability check passed from node "rac1".

Checking user equivalence...
User equivalence check passed for user "oracle".

Checking Cluster manager integrity...

Checking CSS daemon...
Daemon status check passed for "CSS daemon".

Cluster manager integrity check passed.

Checking cluster integrity...

Cluster integrity check passed

Checking OCR integrity...

Checking the absence of a non-clustered configuration...
All nodes free of non-clustered, local-only configurations.

Uniqueness check for OCR device passed.

Checking the version of OCR...
OCR of correct Version "2" exists.

Checking data integrity of OCR...
Data integrity check for OCR passed.

OCR integrity check passed.

Checking CRS integrity...

Checking daemon liveness...
Liveness check passed for "CRS daemon".

Checking daemon liveness...
Liveness check passed for "CSS daemon".

Checking daemon liveness...
Liveness check passed for "EVM daemon".

Checking CRS health...
CRS health check passed.

CRS integrity check passed.

Checking node application existence...

Checking existence of VIP node application (required)
Check passed.

Checking existence of ONS node application (optional)
Check passed.

Checking existence of GSD node application (optional)
Check passed.

Post-check for cluster services setup was successful.
[oracle@rac1 bin]$
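
Scanning the crs_stat -t table by eye is easy to get wrong on larger clusters. The sketch below lists every resource whose State column is not ONLINE; the function name `offline_resources` is hypothetical, and it assumes the five-column 10g layout shown above (Name, Type, Target, State, Host).

```shell
# offline_resources: read `crs_stat -t` output on stdin and print the name
# of every resource whose State column (4th field) is not ONLINE.
# Header and separator lines are skipped because they do not start with
# "ora.".
offline_resources() {
    awk '$1 ~ /^ora\./ && $4 != "ONLINE" { print $1 }'
}

# Possible usage (not executed here):
#   crs_stat -t | offline_resources
```

An empty result means every registered resource reports ONLINE.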


The recreation is complete.