1. OHASD Overview

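Oracle Restart is a new feature in Oracle 11g (introduced in 11gR2). Before looking at it, let's review the Oracle 11g RAC processes, which an earlier blog post covered:

Oracle 11gR2 RAC Process Overview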


Oracle 11gR2 reclassifies the resources managed by CRSD into two groups: Local Resources and Cluster Resources.

 

[grid@rac2 ~]$ crsctl stat res -t

--------------------------------------------------------------------------------

NAME           TARGET  STATE       SERVER                  STATE_DETAILS      

--------------------------------------------------------------------------------

Local Resources

--------------------------------------------------------------------------------

ora.DATA.dg

               ONLINE  ONLINE      rac1                                        

               ONLINE  ONLINE      rac2                                        

ora.FRA.dg

               ONLINE  ONLINE      rac1                                        

               ONLINE  ONLINE      rac2                                        

ora.LISTENER.lsnr

               ONLINE  ONLINE      rac1                                        

               ONLINE  ONLINE      rac2                                        

ora.OCRVOTING.dg

               ONLINE  ONLINE      rac1                                         

               ONLINE  ONLINE      rac2                                        

ora.asm

               ONLINE  ONLINE      rac1                    Started            

               ONLINE  ONLINE      rac2                    Started            

ora.gsd

               OFFLINE OFFLINE      rac1                                        

               OFFLINE OFFLINE      rac2                                        

ora.net1.network

               ONLINE  ONLINE      rac1                                         

               ONLINE  ONLINE      rac2                                        

ora.ons

               ONLINE  ONLINE      rac1                                        

               ONLINE  ONLINE      rac2                                        

--------------------------------------------------------------------------------

Cluster Resources

--------------------------------------------------------------------------------

ora.LISTENER_SCAN1.lsnr

     1        ONLINE  ONLINE      rac1                                        

ora.cvu

     1        ONLINE  ONLINE      rac1                                        

ora.oc4j

     1        ONLINE  ONLINE      rac1                                         

ora.rac1.vip

     1        ONLINE  ONLINE      rac1                                        

ora.rac2.vip

     1        ONLINE  ONLINE      rac2                                        

ora.scan1.vip

     1        ONLINE  ONLINE      rac1                                        

ora.sdd.db

     1        ONLINE  ONLINE      rac1                     Open               

     2        ONLINE  ONLINE      rac2                     Open               

[grid@rac2 ~]$

 

 

 

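An earlier blog post noted that Oracle's commands are organized in layers:

Oracle RAC Common Maintenance Tools and Commands

http://www.cndba.cn/Dave/article/1015

Mapped onto those layers:

Local Resources belong to the application layer;

Cluster Resources belong to the cluster layer.

Oracle Restart, the subject of this post, is a mechanism for managing Cluster Resources.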

 

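When Oracle 10g RAC is installed, running root.sh appends three entries to the end of /etc/inittab: init.evmd, init.cssd, and init.crsd. From then on, Clusterware starts automatically on every boot. If the EVMD or CRSD process fails, the system restarts that process automatically; if the CSSD process fails, the node reboots immediately.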

 

In Oracle 11gR2, by contrast, only ohasd is written to /etc/inittab.
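
The inittab entry described above can be checked directly. The following is a minimal sketch and not part of the original post: the function name find_ohasd_entry is hypothetical, and it assumes a SysV-style /etc/inittab exists (on upstart- or systemd-based systems the OHASD startup hook lives elsewhere and this check does not apply).

```shell
#!/bin/sh
# Hypothetical helper: print the non-comment lines of an inittab-style file
# that reference the init.ohasd script. Defaults to /etc/inittab.
# Returns non-zero when no such line is found.
find_ohasd_entry() {
    inittab_file="${1:-/etc/inittab}"
    # Drop comment lines first, then look for the init.ohasd script.
    grep -v '^[ 	]*#' "$inittab_file" | grep 'init\.ohasd'
}

# Example (run on the database host):
#   find_ohasd_entry || echo "no OHASD entry found in /etc/inittab"
```

On an 11gR2 Linux host that still uses inittab, the matching line is typically a respawn entry that runs /etc/init.d/init.ohasd, which agrees with the init.ohasd process visible in the ps output later in this section.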

 

The official documentation describes OHASD as follows:

Oracle High Availability Services Daemon (OHASD): This process anchors the lower part of the Oracle Clusterware stack, which consists of processes that facilitate cluster operations.

 

The following command lists the resources managed by OHASD:

[grid@rac2 ~]$ crsctl stat res -init -t

 

 

[grid@rac2 ~]$ ps -ef|grep ohasd

root     1057     1  0 Dec21 ?        00:00:00 /bin/sh /etc/init.d/init.ohasd run

root     2274     1  0 Dec21 ?        00:22:53 /u01/app/grid/11.2.0/bin/ohasd.bin reboot

 

 

 

2. Oracle Restart Overview

2.1 Overview

The official documentation:

About Oracle Restart

http://docs.oracle.com/cd/E11882_01/server.112/e10595/restart001.htm

Oracle Restart improves the availability of your Oracle database. When you install Oracle Restart, various Oracle components can be automatically restarted after a hardware or software failure or whenever your database host computer restarts.

    --That is, once Oracle Restart is installed, the components it manages start again automatically after a hardware or software problem or a host reboot.

 

Oracle Restart manages the following components:

Component: Notes

Database instance: Oracle Restart can accommodate multiple databases on a single host computer.

Oracle Net listener: -

Database services: Does not include the default service created upon installation because it is automatically managed by Oracle Database, and does not include any default services created during database creation.

Oracle Automatic Storage Management (Oracle ASM) instance: -

Oracle ASM disk groups: Restarting a disk group means mounting it.

Oracle Notification Services (ONS): In a standalone server environment, ONS can be used in Oracle Data Guard installations for automating failover of connections between primary and standby database through Fast Application Notification (FAN). ONS is a service for sending FAN events to integrated clients upon failover.

 

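    Oracle Restart periodically checks and monitors the state of these components. If it finds that a component has failed, it shuts the component down and restarts it.

Oracle Restart is used only in non-clustered (standalone server) environments. In an Oracle RAC environment, Oracle Clusterware provides the automatic restart capability.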

 

 

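In a non-clustered environment you only need to install Oracle Grid Infrastructure. During installation choose "Install Grid Infrastructure Software Only", then run the following script to install Oracle Restart: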

$GRID_HOME/crs/install/roothas.pl

 

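    If Oracle Restart is installed first and the instance is then created with DBCA, DBCA automatically adds the database to the Oracle Restart configuration. When DBCA starts the database, dependencies are established between the database and other components (such as disk groups), and Oracle Restart begins managing the database.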

 

    Once Oracle Restart is installed, some create operations automatically create the Oracle component and add it to the Oracle Restart configuration. These operations are listed below:

Create Operation: Created Component Automatically Added to Oracle Restart Configuration?

Create a database with OUI or DBCA: Yes

Create a database with the CREATE DATABASE SQL statement: No

Create an Oracle ASM instance with OUI, DBCA, or ASMCA: Yes

Create a disk group (any method): Yes

Add a listener with NETCA: Yes

Create a database service with SRVCTL: Yes

Create a database service by modifying the SERVICE_NAMES initialization parameter: No

Create a database service with DBMS_SERVICE.CREATE_SERVICE: No

Create a standby database: No

 

 

Likewise, some delete/drop/remove operations automatically update the Oracle Restart configuration, as listed below:

Operation: Deleted Component Automatically Removed from Oracle Restart Configuration?

Delete a database with DBCA: Yes

Delete a database by removing database files with operating system commands: No

Delete a listener with NETCA: Yes

Drop an Oracle ASM disk group (any method): Yes

Delete a database service with SRVCTL: Yes

Delete a database service by any other means: No

 

 

Oracle Restart is managed by the OHASD process, which is why Section 1 introduced OHASD. On a standalone server, OHASD manages Oracle Restart and does not need the CRSD process. OHASD can manage the following components:

1. CSSD: This is used for Group Services as it was in previous releases (when it was installed using "localconfig add")

2. ASM Instance: if Automatic Storage Management is used.

3. ASM Disk Groups: if Automatic Storage Management is used.

4. Listeners

5. Database Instances

6. Database Services

7. ONS/EONS: Used for automatic failover of connections using Fast Application Notification (FAN) in a Data Guard environment

 

OHASD is a background daemon that starts and monitors the Oracle Restart processes. It is initialized by the /etc/init.d/ohasd script and started by the root user running ohasd.bin.

 

 

Using Oracle Restart has the following benefits:

1.  Automatic resource startup at boot time without using shell scripts or the Oracle supplied dbstart and dbshut scripts.

2.  Resources are started in the correct sequence based on dependencies in the OLR (Oracle Local Registry).

3.  Resources are also monitored by ohasd for availability and may be restarted in place if they fail.

4.  Role managed services for Data Guard.

5.  Consistency of command line interfaced tools using crsctl and srvctl as is done with clusters.

 

 

2.2 Managing the Oracle Restart Stack with CRSCTL

The official documentation:

Stopping and Restarting Oracle Restart for Maintenance Operations

http://docs.oracle.com/cd/E11882_01/server.112/e10595/restart004.htm

CRSCTL Command Reference

http://docs.oracle.com/cd/E11882_01/server.112/e25494/restart006.htm

CRSCTL command options:

Command: Description

check: Displays the Oracle Restart status.

config: Displays the Oracle Restart configuration.

disable: Disables automatic restart of Oracle Restart.

enable: Enables automatic restart of Oracle Restart.

start: Starts Oracle Restart.

stop: Stops Oracle Restart.

 

 

Note:

The following operations must be run as the root user.

2.2.1 Stopping Oracle Restart manually: crsctl stop has [-f]

Note: this command affects only the current server.



[root@rac1 bin]# ./crsctl stop has

CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on 'rac1'

CRS-2673: Attempting to stop 'ora.crsd' on 'rac1'

CRS-2790: Starting shutdown of Cluster Ready Services-managed resources on 'rac1'

CRS-2673: Attempting to stop 'ora.LISTENER_SCAN1.lsnr' on 'rac1'

CRS-2673: Attempting to stop 'ora.OCRVOTING.dg' on 'rac1'

CRS-2673: Attempting to stop 'ora.sdd.db' on 'rac1'

CRS-2673: Attempting to stop 'ora.LISTENER.lsnr' on 'rac1'

CRS-2673: Attempting to stop 'ora.oc4j' on 'rac1'

CRS-2673: Attempting to stop 'ora.cvu' on 'rac1'

CRS-2677: Stop of 'ora.LISTENER_SCAN1.lsnr' on 'rac1' succeeded

CRS-2673: Attempting to stop 'ora.scan1.vip' on 'rac1'

CRS-2677: Stop of 'ora.LISTENER.lsnr' on 'rac1' succeeded

CRS-2673: Attempting to stop 'ora.rac1.vip' on 'rac1'

CRS-2677: Stop of 'ora.rac1.vip' on 'rac1' succeeded

CRS-2672: Attempting to start 'ora.rac1.vip' on 'rac2'

CRS-2677: Stop of 'ora.scan1.vip' on 'rac1' succeeded

CRS-2672: Attempting to start 'ora.scan1.vip' on 'rac2'

CRS-2676: Start of 'ora.scan1.vip' on 'rac2' succeeded

CRS-2676: Start of 'ora.rac1.vip' on 'rac2' succeeded

CRS-2672: Attempting to start 'ora.LISTENER_SCAN1.lsnr' on 'rac2'

CRS-2677: Stop of 'ora.sdd.db' on 'rac1' succeeded

CRS-2673: Attempting to stop 'ora.DATA.dg' on 'rac1'

CRS-2673: Attempting to stop 'ora.FRA.dg' on 'rac1'

CRS-2676: Start of 'ora.LISTENER_SCAN1.lsnr' on 'rac2' succeeded

CRS-2677: Stop of 'ora.FRA.dg' on 'rac1' succeeded

CRS-2677: Stop of 'ora.DATA.dg' on 'rac1' succeeded

CRS-2677: Stop of 'ora.oc4j' on 'rac1' succeeded

CRS-2672: Attempting to start 'ora.oc4j' on 'rac2'

CRS-2677: Stop of 'ora.cvu' on 'rac1' succeeded

CRS-2672: Attempting to start 'ora.cvu' on 'rac2'

CRS-2676: Start of 'ora.cvu' on 'rac2' succeeded

CRS-2677: Stop of 'ora.OCRVOTING.dg' on 'rac1' succeeded

CRS-2673: Attempting to stop 'ora.asm' on 'rac1'

CRS-2677: Stop of 'ora.asm' on 'rac1' succeeded

CRS-2676: Start of 'ora.oc4j' on 'rac2' succeeded

CRS-2673: Attempting to stop 'ora.ons' on 'rac1'

CRS-2677: Stop of 'ora.ons' on 'rac1' succeeded

CRS-2673: Attempting to stop 'ora.net1.network' on 'rac1'

CRS-2677: Stop of 'ora.net1.network' on 'rac1' succeeded

CRS-2792: Shutdown of Cluster Ready Services-managed resources on 'rac1' has completed

CRS-2677: Stop of 'ora.crsd' on 'rac1' succeeded

CRS-2673: Attempting to stop 'ora.mdnsd' on 'rac1'

CRS-2673: Attempting to stop 'ora.ctssd' on 'rac1'

CRS-2673: Attempting to stop 'ora.evmd' on 'rac1'

CRS-2673: Attempting to stop 'ora.asm' on 'rac1'

CRS-2677: Stop of 'ora.evmd' on 'rac1' succeeded

CRS-2677: Stop of 'ora.mdnsd' on 'rac1' succeeded

CRS-2677: Stop of 'ora.ctssd' on 'rac1' succeeded

CRS-2677: Stop of 'ora.asm' on 'rac1' succeeded

CRS-2673: Attempting to stop 'ora.cluster_interconnect.haip' on 'rac1'

CRS-2677: Stop of 'ora.cluster_interconnect.haip' on 'rac1' succeeded

CRS-2673: Attempting to stop 'ora.cssd' on 'rac1'

CRS-2677: Stop of 'ora.cssd' on 'rac1' succeeded

CRS-2673: Attempting to stop 'ora.crf' on 'rac1'

CRS-2677: Stop of 'ora.crf' on 'rac1' succeeded

CRS-2673: Attempting to stop 'ora.gipcd' on 'rac1'

CRS-2677: Stop of 'ora.gipcd' on 'rac1' succeeded

CRS-2673: Attempting to stop 'ora.gpnpd' on 'rac1'

CRS-2677: Stop of 'ora.gpnpd' on 'rac1' succeeded

CRS-2793: Shutdown of Oracle High Availability Services-managed resources on 'rac1' has completed

CRS-4133: Oracle High Availability Services has been stopped.

[root@rac1 bin]#

 


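Note:

       This test was run in an Oracle 11gR2 RAC environment. Running the command on node 1 stopped only the processes on node 1 and moved the relevant resources to node 2, which confirms the point made above: the command affects only the current server.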

 

2.2.2 Starting HAS

[root@rac1 bin]# ./crsctl start has

CRS-4123: Oracle High Availability Services has been started.

[root@rac1 bin]#

 

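The output shows only that HAS was started. In fact, the resources managed by Oracle Restart are started afterwards as well. You can verify this with the crs_stat command, but Oracle 11g brings these processes up fairly slowly, so be patient.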
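
Because startup is slow, it can help to poll instead of checking by hand. The sketch below is an illustration only: wait_for_has is a hypothetical helper name, and the attempt count and interval are arbitrary. It relies on the CRS-4638 message that crsctl check has prints when HAS is online (shown in section 2.2.6). Note that CRS-4638 only means the HAS daemon itself is up; the managed resources may still be starting.

```shell
#!/bin/sh
# Hypothetical helper: run a status command repeatedly until its output
# contains CRS-4638 (HAS online), or give up after a number of attempts.
#   $1 = maximum attempts, $2 = seconds between attempts,
#   remaining arguments = the status command to run.
wait_for_has() {
    max_attempts="$1"
    interval="$2"
    shift 2
    attempt=1
    while [ "$attempt" -le "$max_attempts" ]; do
        if "$@" 2>/dev/null | grep -q 'CRS-4638'; then
            return 0
        fi
        attempt=$((attempt + 1))
        sleep "$interval"
    done
    return 1
}

# Example (as root; the Grid home path is the one used in this post):
#   wait_for_has 30 10 /u01/app/grid/11.2.0/bin/crsctl check has \
#       && echo "HAS is online" || echo "HAS did not come online in time"
```

Passing the status command as arguments keeps the loop independent of where crsctl is installed.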

 

[root@rac1 u01]# sh crs_stat.sh

Name                           Target     State     Host     

------------------------------ -------------------  -------  

ora.DATA.dg                    ONLINE     ONLINE    rac1     

ora.FRA.dg                     ONLINE    ONLINE     rac1     

ora.LISTENER.lsnr              ONLINE     ONLINE    rac1     

ora.LISTENER_SCAN1.lsnr        ONLINE     ONLINE    rac2     

ora.OCRVOTING.dg               ONLINE     ONLINE    rac1     

ora.asm                        ONLINE     ONLINE    rac1     

ora.cvu                        ONLINE     ONLINE    rac2     

ora.gsd                        OFFLINE    OFFLINE             

ora.net1.network               ONLINE     ONLINE    rac1     

ora.oc4j                       ONLINE     ONLINE    rac2     

ora.ons                        ONLINE     ONLINE    rac1     

ora.rac1.ASM1.asm              ONLINE     ONLINE    rac1     

ora.rac1.LISTENER_RAC1.lsnr    ONLINE    ONLINE     rac1     

ora.rac1.gsd                   OFFLINE    OFFLINE             

ora.rac1.ons                   ONLINE     ONLINE    rac1     

ora.rac1.vip                   ONLINE     ONLINE    rac1     

ora.rac2.ASM2.asm              ONLINE     ONLINE    rac2     

ora.rac2.LISTENER_RAC2.lsnr    ONLINE    ONLINE     rac2     

ora.rac2.gsd                   OFFLINE    OFFLINE             

ora.rac2.ons                   ONLINE     ONLINE    rac2     

ora.rac2.vip                   ONLINE     ONLINE    rac2     

ora.scan1.vip                  ONLINE     ONLINE    rac2     

ora.sdd.db                     ONLINE     ONLINE    rac2  

 

 

2.2.3 Disabling HAS (Restart) autostart after a server reboot

[root@rac1 bin]# ./crsctl disable has

CRS-4621: Oracle High Availability Services autostart is disabled.

[root@rac1 bin]#

 

2.2.4 Viewing the HAS (Restart) autostart configuration

[root@rac1 bin]# ./crsctl config has

CRS-4621: Oracle High Availability Services autostart is disabled.

 

2.2.5 Enabling HAS (Restart) autostart after a server reboot

[root@rac1 bin]# ./crsctl enable has

CRS-4622: Oracle High Availability Services autostart is enabled.

--Check the HAS configuration to verify the effect of the command:

[root@rac1 bin]# ./crsctl config has

CRS-4622: Oracle High Availability Services autostart is enabled.

[root@rac1 bin]#
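
The two message codes above (CRS-4621 for disabled, CRS-4622 for enabled) make the autostart setting easy to check from a script. This is a minimal sketch under that assumption; has_autostart_state is a hypothetical name, and it only inspects text on standard input rather than calling crsctl itself.

```shell
#!/bin/sh
# Hypothetical helper: read the output of `crsctl config has` on stdin
# and print "enabled", "disabled", or "unknown".
# Returns 0 when a known message code was seen, 1 otherwise.
has_autostart_state() {
    while IFS= read -r line; do
        case "$line" in
            CRS-4622*) echo enabled;  return 0 ;;
            CRS-4621*) echo disabled; return 0 ;;
        esac
    done
    echo unknown
    return 1
}

# Example (as root, from the Grid home bin directory):
#   ./crsctl config has | has_autostart_state
```

Matching on the message code rather than the message text avoids depending on the exact wording of the message.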

 

2.2.6 Viewing the current Restart status

[root@rac1 bin]# ./crsctl check has

CRS-4638: Oracle High Availability Services is online

 

 

2.2.7 Viewing the status of the resources managed by OHASD in Oracle Restart

[root@rac1 bin]# ./crsctl stat res -t

--------------------------------------------------------------------------------

NAME           TARGET  STATE       SERVER                  STATE_DETAILS      

--------------------------------------------------------------------------------

Local Resources

--------------------------------------------------------------------------------

ora.DATA.dg

               ONLINE  ONLINE      rac1                                        

               ONLINE  ONLINE      rac2                                        

ora.FRA.dg

               ONLINE  ONLINE      rac1                                        

               ONLINE  ONLINE      rac2                                        

ora.LISTENER.lsnr

               ONLINE  ONLINE      rac1                                         

               ONLINE  ONLINE      rac2                                        

ora.OCRVOTING.dg

               ONLINE  ONLINE      rac1                                        

               ONLINE  ONLINE      rac2                                         

ora.asm

               ONLINE  ONLINE      rac1                    Started            

               ONLINE  ONLINE      rac2                    Started            

ora.gsd

               OFFLINE OFFLINE      rac1                                         

               OFFLINE OFFLINE      rac2                                        

ora.net1.network

               ONLINE  ONLINE      rac1                                        

               ONLINE  ONLINE      rac2                                         

ora.ons

               ONLINE  ONLINE      rac1                                        

               ONLINE  ONLINE      rac2                                        

--------------------------------------------------------------------------------

Cluster Resources

--------------------------------------------------------------------------------

ora.LISTENER_SCAN1.lsnr

     1        ONLINE  ONLINE      rac2                                        

ora.cvu

     1        ONLINE  ONLINE      rac2                                        

ora.oc4j

     1        ONLINE  ONLINE      rac2                                        

ora.rac1.vip

     1        ONLINE  ONLINE      rac1                                         

ora.rac2.vip

     1        ONLINE  ONLINE      rac2                                        

ora.scan1.vip

     1        ONLINE  ONLINE      rac2                                        

ora.sdd.db

     1        ONLINE  ONLINE      rac1                     Open               

     2        ONLINE  ONLINE      rac2                     Open               

[root@rac1 bin]#
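
With many resources, the listing above is tedious to scan by eye. The following sketch filters it down to resources whose TARGET is ONLINE but whose STATE is something else. It is an illustration, not an Oracle tool: list_unhealthy_resources is a hypothetical name, and it assumes the layout shown above (resource names start with "ora." at the beginning of a line, state rows are indented, and cluster resource rows start with an instance number).

```shell
#!/bin/sh
# Hypothetical helper: read `crsctl stat res -t` output on stdin and print
# "name state server" for every row with TARGET=ONLINE and STATE!=ONLINE.
# The server field may be empty or wrong when the row has no SERVER column.
list_unhealthy_resources() {
    awk '
        # A resource name line: remember it for the rows that follow.
        /^ora\./ { name = $1; next }
        # An indented state row belonging to the current resource.
        /^[ \t]/ && name != "" {
            if ($1 ~ /^[0-9]+$/) {
                # Cluster resource row: instance number comes first.
                target = $2; state = $3; server = $4
            } else {
                # Local resource row.
                target = $1; state = $2; server = $3
            }
            if (target == "ONLINE" && state != "ONLINE") {
                print name, state, server
            }
        }
    '
}

# Example (as root, from the Grid home bin directory):
#   ./crsctl stat res -t | list_unhealthy_resources
```

Resources that are intentionally offline, such as ora.gsd above (TARGET and STATE both OFFLINE), are not reported.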

 

 

2.3 Managing Restart (OHASD) with SRVCTL

You can manage Oracle Restart manually with the SRVCTL command, adding components to or removing them from the Oracle Restart configuration. When you manually add a component to Oracle Restart and enable it with SRVCTL, Oracle Restart begins managing that component and restarts it when necessary.

The official documentation:

SRVCTL Command Reference for Oracle Restart

http://docs.oracle.com/cd/E11882_01/server.112/e25494/restart005.htm

Configuring Oracle Restart

http://docs.oracle.com/cd/E11882_01/server.112/e10595/restart002.htm

 

 

The main SRVCTL command options are:

Command: Description

add: Adds a component to the Oracle Restart configuration.

config: Displays the Oracle Restart configuration for a component.

disable: Disables management by Oracle Restart for a component.

enable: Reenables management by Oracle Restart for a component.

getenv: Displays environment variables in the Oracle Restart configuration for a database, Oracle ASM instance, or listener.

modify: Modifies the Oracle Restart configuration for a component.

remove: Removes a component from the Oracle Restart configuration.

setenv: Sets environment variables in the Oracle Restart configuration for a database, Oracle ASM instance, or listener.

start: Starts the specified component.

status: Displays the running status of the specified component.

stop: Stops the specified component.

unsetenv: Unsets environment variables in the Oracle Restart configuration for a database, Oracle ASM instance, or listener.

 

 

--This example adds the database with the DB_UNIQUE_NAME dbcrm:

srvctl add database -d dbcrm -o /u01/app/oracle/product/11.2.0/dbhome_1

--This example adds the same database and also establishes a dependency between the database and the disk groups DATA and RECOVERY.

srvctl add database -d dbcrm -o /u01/app/oracle/product/11.2.0/dbhome_1 -a "DATA,RECOVERY"

--The following command adds a listener (named LISTENER) running out of the database Oracle home and listening on TCP port 1522:

srvctl add listener -p TCP:1522 -o /u01/app/oracle/product/11.2.0/dbhome_1

 

Note the config option of the srvctl command, which displays the configuration of the given resource:

[grid@rac1 ~]$ srvctl config asm -a

ASM home: /u01/app/grid/11.2.0

ASM listener: LISTENER

ASM is enabled.

[grid@rac1 ~]$

 

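The srvctl command is used very often. As Section 2.1 explained, some operations automatically add the resource to Oracle Restart so that it is monitored, while others do not; in those cases you must add the resource manually.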
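
For the manual case, a script can first check whether a database is already registered before adding it. The sketch below is a dry run that only prints the srvctl add database command it would use. ensure_db_registered and the SRVCTL variable are hypothetical names, and the check assumes that srvctl config database -d exits with a non-zero status when the database is not in the configuration, which should be verified on your own installation.

```shell
#!/bin/sh
# Path to srvctl; can be overridden through the environment.
SRVCTL="${SRVCTL:-srvctl}"

# Hypothetical helper (dry run): report whether a database is already in
# the Oracle Restart configuration, or print the command that would add it.
#   $1 = DB_UNIQUE_NAME, $2 = Oracle home of the database
ensure_db_registered() {
    db_name="$1"
    oracle_home="$2"
    # Assumption: a non-zero exit status means "not registered".
    if "$SRVCTL" config database -d "$db_name" >/dev/null 2>&1; then
        echo "already registered: $db_name"
    else
        echo "would run: $SRVCTL add database -d $db_name -o $oracle_home"
    fi
}

# Example, using the database name and home from the examples above:
#   ensure_db_registered dbcrm /u01/app/oracle/product/11.2.0/dbhome_1
```

Printing the command instead of running it leaves the decision to register the database with the administrator.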

 

 

    One last point: if Oracle Restart is already installed and the host name changes, Oracle Restart must be reconfigured. For details, see MOS note [ID 986740.1].