敏捷的Oracle 10g MAA配置方法


中国科学院ARP项目实施顾问上海Oracle用户组成员 唐波


摘要


本文介绍一种原创的并经过充分测试的组合使用标准技术的Oracle 10g MAA搭建方法。该方法能有效地利用手边能够利用到的工具,半图形化地、敏捷有效地在10个小时以内搭建出一套Oracle 10g MAA环境。主备库双方节点越多,本方法的敏捷优势越明显。全文内容包括RAC主库和RAC物理standby备库的配置以及最终转换到RAC逻辑standby备库的一步一步操作指南。整个配置包含Broker构造。该方法搭建的MAA,2004年至今应用在中国科学院ARP项目www.arp.cn的生产环境中。


目录

1. 环境准备

1.1 装好主库

1.2 装好备用库两台主机上的集群件、ASM和数据库软件

1.3 修正主库broker位置

2. Grid Control敏捷添加standby

3. 修正Grid Control刚添加的standby的Broker位置

4. 敏捷地图形化切换

5. 在Grid Control上删掉新备库(原主库)

6. 对新主库作rconfig,只能挑选一个节点

7. 对新主库作rconfig后续网络配置

8. 对新主库作添加实例前准备

9. 对新主库使用dbca敏捷添加instance

10. 对新集群主库作dbca后续网络配置

11. Grid界面重发现,敏捷生成Physical Standby的Broker的构造

12. 手工切换准备(两边都是RAC环境使用GC切换一定会报错)

13. 手工切换

14. Grid界面重发现 , 到此MAA Physical Standby配置完成

15. 由MAA Physical Standby 转换到 MAA Logical Standby

15.1 准备主库

15.2 停止主备库两边的Physical Standby的Broker

15.3 准备备库

15.4 主库上EXECUTE DBMS_LOGSTDBY.BUILD

15.5 转换

15.6 Grid界面重发现,敏捷生成Logical Standby的Broker的构造

总结




正文



1. 环境准备

1.1 装好主库


主库那边所有东西都装好:
[root@station1 ~]# crs_stat -t

Name Type Target State Host

ora....A1.inst application ONLINE ONLINE station1

ora....A2.inst application ONLINE ONLINE station2

ora.RDBA.db application ONLINE ONLINE node1

ora....SM1.asm application ONLINE ONLINE station1

ora....N1.lsnr application ONLINE ONLINE station1

ora....on1.gsd application ONLINE ONLINE station1

ora....on1.ons application ONLINE ONLINE station1

ora....on1.vip application ONLINE ONLINE station1

ora....SM2.asm application ONLINE ONLINE station2

ora....N2.lsnr application ONLINE ONLINE station2

ora....on2.gsd application ONLINE ONLINE station2

ora....on2.ons application ONLINE ONLINE station2

ora....on2.vip application ONLINE ONLINE station2


1.2 装好备用库两台主机上的集群件、ASM和数据库软件


备库那边只有集群件和数据库软件,备库所在机器要建好+FRA盘组。主库和备库都要建好目录:'+FRA/库名/stdbyarch'用以存放standby archivelog)。


[root@station3 ~]# crs_stat -t

Name Type Target State Host

ora....SM1.asm application ONLINE ONLINE station3

ora....N1.lsnr application ONLINE ONLINE station3

ora....on1.gsd application ONLINE ONLINE station3

ora....on1.ons application ONLINE ONLINE station3

ora....on1.vip application ONLINE ONLINE station3

ora....SM2.asm application ONLINE ONLINE station4

ora....N2.lsnr application ONLINE ONLINE station4

ora....on2.gsd application ONLINE ONLINE station4

ora....on2.ons application ONLINE ONLINE station4

ora....on2.vip application ONLINE ONLINE station4


主库已经有标准的TNS和监听(备库虽然没有库,可以仿照主库对称地把对应的TNS和监听配好,主库的配置文件也要加入备库信息,由于tnsnames.ora会在以下阶段不断被覆盖,所以请备份主库和备库两边的tnsnames.ora,如果被覆盖则把tnsnames.ora的备份覆盖回来)。

具体操作为打开主库创建过程中生成的tnsnames.ora

本来有4段编辑为8段做对称的修改,4个节点每个节点覆盖一份同时备份一个模板准备以后用:


RDBA =

(DESCRIPTION =

(ADDRESS = (PROTOCOL = TCP)(HOST = station1-vip)(PORT = 1521))

(ADDRESS = (PROTOCOL = TCP)(HOST = station2-vip)(PORT = 1521))

(LOAD_BALANCE = yes)

(CONNECT_DATA =

(SERVER = DEDICATED)

(SERVICE_NAME = RDBA)

)

)


RDBA2 =

(DESCRIPTION =

(ADDRESS = (PROTOCOL = TCP)(HOST = station2-vip)(PORT = 1521))

(CONNECT_DATA =

(SERVER = DEDICATED)

(SERVICE_NAME = RDBA)

(INSTANCE_NAME = RDBA2)

)

)


RDBA1 =

(DESCRIPTION =

(ADDRESS = (PROTOCOL = TCP)(HOST = station1-vip)(PORT = 1521))

(CONNECT_DATA =

(SERVER = DEDICATED)

(SERVICE_NAME = RDBA)

(INSTANCE_NAME = RDBA1)

)

)


LISTENERS_RDBA =

(ADDRESS_LIST =

(ADDRESS = (PROTOCOL = TCP)(HOST = station1-vip)(PORT = 1521))

(ADDRESS = (PROTOCOL = TCP)(HOST = station2-vip)(PORT = 1521))

)


SITE1 =

(DESCRIPTION =

(ADDRESS = (PROTOCOL = TCP)(HOST = station3-vip.example.com)(PORT = 1521))

(ADDRESS = (PROTOCOL = TCP)(HOST = station4-vip.example.com)(PORT = 1521))

(CONNECT_DATA =

(SERVER = DEDICATED)

(SERVICE_NAME = site1)

)

)


DG22 =

(DESCRIPTION =

(ADDRESS = (PROTOCOL = TCP)(HOST = station4-vip.example.com)(PORT = 1521))

(CONNECT_DATA =

(SERVER = DEDICATED)

(SERVICE_NAME = site1)

(INSTANCE_NAME = dg22)

)

)


DG21 =

(DESCRIPTION =

(ADDRESS = (PROTOCOL = TCP)(HOST = station3-vip.example.com)(PORT = 1521))

(CONNECT_DATA =

(SERVER = DEDICATED)

(SERVICE_NAME = site1)

(INSTANCE_NAME = dg21)

)

)


LISTENERS_SITE1 =

(ADDRESS_LIST =

(ADDRESS = (PROTOCOL = TCP)(HOST = station3-vip)(PORT = 1521))

(ADDRESS = (PROTOCOL = TCP)(HOST = station4-vip)(PORT = 1521))

)


1.3 修正主库broker位置


主库改dg_broker_config_file1='+DATA/rdba/dr1rdba.dat', dg_broker_config_file2='+FRA/rdba/dr2rdba.dat'。


2. Grid Control敏捷添加standby


用gc添加单实例的physical standby(gc无法添加RAC 的physical standby),之后verify加standby log(共12个),最后在gc上删掉新加的standby 。

具体操作:进入gc


Target->database->maintenance-> datagurd setup and manage->add standby


database->create a new physical standby dagabase->选择perform a live


backup of the primary->next 一步一步来。建好后verify

可以看见12个日志。


这时在gc上删除新加的standby数据库。



3. 修正Grid Control刚添加的standby的Broker位置


对stdandby改dg1_broker_config_file1='+DATA/site1/dr1site1.dat',dg_broker_config_file2='+FRA/site1/dr2site1.dat' ,之后在主库方重新加回standby(Manage existing standby database链接,目的是为了产生新的dg_broker_config_file1和dg_broker_config_file2) 。Standby Archive Location 属性edit : 选盘阵路径'+FRA/库名/stdbyarch'。


4. 敏捷地图形化切换


在gc上Target->database->maintenance-> datagurd setup and manage->Switchover


gc对一边RAC另一边是单实例的环境切换,一般是能顺利完成的。


5. 在Grid Control上删掉新备库(原主库)



为了在rconfig时不报错。(带上了broker配置的数据库在rconfig时,一定会报错)



6. 对新主库作rconfig,只能挑选一个节点


Oracle用户登陆新主库所在机器,进入下面的路径


/u01/app/oracle/product/10.2.0/db_1/assistants/rconfig/samplesXMLs


修改converttorac.xml 文件,然后执行 rconfig converttorac.xml命令。


以下列出修改后的converttorac.xml 文件:


<?xml version="1.0" encoding="UTF-8"?>

<n:RConfig xmlns:n="http://www.oracle.com/rconfig"

xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"

xsi:schemaLocation="http://www.oracle.com/rconfig">

<n:ConvertToRAC>

<!-- Verify does a precheck to ensure all pre-requisites are met, before the conversion is attempted. Allowable values are: YES|NO|ONLY -->

<n:Convert verify="YES">

<!--Specify current OracleHome of non-rac database for SourceDBHome -->

<n:SourceDBHome>/u01/app/oracle/product/10.2.0/db_1</n:SourceDBHome>

<!--Specify OracleHome where the rac database should be configured. It can be same as SourceDBHome -->

<n:TargetDBHome>/u01/app/oracle/product/10.2.0/db_1</n:TargetDBHome>

<!--Specify SID of non-rac database and credential. User with sysdba role is required to perform conversion -->

<n:SourceDBInfo SID="dg2">

<n:Credentials>

<n:User>sys</n:User>

<n:Password>oracle1</n:Password>

<n:Role>sysdba</n:Role>

</n:Credentials>

</n:SourceDBInfo>

<!--ASMInfo element is required only if the current non-rac database uses ASM Storage -->

<n:ASMInfo SID="+ASM1">

<n:Credentials>

<n:User>sys</n:User>

<n:Password>oracle1</n:Password>

<n:Role>sysdba</n:Role>

</n:Credentials>

</n:ASMInfo>

<!--Specify the list of nodes that should have rac instances running. LocalNode should be the first node in this nodelist. -->

<n:NodeList>

<n:Node name="station3"/>

</n:NodeList>

<!--Specify prefix for rac instances. It can be same as the instance name for non-rac database or different. The instance number will be attached to this prefix. -->

<n:InstancePrefix>dg2</n:InstancePrefix>

<!--Specify port for the listener to be configured for rac database.If port="", alistener existing on localhost will be used for rac database.The listener will be extended to all nodes in the nodelist -->

<n:Listener port="1521"/>

<!--Specify the type of storage to be used by rac database. Allowable values are CFS|ASM. The non-rac database should have same storage type. -->

<n:SharedStorage type="ASM">

<n:User>sys</n:User>

<n:Password>oracle1</n:Password>

<n:Role>sysdba</n:Role>

</n:Credentials>

</n:ASMInfo>

<!--Specify the list of nodes that should have rac instances running. LocalNode should be the first node in this nodelist. -->

<n:NodeList>

<n:Node name="station3"/>

</n:NodeList>

<!--Specify prefix for rac instances. It can be same as the instance name for non-rac database or different. The instance number will be attached to this prefix. -->

<n:InstancePrefix>dg2</n:InstancePrefix>

<!--Specify port for the listener to be configured for rac database.If port="", alistener existing on localhost will be used for rac database.The listener will be extended to all nodes in the nodelist -->

<n:Listener port="1521"/>

<!--Specify the type of storage to be used by rac database. Allowable values are CFS|ASM. The non-rac database should have same storage type. -->

<n:SharedStorage type="ASM">

<!--Specify Database Area Location to be configured for rac database.If this field is left empty, current storage will be used for rac database. For CFS, this field will have directory path. -->

<n:TargetDatabaseArea>+DATA</n:TargetDatabaseArea>

<!--Specify Flash Recovery Area to be configured for rac database. If this field is left empty, current recovery area of non-rac database will be configured for rac database. If current database is not using recovery Area, the resulting rac database will not have a recovery area. -->

<n:TargetFlashRecoveryArea>+FRA</n:TargetFlashRecoveryArea>

</n:SharedStorage>

</n:Convert>

</n:ConvertToRAC>

</n:RConfig>


只能挑选一个节点,因为新主库已经enable了thread 2(新备库是个RAC)。如果converttorac.xml 中的

<n:NodeList>

<n:Node name="station3"/>

</n:NodeList>

写成

<n:NodeList>

<n:Node name="station3"/>

<n:Node name="station4"/>

</n:NodeList>

在rconfig过程中一定会报错。


7. 对新主库作rconfig后续网络配置


crs_unregister rconfig的监听检查新集群主库的TNS和监听保持对称配置), 特别还要检查新集群主库的local_listener应该='' ,两边检查local_listener去掉多余的local_listener.sid='dg21'项目。


8. 对新主库作添加实例前准备


新集群主库改log_ahcive_dest_1='', log_archive_dest_2='' (如果不修改,dbca时一定报错,应该是字符窜处理上的bug)。


9. 对新主库使用dbca敏捷添加instance


添加时新实例挑thread线索2 ,日志组挑3,4。undo挑undotablespace2 。

(默认值是thread线索3,日志组是5,6,undo是undotablespace3,这是机器顺延数值的结果。把它们改回来,忽略所有已存在报错,确认进度条走到底)。


10. 对新集群主库作dbca后续网络配置


检查新双实例集群主库的TNS和监听(保持对称配置), 特别还要检查新双实例集群主库所有实例上的local_listener应该='' ,取消注册多出来不对的监听器, 去掉所有实例上的remote_listener=''参数。


11. Grid界面重发现,敏捷生成Physical Standby的Broker的构造


加回dataguard(Manage existing standby database链接)。新双实例集群主库: Standby Archive Location 属性edit : 选盘阵路径'+FRA/库名/stdbyarch';应用实例选择第一个实例。 verify整个构造。

罕见故障处理:

如果出现日志partially apply的问题,说明日志被broker弄到/u01/app/oracle/product/10.2.0/db_1/dbs下面。需要用rman拷到'+FRA/库名/stdbyarch': backup as copy archivelog sequence XXX thread XXX format '+FRA/库名/stdbyarch/1_XXX_123456.dbf' delete input ;


12. 手工切换准备(两边都是RAC环境使用GC切换一定会报错)


新双实例集群主库:alter system set dg_broker_start=false;原主库不要做。


13. 手工切换


确认主库和从库间网络连接通畅;

确认没有活动的会话连接在数据库中;

PRIMARY数据库处于打开的状态,STANDBY数据库处于MOUNT状态;确保STANDBY数据库处于ARCHIVELOG模式;

如果设置了REDO应用的延迟,那么将这个设置去掉;

确保配置了主库和从库的初始化参数,使得切换完成后,DATA GUARD机制可以顺利的运行。

如果是最大保护模式,先变成最大性能模式。

切换的顺序:先从主库到备用,再从备库到主库, 主备库都要停第二个实例(不作log apply的实例)。

srvctl stop instance xxx2 -d xxx

在第一个实例上主切备:

SQL> select DATABASE_ROLE from v$database;

DATABASE_ROLE

----------------

PRIMARY

SQL> select OPEN_MODE,PROTECTION_MODE,PROTECTION_LEVEL,SWITCHOVER_STATUS from v$database;

OPEN_MODE PROTECTION_MODE PROTECTION_LEVEL SWITCHOVER_STATUS

---------- -------------------- -------------------- ------------------

READ WRITE MAXIMUM PERFORMANCE MAXIMUM PERFORMANCE TO STANDBY

SQL> alter database commit to switchover to physical standby with session shutdown ;

Database altered.

SQL> shutdown immediate;

ORA-01507: database not mounted

ORACLE instance shut down.

srvctl start database -d xxx -o mount


SQL> select DATABASE_ROLE from v$database;

DATABASE_ROLE

----------------

PHYSICAL STANDBY

SQL> select OPEN_MODE,PROTECTION_MODE,PROTECTION_LEVEL,SWITCHOVER_STATUS from v$database;

OPEN_MODE PROTECTION_MODE PROTECTION_LEVEL SWITCHOVER_STATUS

---------- -------------------- -------------------- ------------------

MOUNTED MAXIMUM PERFORMANCE MAXIMUM PERFORMANCE TO PRIMARY


备切主:

srvctl stop instance xxx2 -d xxx

在第一个实例上:

SQL> select DATABASE_ROLE from v$database;

DATABASE_ROLE

----------------

PHYSICAL STANDBY

SQL> select OPEN_MODE,PROTECTION_MODE,PROTECTION_LEVEL,SWITCHOVER_STATUS from v$database;

OPEN_MODE PROTECTION_MODE PROTECTION_LEVEL SWITCHOVER_STATUS

---------- -------------------- -------------------- ------------------

MOUNTED MAXIMUM PERFORMANCE MAXIMUM PERFORMANCE TO PRIMARY

SQL> alter database commit to switchover to primary [ with session shutdown ];

Database altered.

SQL> shutdown immediate;

ORA-01507: database not mounted

ORACLE instance shut down.

srvctl start database -d xxx

主库SESSIONS ACTIVE是正常primary状态下的结果。

alter system set dg_broker_start=false;


14. Grid界面重发现 , 到此MAA Physical Standby配置完成


为了重新配置broker, 改新主、备库: dg1_broker_config_file1='+DATA/site1/dr3site1.dat', dg_broker_config_file2='+FRA/site1/dr4site1.dat' ,em界面重发现,重新加回到新主库Standby Archive Location 选盘阵路径(默认)。verify 物理standby正常。


15. 由MAA Physical Standby 转换到MAA Logical Standby


15.1 准备主库


在主库上:

ALTER DATABASE ADD SUPPLEMENTAL LOG DATA (PRIMARY KEY, UNIQUE INDEX) COLUMNS;

alter system archive log current;


15.2 停止主备库两边的Physical Standby的Broker


两边都执行:

alter system set dg_broker_start=false;


15.3 准备备库

在备库上:

ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL;

那个实例上做看log_archive_dest_参数指明谁apply。


15.4 主库上EXECUTE DBMS_LOGSTDBY.BUILD


15.5 转换


在备库上改cluster_database为false,重启,选择apply实例:

ALTER DATABASE RECOVER TO LOGICAL STANDBY "备库unique name";

整个过程跟踪alter日志。

备库改cluster_database为true,

ALTER DATABASE GUARD ALL;

所有实例重启到mount:

alter database open resetlogs;

之后把所有实例都启动。

ALTER DATABASE START LOGICAL STANDBY APPLY IMMEDIATE;


15.6 Grid界面重发现,敏捷生成Logical Standby的Broker的构造


verify, 并确定所有日志都apply。


总结


以上步骤和注意事项,基于对原理的理解和充分测试。所有步骤都可重现,作者保证其正确性和原创性。