感悟

1.从事数据库软件的使用维护工作,看起来似乎很简单,输入几条常用的命令即可解决问题,但是问题在于当系统出现了异常报错时,就变得手足无措不敢进行新的操作,甚至硬着头皮操作,越搞越糟。这些现象说明,我们对软件的运行原理了解的不多,知其然,但不知其所以然。平时看再多的文本资料,不如动手多去实践探索原理本质更有效。

2. 搭建vmware workstation练习环境要舍得投资升级电脑配置。之前,碍于电脑配置低,运行速度慢,重启虚拟机都得等好几分钟,严重影响练习时的效率和心情,对个人来说是一种摧残。一直没有做过oracle 11gR2配置ADG (dataguard)的实验。所以,对ADG一直都是敬而远之。

3.经过对oracle 11gR2配置ADG (dataguard)的过程,自己对oracle运行机制,pfile,spfile,tnsnames,listener等文件的作用更加熟悉了。不再像以前那样照猫画虎、照本宣科了。

配置过程

参考帖子数据库之Oralce ADG 搭建使用(搭建流程、个人理解、出现问题、解决方案)_大&迈的博客-CSDN博客_数据库adg是什么意思

基础环境: 

磁盘空间:30G

操作系统:Asianux3 SP3 64位

数据库:oracle 11.2.0.3.0 - 64bit 

主库:主机名test1,数据库实例名testdb,IP 10.0.0.7

备库:主机名test2,数据库实例名testdb,IP 10.0.0.8

终端软件:MobaXterm 个人认为该软件很好用,理由是当登陆ssh时,在MobaXterm左侧显示一个sftp会话框,方便将本地的安装包等文件上传到linux的相应目录下。勾选左侧底部的“跟随终端文件夹”即可实现,在linux里cd到某个目录时,sftp会话窗同步切换到相应目录下。

安装oracle数据库时遇到的问题:

1. 检测发现swap空间不足

解决办法:

swap扩容:

dd if=/dev/zero of=/ora/swap bs=1024 count=1500000(约1.5G)
mkswap /ora/swap
swapon /ora/swap
 free -m
vi /etc/fstab

插入一行:/ora/swap       swap    swap default 0  0

2.检测发现缺少pdksh-5.2.14包

解决办法:rpm -i --force --nodeps pdksh-5.2.14-24.x86_64.rpm

说明:执行rpm -ivh pdksh-5.2.14-24.x86_64.rpm安装失败,只能强行安装。

一、确保主备数据库运行在日志归档模式下

SQL> archive log list;
Database log mode              Archive Mode
Automatic archival             Enabled
Archive destination            /oracle/app/oracle/arch
Oldest online log sequence     39
Next log sequence to archive   0
Current log sequence           41

在主库上创建standby redo log:

alter database add standby logfile group 4 '/oracle/app/oracle/oradata/testdb/standby_log04.log' size 50M;
alter database add standby logfile group 5 '/oracle/app/oracle/oradata/testdb/standby_log05.log' size 50M;
alter database add standby logfile group 6 '/oracle/app/oracle/oradata/testdb/standby_log06.log' size 50M;
alter database add standby logfile group 7 '/oracle/app/oracle/oradata/testdb/standby_log07.log' size 50M;

二、配置主备监听、tnsnames

1.配置监听:

主库:

[oracle@test1 admin]$ cat $ORACLE_HOME/network/admin/listener.ora
# listener.ora Network Configuration File: /oracle/app/oracle/product/11.2.0/dbhome_1/network/admin/listener.ora
# Generated by Oracle configuration tools.
LISTENER =
  (DESCRIPTION_LIST =
    (DESCRIPTION =
      (ADDRESS = (PROTOCOL = TCP)(HOST = test1)(PORT = 1521))
      (ADDRESS = (PROTOCOL = IPC)(KEY = EXTPROC1521))
    )

  )

ADR_BASE_LISTENER = /oracle/app/oracle

备库:

[oracle@test2 ~]$ cat $ORACLE_HOME/network/admin/listener.ora
LISTENER =
  (DESCRIPTION_LIST =
    (DESCRIPTION =
      (ADDRESS = (PROTOCOL = TCP)(HOST = test2)(PORT = 1521))
      (ADDRESS = (PROTOCOL = IPC)(KEY = EXTPROC1521))
    )
  )
SID_LIST_LISTENER =
  (SID_LIST =
    (SID_DESC =(SID_NAME = testdb)(ORACLE_HOME = /oracle/app/oracle/product/11.2.0/dbhome_1))
  )
ADR_BASE_LISTENER = /oracle/app/oracle

主备库配置完后重启监听。lsnrctl stop,lsnrctl start

2.配置tnsnames.ora:

主库:

cat $ORACLE_HOME/network/admin/tnsnames.ora
main =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = 10.0.0.7)(PORT = 1521))
    (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SERVICE_NAME = main)
    )
  )
back =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = 10.0.0.8)(PORT = 1521))
    (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SERVICE_NAME = testdb)
    )
  )

备库:

cat $ORACLE_HOME/network/admin/tnsnames.ora
main =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = 10.0.0.7)(PORT = 1521))
    (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SERVICE_NAME = main)
    )
  )
back =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = 10.0.0.8)(PORT = 1521))
    (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SERVICE_NAME = testdb)
    )
  )

 在主备库上分别执行tnsping main,tnsping back测试一下,看是否都能正常解析连接。

3.配置pfile文件里的ADG参数

主库上sql>create pfile='/oracle/app/oracle/product/11.2.0/dbhome_1/dbs' from spfile;

生成文件inittestdb.ora,用vi编辑后保存。

[oracle@test1 dbs]$ cat inittestdb.ora
testdb.__db_cache_size=125829120
testdb.__java_pool_size=4194304
testdb.__large_pool_size=4194304
testdb.__oracle_base='/oracle/app/oracle'#ORACLE_BASE set from environment
testdb.__pga_aggregate_target=197132288
testdb.__sga_target=293601280
testdb.__shared_io_pool_size=0
testdb.__shared_pool_size=142606336
testdb.__streams_pool_size=4194304
*.audit_file_dest='/oracle/app/oracle/admin/testdb/adump'
*.audit_trail='db'
*.compatible='11.2.0.0.0'
*.control_files='/oracle/app/oracle/oradata/testdb/control01.ctl','/oracle/app/oracle/fast_recovery_area/testdb/control02.ctl'
*.db_block_size=8192
*.db_domain=''
*.db_name='testdb'
*.db_recovery_file_dest='/oracle/app/oracle/fast_recovery_area'
*.db_recovery_file_dest_size=4322230272
*.diagnostic_dest='/oracle/app/oracle'
*.dispatchers='(PROTOCOL=TCP) (SERVICE=testdbXDB)'
*.log_archive_format='%t_%s_%r.dbf'
*.memory_target=489684992
*.nls_language='SIMPLIFIED CHINESE'
*.nls_territory='CHINA'
*.open_cursors=300
*.processes=150
*.remote_login_passwordfile='EXCLUSIVE'
*.undo_tablespace='UNDOTBS1'
*.db_unique_name='main'
*.log_archive_config='dg_config=(main,back)'
*.log_archive_dest_1='LOCATION=/oracle/app/oracle/arch valid_for=(all_logfiles,primary_role) db_unique_name=main'
*.log_archive_dest_2='SERVICE=back lgwr async valid_for=(online_logfile,primary_role)  db_unique_name=back'
*.log_archive_dest_state_1=enable
*.log_archive_dest_state_2=enable
*.log_archive_max_processes=4
*.fal_server='back'
*.fal_client='main'
*.standby_file_management='auto'

shutdown immediate;

startup pfile=‘/oracle/app/oracle/product/11.2.0/dbhome_1/dbs/inittestdb.ora’(最后都配置测试通过了,再重建spfile,create spfile from pfile=‘/oracle/app/oracle/product/11.2.0/dbhome_1/dbs/inittestdb.ora’,以后就再启动数据库只执行startup即可。)

备库上的配置:将主库的inittestdb.ora文件拷贝到备库的/oracle/app/oracle/product/11.2.0/dbhome_1/dbs目录下编辑保存,如下图。

[oracle@test2 dbs]$ cat inittestdb.ora
testdb.__db_cache_size=125829120
testdb.__java_pool_size=4194304
testdb.__large_pool_size=4194304
testdb.__oracle_base='/oracle/app/oracle'#ORACLE_BASE set from environment
testdb.__pga_aggregate_target=197132288
testdb.__sga_target=293601280
testdb.__shared_io_pool_size=0
testdb.__shared_pool_size=142606336
testdb.__streams_pool_size=4194304
*.audit_file_dest='/oracle/app/oracle/admin/testdb/adump'
*.audit_trail='db'
*.compatible='11.2.0.0.0'
*.control_files='/oracle/app/oracle/oradata/testdb/control01.ctl','/oracle/app/oracle/fast_recovery_area/testdb/control02.ctl'
*.db_block_size=8192
*.db_domain=''
*.db_name='testdb'
*.db_recovery_file_dest='/oracle/app/oracle/fast_recovery_area'
*.db_recovery_file_dest_size=4322230272
*.diagnostic_dest='/oracle/app/oracle'
*.dispatchers='(PROTOCOL=TCP) (SERVICE=testdbXDB)'
*.log_archive_format='%t_%s_%r.dbf'
*.memory_target=489684992
*.nls_language='SIMPLIFIED CHINESE'
*.nls_territory='CHINA'
*.open_cursors=300
*.processes=150
*.remote_login_passwordfile='EXCLUSIVE'
*.undo_tablespace='UNDOTBS1'
*.db_unique_name='back'
*.log_archive_config='dg_config=(main,back)'
*.log_archive_dest_1='LOCATION=/oracle/app/oracle/arch valid_for=(all_logfiles,primary_role) db_unique_name=back'
*.log_archive_dest_2='SERVICE=main lgwr async valid_for=(online_logfile,primary_role)  db_unique_name=main'
*.log_archive_dest_state_1=enable
*.log_archive_dest_state_2=enable
*.log_archive_max_processes=4
*.fal_server='main'
*.fal_client='back'
*.standby_file_management='auto'

备库执行shutdown immediate;startup nomount pfile=‘/oracle/app/oracle/product/11.2.0/dbhome_1/dbs/inittestdb.ora’ (最后都配置测试通过了,再重建spfile,create spfile from pfile=‘/oracle/app/oracle/product/11.2.0/dbhome_1/dbs/inittestdb.ora’,以后就再启动数据库只执行startup即可。)

意外发现:在配置备库pfile时,其中*.log_archive_dest_1='LOCATION=/oracle/app/oracle/arch valid_for=(all_logfiles,primary_role) db_unique_name=back'

当时,备库操作系统并没有创建arch目录。经过switch logfile测试,主库的数据都能正常同步过来。经查看,备库因为没有arch目录,把归档文件生成在了$ORACLE_BASE下,未产生任何影响。后来创建了/oracle/app/oracle/arch目录,经过switch logfile测试,归档文件生成在了arch目录下。

4.使用orapwd命令创建用户密码以便rman同步数据时用

在主备上分别执行:orapwd file=PWDtestdb.ora password=123456 entries=2 force=y

测试一下:

sqlplus /nolog
	
连接实例: 
SQL> conn sys/123456@main as sysdba
SID_LIST_LISTENER =
  (SID_LIST =
    (SID_DESC =(SID_NAME = testdb)(ORACLE_HOME = /oracle/app/oracle/product/11.2.0/dbhome_1))
  )

5.数据同步复制

在备库上执行(我在主库上执行的,也没发现问题):

rman target sys/123456@main auxiliary sys/123456@back

duplicate target database for standby from active database nofilenamecheck; 
同步时遇到如下错误

rman target sys/123456@main auxiliary sys/123456@back

Recovery Manager: Release 11.2.0.3.0 - Production on Tue Feb 15 02:06:15 2022

Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.

connected to target database: TESTDB (DBID=2878281152)
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-00554: initialization of internal recovery manager package failed
RMAN-04006: error from auxiliary database: ORA-12528: TNS:listener: all appropriate instances are blocking new connections

原因:备库运行在nomount状态下才能复制,但是“实例虽然启动,但是没有注册到监听。实例是通过PMON进程注册到监听上的,而PMON进程需要在MOUNT状态下才会启动。”
解决办法在备库的listener.ora里添加上sid描述。
SID_LIST_LISTENER =
(SID_LIST =
(SID_DESC =(SID_NAME = testdb)(ORACLE_HOME = /oracle/app/oracle/product/11.2.0/dbhome_1))
)

数据库同步完成后,启动备库。

曾遇到过下图的问题:

打开备库时报错:
SQL> alter database open;
alter database open
*
ERROR at line 1:
ORA-10458: standby database requires recovery
ORA-01152: file 1 was not restored from a sufficiently old backup
ORA-01110: data file 1: '/oracle/app/oracle/oradata/testdb/system01.dbf'
排查:
SQL> show parameter LOG_ARCHIVE_DEST_2

NAME                                 TYPE                   VALUE
------------------------------------ ---------------------- ------------------------------
log_archive_dest_2                   string          SERVICE=back lgwr async valid_for=(online_logfile,primary_role)  db_unique_name=main
log_archive_dest_20                  string
log_archive_dest_21                  string
log_archive_dest_22                  string
log_archive_dest_23                  string
log_archive_dest_24                  string
log_archive_dest_25                  string
log_archive_dest_26                  string
log_archive_dest_27                  string
log_archive_dest_28                  string
log_archive_dest_29                  string
原因:SERVICE=back lgwr async valid_for=(online_logfile,primary_role)  db_unique_name=main这里service配置错误了,应该为主库main。
解决过程:打开pfile文件inittestdb.ora修改service=main
然后sqlplus pfile=’/oracle/app/oracle/product/11.2.0/dbhome_1/dbs/inittestdb.ora';就能成功启动数据库了。

6.dataguard数据同步验证:

在主库插入数据,执行alter system switch logfile后,备库未被同步。查看archive log list日志序号是一致的。
解决办法:
alter database recover managed standby database cancel;
alter database  recover managed standby database using current logfile disconnect from session;
以后再在主库插入数据库后,切换日志文件,备库就能被立即同步了。

7.dataguard启停步骤 参考帖子:Oracle DataGuard启动与关闭顺序 - gegeman -

(一)Active DataGuard启动顺序
(1)启动监听,先启从库再起主库
lsnrctl start
(2)启动数据库,先启动备库在启主库

复制代码
--先启备库
sql>startup nomount
sql>alter database mount standby database;
sql>alter database open;
sql>alter database recover managed standby database using current logfile disconnect from session;
        
--再启主库
sql>startup
复制代码
 

(二)DataGuard关闭顺序
(1)先关主库数据库:
sql>shutdown immediate

(2)再关备用库:
sql>alter database recover managed standby database cancel;  --停止同步
sql>shutdown immediate;