今天在配置一个备库的时候碰到了一些问题,话说配置dg broker真没什么特别需要注意的细节了,本身已经给DBA省了很大的事儿了。
但是有时候就是会出现一些稀奇古怪的小问题。这个环境又非常重要,备库已经因为硬件故障报废了,现在刚搭的备库就想赶紧把它跑起来。
简单添加配置之后,spfile,防火墙,端口,listener等等因素都满足了。感觉就是一蹴而就的事情了。
但是show configuration的时候就是报错。
DGMGRL> show configuration;
Configuration - test_dg
Protection Mode: MaxPerformance
Databases:
test - Primary database
Error: ORA-16778: redo transport error for one or more databases
stest1 - Physical standby database
Warning: ORA-16792: configurable property value is inconsistent with database setting
Fast-Start Failover: DISABLED
Configuration Status:
ERROR
对于这个问题,常规思路如果想得到更多的明细信息,直接使用verbose方式来查看。
查看主库的verbose信息
DGMGRL> show database verbose test;
Database - test
Role: PRIMARY
Intended State: TRANSPORT-ON
Instance(s): test
Error: ORA-16737: the redo transport service for standby database "stest1" has an error
Properties:
DGConnectIdentifier = 'test'
ObserverConnectIdentifier = ''
LogXptMode = 'ASYNC'
DelayMins = '0'
Binding = 'optional'
MaxFailure = '0'
MaxConnections = '1'
ReopenSecs = '300'
NetTimeout = '30'
RedoCompression = 'DISABLE'
LogShipping = 'ON'
PreferredApplyInstance = ''
ApplyInstanceTimeout = '0'
ApplyParallel = 'AUTO'
StandbyFileManagement = 'AUTO'
ArchiveLagTarget = '0'
LogArchiveMaxProcesses = '2'
LogArchiveMinSucceedDest = '1'
DbFileNameConvert = ''
LogFileNameConvert = ''
FastStartFailoverTarget = ''
InconsistentProperties = '(monitor)'
InconsistentLogXptProps = '(monitor)'
SendQEntries = '(monitor)'
LogXptStatus = '(monitor)'
RecvQEntries = '(monitor)'
SidName = 'test'
StaticConnectIdentifier = '(DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=10.127.65.111)(PORT=1535))(CONNECT_DATA=(SERVICE_NAME=test_DGMGRL)(INSTANCE_NAME=test)(SERVER=DEDICATED)))'
StandbyArchiveLocation = 'USE_DB_RECOVERY_FILE_DEST'
AlternateLocation = ''
LogArchiveTrace = '0'
LogArchiveFormat = '%t_%s_%r.dbf'
TopWaitEvents = '(monitor)'
Database Status:
ERROR
查看备库的verbose信息
DGMGRL> show database verbose stest1;
Database - stest1
Role: PHYSICAL STANDBY
Intended State: APPLY-ON
Transport Lag: (unknown)
Apply Lag: (unknown)
Real Time Query: OFF
Instance(s): test
Warning: ORA-16714: the value of property ArchiveLagTarget is inconsistent with the database setting
Properties:
DGConnectIdentifier = 'stest1'
ObserverConnectIdentifier = ''
LogXptMode = 'ASYNC'
DelayMins = '0'
Binding = 'optional'
MaxFailure = '0'
MaxConnections = '1'
ReopenSecs = '300'
NetTimeout = '30'
RedoCompression = 'DISABLE'
LogShipping = 'ON'
PreferredApplyInstance = ''
ApplyInstanceTimeout = '0'
ApplyParallel = 'AUTO'
StandbyFileManagement = 'AUTO'
ArchiveLagTarget = '0'
LogArchiveMaxProcesses = '2'
LogArchiveMinSucceedDest = '1'
DbFileNameConvert = '/U01/app/oracle/oradata/test, /U01/app/oracle/oradata/test, /data/oracle/oradata/test, /U01/app/oracle/oradata/test, /other/app/oracle/oradata/test, /U01/app/oracle/oradata/test, +DATA, /U01/app/oracle/oradata/test, +ARCH, /U01/app/oracle/oradata/test'
LogFileNameConvert = '/U01/app/oracle/oradata/test, /U01/app/oracle/oradata/test, /data/oracle/oradata/test, /U01/app/oracle/oradata/test, /other/app/oracle/oradata/test, /U01/app/oracle/oradata/test, +DATA, /U01/app/oracle/oradata/test, +ARCH, /U01/app/oracle/oradata/test'
FastStartFailoverTarget = ''
InconsistentProperties = '(monitor)'
InconsistentLogXptProps = '(monitor)'
SendQEntries = '(monitor)'
LogXptStatus = '(monitor)'
RecvQEntries = '(monitor)'
SidName = 'test'
StaticConnectIdentifier = '(DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=10.11.14.12)(PORT=1521))(CONNECT_DATA=(SERVICE_NAME=stest1_DGMGRL)(INSTANCE_NAME=test)(SERVER=DEDICATED)))'
StandbyArchiveLocation = 'USE_DB_RECOVERY_FILE_DEST'
AlternateLocation = ''
LogArchiveTrace = '0'
LogArchiveFormat = '%t_%s_%r.dbf'
TopWaitEvents = '(monitor)'
Database Status:
WARNING
我是横竖看了很多遍,实在是没找出哪里的配置不一致了。
对于这类问题,一般都是推荐查看主库的归档路径,是否出现了不一致,连接不通的问题,或者是db_unique_name的问题。
查看v$archive_dest发现,归档路径2确实显示有问题。
SQL> select dest_id,error from v$archive_dest;
DEST_ID ERROR
---------- -----------------------------------------------------------------
1
2 ORA-16047: DGID mismatch between destination setting and target database
问题的原因说是DGID不匹配。那么来看看归档路径2,这个也是dg broker自动生成的,是在也没发现那里有问题。
log_archive_dest_2 string service="stest1", LGWR ASYNC NO AFFIRM delay=0 optional compression=disable max_failure=0 max_connections=1 reopen=300 db_
unique_name="stest1" net_timeout=30, valid_for=(all_logfiles,primary_role)
查看备库dg broker的日志,发现报出了这么一段警告。但是原因未知。
11/18/2015 18:04:38
Warning: Property 'ArchiveLagTarget' has inconsistent values:METADATA='0', SPFILE='', DATABASE='0'
11/18/2015 18:05:14
Warning: Property 'ArchiveLagTarget' has inconsistent values:METADATA='0', SPFILE='', DATABASE='0'
11/18/2015 18:06:08

查看备库的alert日志,提示接收gap的归档存在问题,我就开始慌了,很重要的一套库,不能有任何闪失,要不又得重来一次了,真感觉实在是太酸爽了。
Error 12541 received logging on to the standby
Check whether the listener is up and running.
FAL[client, USER]: Error 12541 connecting to test for fetching gap sequence
Wed Nov 18 18:02:36 2015
FAL[client]: Failed to request gap sequence
GAP - thread 1 sequence 460503-460515
DBID 1210367666 branch 622336050
FAL[client]: All defined FAL servers have been attempted.
------------------------------------------------------------
Check that the CONTROL_FILE_RECORD_KEEP_TIME initialization
parameter is defined to a value that's sufficiently large
enough to maintain adequate log switch information to retestve
archivelog gaps.
特别申明一下,这些操作都是在上午做的,如果没有发现有什么端倪,就可以继续往下看。
这个时候尝试重建dg broker文件。发现朱备库的dr的文件大小相同,但是时间戳不同。
这个时候查看备库的时间
$ date
Wed Nov 18 18:30:49 CST 2015
发现时间压根就不同步,要和主库的保持一致,还是使用nftp来做。
# /usr/sbin/ntpdate 192.168.131.132
18 Nov 10:32:17 ntpdate[48502]: step time server 192.168.131.132 offset -28854.645360 sec
时间修正之后,再次查看,就没有任何问题了。 知道原因之后再来看看之前的报警日志,发现还是有一定的寓意的。
DGMGRL> show configuration;
Configuration - test_dg
Protection Mode: MaxPerformance
Databases:
test - Primary database
stest1 - Physical standby database
Fast-Start Failover: DISABLED
Configuration Status:
SUCCESS
这个问题也算是早上给自己的一个小警告,一个非常细小的问题就很可能造成很大的延误。所以环境的检查还是要细致,不能轻视。