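I. Problem Description
A friend performed a very dangerous operation on one node of a production database, after which the database listener on that node started misbehaving.
The database is a two-node RAC, version 11.2.0.3.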
[grid@dave-db1 trace]$ lsnrctl status LISTENER
LSNRCTL for Linux: Version 11.2.0.3.0 - Production on 22-FEB-2013 18:40:27
Copyright (c) 1991, 2011, Oracle. All rights reserved.
Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=IPC)(KEY=LISTENER)))
STATUS of the LISTENER
------------------------
Alias LISTENER
Version TNSLSNR for Linux: Version 11.2.0.3.0 - Production
Start Date 22-FEB-2013 18:00:03
Uptime 0 days 0 hr. 40 min. 23 sec
Trace Level off
Security ON: Local OS Authentication
SNMP OFF
Listener Parameter File /oracle/11.2.0.3/grid/network/admin/listener.ora
Listener Log File /oracle/app/grid/diag/tnslsnr/dave-db1/listener/alert/log.xml
Listening Endpoints Summary...
(DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(KEY=LISTENER)))
(DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=127.0.0.1)(PORT=1521)))
The listener supports no services
The command completed successfully
Note the output here: The listener supports no services
[grid@dave-db1 trace]$ lsnrctl status LISTENER_SCAN1
LSNRCTL for Linux: Version 11.2.0.3.0 - Production on 22-FEB-2013 18:40:46
Copyright (c) 1991, 2011, Oracle. All rights reserved.
Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=IPC)(KEY=LISTENER_SCAN1)))
STATUS of the LISTENER
------------------------
Alias LISTENER_SCAN1
Version TNSLSNR for Linux: Version 11.2.0.3.0 - Production
Start Date 22-FEB-2013 17:48:19
Uptime 0 days 0 hr. 52 min. 27 sec
Trace Level off
Security ON: Local OS Authentication
SNMP OFF
Listener Parameter File /oracle/11.2.0.3/grid/network/admin/listener.ora
Listener Log File /oracle/11.2.0.3/grid/log/diag/tnslsnr/dave-db1/listener_scan1/alert/log.xml
Listening Endpoints Summary...
(DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(KEY=LISTENER_SCAN1)))
The listener supports no services
The command completed successfully
Registering manually made no difference:
SQL> alter system register;
The listener on node 2 is normal:
[oracle@dave-db2 admin]$ lsnrctl status
LSNRCTL for Linux: Version 11.2.0.3.0 - Production on 22-FEB-2013 18:38:06
Copyright (c) 1991, 2011, Oracle. All rights reserved.
Connecting to (ADDRESS=(PROTOCOL=tcp)(HOST=)(PORT=1521))
STATUS of the LISTENER
------------------------
Alias LISTENER
Version TNSLSNR for Linux: Version 11.2.0.3.0 - Production
Start Date 22-FEB-2013 17:30:45
Uptime 0 days 1 hr. 7 min. 21 sec
Trace Level off
Security ON: Local OS Authentication
SNMP OFF
Listener Parameter File /oracle/11.2.0.3/grid/network/admin/listener.ora
Listener Log File /oracle/app/grid/diag/tnslsnr/dave-db2/listener/alert/log.xml
Listening Endpoints Summary...
(DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(KEY=LISTENER)))
(DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=127.0.0.1)(PORT=1521)))
(DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=192.168.106.14)(PORT=1521)))
Services Summary...
Service "+ASM" has 1 instance(s).
Instance "+ASM1", status READY, has 1 handler(s) for this service...
Service "DAVE" has 1 instance(s).
Instance "dave2", status READY, has 1 handler(s) for this service...
Service "oradb" has 1 instance(s).
Instance "dave2", status READY, has 1 handler(s) for this service...
Service "oradbXDB" has 1 instance(s).
Instance "dave2", status READY, has 1 handler(s) for this service...
The command completed successfully
-- Use ps to check the listener processes:
[grid@dave-db1 trace]$ ps -ef|grep LISTENER
grid 421 1 0 17:48 ? 00:00:00 /oracle/11.2.0.3/grid/bin/tnslsnr LISTENER_SCAN1 -inherit
grid 1435 1 0 18:00 ? 00:00:00 /oracle/11.2.0.3/grid/bin/tnslsnr LISTENER -inherit
grid 4482 1333 0 18:39 pts/1 00:00:00 grep LISTENER
grid 14380 1 0 2012 ? 00:02:56 /oracle/11.2.0.3/grid/bin/tnslsnr LISTENER -inherit
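grid 15075 1 0 2012 ? 00:03:07 /oracle/11.2.0.3/grid/bin/tnslsnr LISTENER_SCAN1 -inherit
Note 1:
Something is wrong here: each listener has two processes. LISTENER runs as PIDs 1435 and 14380, and LISTENER_SCAN1 as PIDs 421 and 15075. One pair was started in 2012 and the other today (17:48 and 18:00). Let's keep looking.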
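The duplicate check can also be scripted rather than done by eye. The following is a minimal sketch, assuming `ps -ef` style input in which the PID is the second field and the listener name directly follows the `tnslsnr` binary path; the function name `find_duplicate_listeners` is made up for this illustration.

```shell
#!/usr/bin/env bash
# Print every listener name that has more than one tnslsnr process,
# followed by the PIDs of those processes.
# Reads `ps -ef` style lines on stdin (assumption: PID is field 2 and
# the listener name is the field right after the tnslsnr path).
find_duplicate_listeners() {
  awk '
    {
      for (i = 1; i < NF; i++) {
        if ($i ~ /\/tnslsnr$/) {
          name = $(i + 1)
          count[name]++
          pids[name] = pids[name] " " $2
        }
      }
    }
    END {
      for (name in count)
        if (count[name] > 1)
          print name ":" pids[name]
    }
  ' | LC_ALL=C sort
}

# Assumed usage on a live node:
#   ps -ef | find_duplicate_listeners
```

The `grep LISTENER` line in the ps output is ignored because none of its fields ends in `/tnslsnr`.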
[grid@dave-db2 admin]$ crsctl stat resource -t
--------------------------------------------------------------------------------
NAME TARGET STATE SERVER STATE_DETAILS
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.DATA.dg
ONLINE ONLINE dave-db1
ONLINE ONLINE dave-db2
ora.LISTENER.lsnr
ONLINE INTERMEDIATE dave-db1 Not All Endpoints Registered
ONLINE ONLINE dave-db2
ora.asm
ONLINE ONLINE dave-db1 Started
ONLINE ONLINE dave-db2 Started
ora.gsd
OFFLINE OFFLINE dave-db1
OFFLINE OFFLINE dave-db2
ora.net1.network
ONLINE ONLINE dave-db1
ONLINE ONLINE dave-db2
ora.ons
ONLINE ONLINE dave-db1
ONLINE ONLINE dave-db2
ora.registry.acfs
ONLINE ONLINE dave-db1
ONLINE ONLINE dave-db2
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.LISTENER_SCAN1.lsnr
1 ONLINE INTERMEDIATE dave-db1 Not All Endpoints Registered
ora.cvu
1 ONLINE ONLINE dave-db1
ora.dave-db1.vip
1 ONLINE ONLINE dave-db1
ora.dave-db2.vip
1 ONLINE ONLINE dave-db2
ora.oc4j
1 ONLINE ONLINE dave-db1
ora.oradb.db
1 ONLINE ONLINE dave-db2 Open
2 ONLINE ONLINE dave-db1 Open
ora.scan1.vip
1 ONLINE ONLINE dave-db1
Note 2:
The crsctl stat resource -t output shows that ora.LISTENER.lsnr on node 1 is in the INTERMEDIATE state with the detail Not All Endpoints Registered. ora.LISTENER_SCAN1.lsnr shows the same state.
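On a cluster with many resources, the INTERMEDIATE rows are easy to miss. Below is a rough sketch of filtering them out of the `crsctl stat resource -t` text, assuming the layout shown in this post (a line starting with `ora.` names the resource, and the state lines follow it). The function name `find_intermediate_resources` is hypothetical.

```shell
#!/usr/bin/env bash
# Print "resource server state-details" for every row whose STATE
# column is INTERMEDIATE. Reads `crsctl stat resource -t` text on stdin.
find_intermediate_resources() {
  awk '
    /^ora\./ { res = $1; next }          # remember the current resource name
    {
      for (i = 1; i <= NF; i++) {
        if ($i == "INTERMEDIATE") {
          detail = ""
          # everything after the server name is the state detail
          for (j = i + 2; j <= NF; j++)
            detail = (detail == "" ? $j : detail " " $j)
          print res, $(i + 1), detail
        }
      }
    }
  '
}

# Assumed usage on a live node:
#   crsctl stat resource -t | find_intermediate_resources
```

Scanning for the INTERMEDIATE token instead of a fixed column lets the same code handle both the local resource rows and the cluster resource rows, which carry an extra leading instance number.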
[grid@dave-db2 admin]$ crs_stat -t
Name Type Target State Host
------------------------------------------------------------
ora.DATA.dg ora....up.type ONLINE ONLINE dave-db1
ora....ER.lsnr ora....er.type ONLINE ONLINE dave-db1
ora....N1.lsnr ora....er.type ONLINE ONLINE dave-db1
ora.asm ora.asm.type ONLINE ONLINE dave-db1
ora.cvu ora.cvu.type ONLINE ONLINE dave-db1
ora....SM2.asm application ONLINE ONLINE dave-db1
ora....B1.lsnr application ONLINE ONLINE dave-db1
ora....db1.gsd application OFFLINE OFFLINE
ora....db1.ons application ONLINE ONLINE dave-db1
ora....db1.vip ora....t1.type ONLINE ONLINE dave-db1
ora....SM1.asm application ONLINE ONLINE dave-db2
ora....B2.lsnr application ONLINE ONLINE dave-db2
ora....db2.gsd application OFFLINE OFFLINE
ora....db2.ons application ONLINE ONLINE dave-db2
ora....db2.vip ora....t1.type ONLINE ONLINE dave-db2
ora.gsd ora.gsd.type OFFLINE OFFLINE
ora....network ora....rk.type ONLINE ONLINE dave-db1
ora.oc4j ora.oc4j.type ONLINE ONLINE dave-db1
ora.ons ora.ons.type ONLINE ONLINE dave-db1
ora.oradb.db ora....se.type ONLINE ONLINE dave-db2
ora....ry.acfs ora....fs.type ONLINE ONLINE dave-db1
ora.scan1.vip ora....ip.type ONLINE ONLINE dave-db1
The listener.ora file contains the following:
[grid@dave-db1 admin]$ more listener.ora
LISTENER_SCAN1=(DESCRIPTION=(ADDRESS_LIST=(ADDRESS=(PROTOCOL=IPC)(KEY=LISTENER_SCAN1)))) # line added by Agent
LISTENER=(DESCRIPTION=(ADDRESS_LIST=(ADDRESS=(PROTOCOL=IPC)(KEY=LISTENER)))) # line added by Agent
ENABLE_GLOBAL_DYNAMIC_ENDPOINT_LISTENER=ON # line added by Agent
ENABLE_GLOBAL_DYNAMIC_ENDPOINT_LISTENER_SCAN1=ON # line added by Agent
II. Possible Causes
A MOS note describes this situation:
Listener in INTERMEDIATE status with "Not All Endpoints Registered" [ID 1454439.1]
With crsctl stat res -t, the listener state is shown as INTERMEDIATE:
ora.LISTENER.lsnr
ONLINE ONLINE racdb1
ONLINE INTERMEDIATE racdb2 Not All Endpoints Registered
ora.LISTENER_SCAN1.lsnr
1 ONLINE INTERMEDIATE racdb2 Not All Endpoints Registered
The possible causes are:
(1) The problem is caused by another listener defined statically in listener.ora, using the same port and IP is running from the RDBMS ORACLE_HOME, started manually causing the default listener starting from GRID_HOME can not register its endpoint. Hence the error reported in dbca.
-- In other words, a listener statically defined in the listener.ora under the RDBMS ORACLE_HOME, on the same IP and port, was started manually, so the default listener started from GRID_HOME cannot register its endpoints.
ps -ef | grep tns:
grid 7222 1 0 Apr26 ? 00:00:13 /u01/app/11.2.0/grid/bin/tnslsnr LISTENER_SCAN1 -inherit
grid 7237 1 0 Apr26 ? 00:00:13 /u01/app/11.2.0/grid/bin/tnslsnr LISTENER -inherit
oracle 7354 1 0 Apr26 ? 00:00:01 /u02/app/oracle/product/11.2.0/db/bin/tnslsnr LISTENER -inherit
Our case is very similar, except that both of our listener sets were started from the Grid home.
(2) Another possible cause is the listener or scan listener being defined manually in listener.ora, for example:
-- That is, the listener or SCAN listener was added to listener.ora by hand.
LISTENER_SCAN3 =
(DESCRIPTION =
(ADDRESS = (PROTOCOL = TCP)(HOST = racnode1)(PORT = 1523))
)
LISTENER_SCAN1 =
(DESCRIPTION =
(ADDRESS = (PROTOCOL = TCP)(HOST = racnode1)(PORT = 1521))
)
Note that this problem only exists in Oracle 11gR2 and later, because the listener configuration changed in 11gR2. It is enough to be aware of it here; the 11gR2 RAC listener configuration will be covered in a separate article.
From 11.2 onwards, all listeners should be running from GRID_HOME, listener and listener_scan<n> entry should be added automatically into listener.ora, no manual editing is required for TCP definition.
-- From Oracle 11gR2 onwards, all listeners run from GRID_HOME. The listener and listener_scan<n> entries are added automatically to the listener.ora of the Grid installation user, so no manual configuration is needed.
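Cause (2) can be checked mechanically. The sketch below assumes that an agent-maintained listener.ora in the Grid home contains only IPC addresses, as the file on dave-db1 does, so any TCP address line is a candidate for a manual edit. The function name `find_manual_tcp_entries` is invented for this example.

```shell
#!/usr/bin/env bash
# List the lines of a listener.ora that define a TCP address.
# $1: path to listener.ora
# Prints "line-number:line" for each hit; exits non-zero if none found.
find_manual_tcp_entries() {
  grep -inE 'PROTOCOL[[:space:]]*=[[:space:]]*TCP' "$1"
}

# Assumed usage, with the path taken from the lsnrctl status output:
#   find_manual_tcp_entries /oracle/11.2.0.3/grid/network/admin/listener.ora
```

A hit is only a hint to review that entry by hand; the sketch does not change the file.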
III. Solution
The solution given in the MOS note:
1. Stop the listener running from RDBMS ORACLE_HOME
$<RDBMS ORACLE_HOME>/bin/lsnrctl stop LISTENER
2. stop the listener from GRID_HOME
$<GRID_HOME>/bin/srvctl stop listener -n <node name>
$<GRID_HOME>/bin/srvctl stop scan_listener -i <scan#>
eg:
$<GRID_HOME>/bin/srvctl stop listener -n racnode1
$<GRID_HOME>/bin/srvctl stop scan_listener -i 1
If above command fails to stop the tnslsnr process, please use "kill -9 <pid of tnslsnr>" to stop the LISTENER and LISTENER_SCAN1 process.
3. remove any manually added LISTENER definition from listener.ora if it exists
4. restart the LISTENER and LISTENER_SCAN1 from GRID_HOME
$<GRID_HOME>/bin/srvctl start listener -n <node name>
$<GRID_HOME>/bin/srvctl start scan_listener -i <scan#>
5. check crsctl stat res -t output, they both should show ONLINE status now.
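Steps 2, 4 and 5 above can be collected into a small dry-run helper that only prints the commands, so they can be reviewed before anything is stopped on a production node. This is a sketch: the function name `print_listener_restart_plan` is hypothetical, and it uses only the srvctl and crsctl options quoted in the MOS steps.

```shell
#!/usr/bin/env bash
# Print (do not run) the commands from steps 2, 4 and 5 for one node.
# $1: Grid home, $2: node name, $3: SCAN listener number
print_listener_restart_plan() {
  printf '%s\n' \
    "$1/bin/srvctl stop listener -n $2" \
    "$1/bin/srvctl stop scan_listener -i $3" \
    "$1/bin/srvctl start listener -n $2" \
    "$1/bin/srvctl start scan_listener -i $3" \
    "$1/bin/crsctl stat res -t"
}

# Example with the values used in the MOS note:
#   print_listener_restart_plan /u01/app/11.2.0/grid racnode1 1
```

Steps 1 and 3 are left out on purpose, since stopping a listener in the RDBMS home and editing listener.ora both need a human decision.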
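Given the explanation above, the most likely cause in our case is the duplicated listener processes, so killing one set of listeners should be enough to bring the listener back to normal.
Here we kill the following two processes, the pair started in 2012:
grid 14380 1 0 2012 ? 00:02:56 /oracle/11.2.0.3/grid/bin/tnslsnr LISTENER -inherit
grid 15075 1 0 2012 ? 00:03:07 /oracle/11.2.0.3/grid/bin/tnslsnr LISTENER_SCAN1 -inherit
After the kill, the listener returned to normal.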
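In this incident the older pair of processes was the one killed. A sketch of picking the oldest process per duplicated listener name follows. It assumes input lines of the form `pid elapsed-seconds command`, such as `ps -eo pid,etimes,args` prints on systems whose ps supports the `etimes` field, which older procps releases may not. The function name `oldest_listener_pids` is made up, and the sketch only prints PIDs; whether the oldest process is the right one to kill still has to be judged case by case.

```shell
#!/usr/bin/env bash
# For every listener name with more than one tnslsnr process, print
# the name and the PID of the longest-running process.
# Reads "pid elapsed_seconds args..." lines on stdin.
oldest_listener_pids() {
  awk '
    {
      for (i = 3; i < NF; i++) {
        if ($i ~ /\/tnslsnr$/) {
          name = $(i + 1)
          n[name]++
          # keep the PID with the largest elapsed time seen so far
          if (!(name in age) || $2 + 0 > age[name]) {
            age[name] = $2 + 0
            pid[name] = $1
          }
        }
      }
    }
    END {
      for (name in n)
        if (n[name] > 1)
          print name, pid[name]
    }
  ' | LC_ALL=C sort
}

# Assumed usage on a live node:
#   ps -eo pid,etimes,args | oldest_listener_pids
```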
[grid@dave-db1 ~]$ lsnrctl status
LSNRCTL for Linux: Version 11.2.0.3.0 - Production on 22-FEB-2013 19:11:06
Copyright (c) 1991, 2011, Oracle. All rights reserved.
Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=IPC)(KEY=LISTENER)))
STATUS of the LISTENER
------------------------
Alias LISTENER
Version TNSLSNR for Linux: Version 11.2.0.3.0 - Production
Start Date 22-FEB-2013 19:05:06
Uptime 0 days 0 hr. 6 min. 0 sec
Trace Level off
Security ON: Local OS Authentication
SNMP OFF
Listener Parameter File /oracle/11.2.0.3/grid/network/admin/listener.ora
Listener Log File /oracle/app/grid/diag/tnslsnr/dave-db1/listener/alert/log.xml
Listening Endpoints Summary...
(DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(KEY=LISTENER)))
(DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=127.0.0.1)(PORT=1521)))
(DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=192.168.106.13)(PORT=1521)))
Services Summary...
Service "+ASM" has 1 instance(s).
Instance "+ASM2", status READY, has 1 handler(s) for this service...
Service "DAVE" has 1 instance(s).
Instance "dave1", status READY, has 1 handler(s) for this service...
Service "oradb" has 1 instance(s).
Instance "dave1", status READY, has 1 handler(s) for this service...
Service "oradbXDB" has 1 instance(s).
Instance "dave1", status READY, has 1 handler(s) for this service...
The command completed successfully
[grid@dave-db1 ~]$ crsctl stat resource -t
--------------------------------------------------------------------------------
NAME TARGET STATE SERVER STATE_DETAILS
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.DATA.dg
ONLINE ONLINE dave-db1
ONLINE ONLINE dave-db2
ora.LISTENER.lsnr
ONLINE ONLINE dave-db1
ONLINE ONLINE dave-db2
ora.asm
ONLINE ONLINE dave-db1 Started
ONLINE ONLINE dave-db2 Started
ora.gsd
OFFLINE OFFLINE dave-db1
OFFLINE OFFLINE dave-db2
ora.net1.network
ONLINE ONLINE dave-db1
ONLINE ONLINE dave-db2
ora.ons
ONLINE ONLINE dave-db1
ONLINE ONLINE dave-db2
ora.registry.acfs
ONLINE ONLINE dave-db1
ONLINE ONLINE dave-db2
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.LISTENER_SCAN1.lsnr
1 ONLINE ONLINE dave-db1
ora.cvu
1 ONLINE ONLINE dave-db1
ora.dave-db1.vip
1 ONLINE ONLINE dave-db1
ora.dave-db2.vip
1 ONLINE ONLINE dave-db2
ora.oc4j
1 ONLINE ONLINE dave-db1
ora.oradb.db
1 ONLINE ONLINE dave-db2 Open
2 ONLINE ONLINE dave-db1 Open
ora.scan1.vip
1 ONLINE ONLINE dave-db1
[grid@dave-db1 ~]$ lsnrctl status LISTENER_SCAN1
LSNRCTL for Linux: Version 11.2.0.3.0 - Production on 22-FEB-2013 19:12:39
Copyright (c) 1991, 2011, Oracle. All rights reserved.
Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=IPC)(KEY=LISTENER_SCAN1)))
STATUS of the LISTENER
------------------------
Alias LISTENER_SCAN1
Version TNSLSNR for Linux: Version 11.2.0.3.0 - Production
Start Date 22-FEB-2013 17:48:19
Uptime 0 days 1 hr. 24 min. 20 sec
Trace Level off
Security ON: Local OS Authentication
SNMP OFF
Listener Parameter File /oracle/11.2.0.3/grid/network/admin/listener.ora
Listener Log File /oracle/11.2.0.3/grid/log/diag/tnslsnr/dave-db1/listener_scan1/alert/log.xml
Listening Endpoints Summary...
(DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(KEY=LISTENER_SCAN1)))
(DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=192.168.106.15)(PORT=1521)))
Services Summary...
Service "DAVE" has 2 instance(s).
Instance "dave1", status READY, has 1 handler(s) for this service...
Instance "dave2", status READY, has 1 handler(s) for this service...
Service "oradb" has 2 instance(s).
Instance "dave1", status READY, has 1 handler(s) for this service...
Instance "dave2", status READY, has 1 handler(s) for this service...
Service "oradbXDB" has 2 instance(s).
Instance "dave1", status READY, has 1 handler(s) for this service...
Instance "dave2", status READY, has 1 handler(s) for this service...
The command completed successfully