一.问题描述

 

一朋友在生产库的一个节点上做了一个非常危险的操作,然后该节点的数据库监听出现了异常。

 

数据库是2个节点的RAC,版本11.2.0.3。

 

 

[grid@dave-db1 trace]$ lsnrctl status LISTENER


LSNRCTL for Linux: Version 11.2.0.3.0 - Production on 22-FEB-2013 18:40:27


Copyright (c) 1991, 2011, Oracle.  All rights reserved.


Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=IPC)(KEY=LISTENER)))

STATUS of the LISTENER

------------------------

Alias                     LISTENER

Version                   TNSLSNR for Linux: Version 11.2.0.3.0 - Production

Start Date                22-FEB-2013 18:00:03

Uptime                    0 days 0 hr. 40 min. 23 sec

Trace Level               off

Security                  ON: Local OS Authentication

SNMP                      OFF

Listener Parameter File   /oracle/11.2.0.3/grid/network/admin/listener.ora

Listener Log File         /oracle/app/grid/diag/tnslsnr/dave-db1/listener/alert/log.xml

Listening Endpoints Summary...

  (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(KEY=LISTENER)))

  (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=127.0.0.1)(PORT=1521)))

The listener supports no services

The command completed successfully

 

此处显示:The listener supports no services

 

 

[grid@dave-db1 trace]$ lsnrctl status LISTENER_SCAN1


LSNRCTL for Linux: Version 11.2.0.3.0 - Production on 22-FEB-2013 18:40:46


Copyright (c) 1991, 2011, Oracle.  All rights reserved.


Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=IPC)(KEY=LISTENER_SCAN1)))

STATUS of the LISTENER

------------------------

Alias                     LISTENER_SCAN1

Version                   TNSLSNR for Linux: Version 11.2.0.3.0 - Production

Start Date                22-FEB-2013 17:48:19

Uptime                    0 days 0 hr. 52 min. 27 sec

Trace Level               off

Security                  ON: Local OS Authentication

SNMP                      OFF

Listener Parameter File   /oracle/11.2.0.3/grid/network/admin/listener.ora

Listener Log File         /oracle/11.2.0.3/grid/log/diag/tnslsnr/dave-db1/listener_scan1/alert/log.xml

Listening Endpoints Summary...

  (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(KEY=LISTENER_SCAN1)))

The listener supports no services

The command completed successfully

 

手工注册后,问题依旧:

SQL>alter system register;

 

 

节点2上的监听是正常的:

[oracle@dave-db2 admin]$ lsnrctl status


LSNRCTL for Linux: Version 11.2.0.3.0 - Production on 22-FEB-2013 18:38:06


Copyright (c) 1991, 2011, Oracle.  All rights reserved.


Connecting to (ADDRESS=(PROTOCOL=tcp)(HOST=)(PORT=1521))

STATUS of the LISTENER

------------------------

Alias                     LISTENER

Version                   TNSLSNR for Linux: Version 11.2.0.3.0 - Production

Start Date                22-FEB-2013 17:30:45

Uptime                    0 days 1 hr. 7 min. 21 sec

Trace Level               off

Security                  ON: Local OS Authentication

SNMP                      OFF

Listener Parameter File   /oracle/11.2.0.3/grid/network/admin/listener.ora

Listener Log File         /oracle/app/grid/diag/tnslsnr/dave-db2/listener/alert/log.xml

Listening Endpoints Summary...

  (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(KEY=LISTENER)))

  (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=127.0.0.1)(PORT=1521)))

  (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=192.168.106.14)(PORT=1521)))

Services Summary...

Service "+ASM" has 1 instance(s).

  Instance "+ASM1", status READY, has 1 handler(s) for this service...

Service "DAVE" has 1 instance(s).

  Instance "dave2", status READY, has 1 handler(s) for this service...

Service "oradb" has 1 instance(s).

  Instance "dave2", status READY, has 1 handler(s) for this service...

Service "oradbXDB" has 1 instance(s).

  Instance "dave2", status READY, has 1 handler(s) for this service...

The command completed successfully

 

--使用PS  查看监听的进程:

[grid@dave-db1 trace]$ ps -ef|grep LISTENER

grid       421     1  0 17:48 ?        00:00:00 /oracle/11.2.0.3/grid/bin/tnslsnr LISTENER_SCAN1 -inherit

grid      1435     1  0 18:00 ?        00:00:00 /oracle/11.2.0.3/grid/bin/tnslsnr LISTENER -inherit

grid      4482  1333  0 18:39 pts/1    00:00:00 grep LISTENER

grid     14380     1  0  2012 ?        00:02:56 /oracle/11.2.0.3/grid/bin/tnslsnr LISTENER -inherit

grid     15075     1  0  2012 ?        00:03:07 /oracle/11.2.0.3/grid/bin/tnslsnr LISTENER_SCAN1 –inherit

 

注意1:

这里是有问题的,因为监听的进程出现了2次。 我们继续看。

 

 

[grid@dave-db2 admin]$ crsctl stat resource -t

--------------------------------------------------------------------------------

NAME           TARGET  STATE        SERVER                   STATE_DETAILS       

--------------------------------------------------------------------------------

Local Resources

--------------------------------------------------------------------------------

ora.DATA.dg

               ONLINE  ONLINE       dave-db1                                   

               ONLINE  ONLINE       dave-db2                                   

ora.LISTENER.lsnr

               ONLINE  INTERMEDIATE dave-db1               Not All Endpoints Registered           

               ONLINE  ONLINE       dave-db2                                   

ora.asm

               ONLINE  ONLINE       dave-db1               Started             

               ONLINE  ONLINE       dave-db2               Started             

ora.gsd

               OFFLINE OFFLINE      dave-db1                                   

               OFFLINE OFFLINE      dave-db2                                   

ora.net1.network

               ONLINE  ONLINE       dave-db1                                   

               ONLINE  ONLINE       dave-db2                                   

ora.ons

               ONLINE  ONLINE       dave-db1                                   

               ONLINE  ONLINE       dave-db2                                   

ora.registry.acfs

               ONLINE  ONLINE       dave-db1                                   

               ONLINE  ONLINE       dave-db2                                   

--------------------------------------------------------------------------------

Cluster Resources

--------------------------------------------------------------------------------

ora.LISTENER_SCAN1.lsnr

      1        ONLINE  INTERMEDIATE dave-db1               Not All Endpoints Registered           

ora.cvu

      1        ONLINE  ONLINE       dave-db1                                   

ora.dave-db1.vip

      1        ONLINE  ONLINE       dave-db1                                   

ora.dave-db2.vip

      1        ONLINE  ONLINE       dave-db2                                   

ora.oc4j

      1        ONLINE  ONLINE       dave-db1                                   

ora.oradb.db

      1        ONLINE  ONLINE       dave-db2               Open                

      2        ONLINE  ONLINE       dave-db1               Open                

ora.scan1.vip

      1        ONLINE  ONLINE       dave-db1

 

注意2:

通过crsctl stat resource –t命令,我们可以看到节点1上的ora.LISTENER.lsnr状态是 INTERMEDIATE,和Not All Endpoints Registered  。

 

 

[grid@dave-db2 admin]$ crs_stat -t

Name           Type           Target    State     Host        

------------------------------------------------------------

ora.DATA.dg    ora....up.type ONLINE    ONLINE    dave-db1  

ora....ER.lsnr ora....er.type ONLINE    ONLINE    dave-db1  

ora....N1.lsnr ora....er.type ONLINE    ONLINE    dave-db1  

ora.asm        ora.asm.type   ONLINE    ONLINE    dave-db1  

ora.cvu        ora.cvu.type   ONLINE    ONLINE    dave-db1  

ora....SM2.asm application    ONLINE    ONLINE    dave-db1  

ora....B1.lsnr application    ONLINE    ONLINE    dave-db1  

ora....db1.gsd application    OFFLINE   OFFLINE               

ora....db1.ons application    ONLINE    ONLINE    dave-db1  

ora....db1.vip ora....t1.type ONLINE    ONLINE    dave-db1  

ora....SM1.asm application    ONLINE    ONLINE    dave-db2  

ora....B2.lsnr application    ONLINE    ONLINE    dave-db2  

ora....db2.gsd application    OFFLINE   OFFLINE               

ora....db2.ons application    ONLINE    ONLINE    dave-db2  

ora....db2.vip ora....t1.type ONLINE    ONLINE    dave-db2  

ora.gsd        ora.gsd.type   OFFLINE   OFFLINE               

ora....network ora....rk.type ONLINE    ONLINE    dave-db1  

ora.oc4j       ora.oc4j.type  ONLINE    ONLINE    dave-db1  

ora.ons        ora.ons.type   ONLINE    ONLINE    dave-db1  

ora.oradb.db   ora....se.type ONLINE    ONLINE    dave-db2  

ora....ry.acfs ora....fs.type ONLINE    ONLINE    dave-db1  

ora.scan1.vip  ora....ip.type ONLINE    ONLINE    dave-db1 

 

Listener.ora 文件内容如下:

[grid@dave-db1 admin]$ more listener.ora

LISTENER_SCAN1=(DESCRIPTION=(ADDRESS_LIST=(ADDRESS=(PROTOCOL=IPC)(KEY=LISTENER_SCAN1))))                # line added by Agent

LISTENER=(DESCRIPTION=(ADDRESS_LIST=(ADDRESS=(PROTOCOL=IPC)(KEY=LISTENER))))            # line added by Agent

ENABLE_GLOBAL_DYNAMIC_ENDPOINT_LISTENER=ON              # line added by Agent

ENABLE_GLOBAL_DYNAMIC_ENDPOINT_LISTENER_SCAN1=ON                # line added by Agent

 

 

二.可能原因

 

在MOS的文档中有说明:

Listener in INTERMEDIATE status with"Not All Endpoints Registered" [ID 1454439.1]

 

使用crsctl stat res -t,监听状态显示为INTERMEDIATE。

 

ora.LISTENER.lsnr

      ONLINE   ONLINE              racdb1 

      ONLINE   INTERMEDIATE        racdb2   Not All Endpoints Registered

ora.LISTENER_SCAN1.lsnr

   1   ONLINE  INTERMEDIATE        racdb2    Not All Endpoints Registered

 

可能的原因如下:

(1)  The problem is caused byanother listener defined statically in listener.ora, using the same port and IPis running from the RDBMS ORACLE_HOME, started manually causing the defaultlistener starting from GRID_HOME can not register its endpoint. Hence the errorreported in dbca.

--在ORACLE_HOME的目录下配置了listener.ora,并且使用了相同的IP和端口。在手动的在GRID_HOME下启动默认的监听,就会导致不能注册endpoint.

 

ps -ef | grep tns:


grid   7222 1 0 Apr26 ? 00:00:13 /u01/app/11.2.0/grid/bin/tnslsnrLISTENER_SCAN1 -inherit

grid   7237 1 0 Apr26 ? 00:00:13 /u01/app/11.2.0/grid/bin/tnslsnrLISTENER -inherit

oracle 7354 1 0 Apr26 ? 00:00:01 /u02/app/oracle/product/11.2.0/db/bin/tnslsnrLISTENER -inherit

 

我们这里的情况和这里的很类似,不过我们是启动了2个grid下的监听。

 

(2)  Another possible cause is thelistener or scan listener being defined manually in listener.ora, for example:

    --另一种可能是listener 或者scan listener 被手工定义在listener.ora中.

 

LISTENER_SCAN3 =                          

 (DESCRIPTION =

   (ADDRESS = (PROTOCOL = TCP)(HOST = racnode1)(PORT = 1523))

 )


LISTENER_SCAN1 =                            

 (DESCRIPTION =

   (ADDRESS = (PROTOCOL = TCP)(HOST = racnode1)(PORT = 1521))


 

注意,这种问题只存在Oracle 11gR2 以后的版本。因为11gR2以后,Oracle 的监听配置发生了变化。 这里了解一下即可。关于11gR2 RAC监听的配置,会在单独文章说明。

 

From 11.2onwards, all listeners should be runing from GRID_HOME, listener andlistener_scan<n> entry should be added automatically into listener.ora,no manual editing is required for TCP definition.

    --从Oracle 11gR2 之后,所有的监听都配置在GRID_HOME下,listener 和listener_scan<n> entry 会自动配置到grid 安装用户的listener.ora文件中,而不需要手工配置相关的信息。

 

三.解决方法

 

MOS文档提供的解决方法:

1. Stop the listener running from RDBMSORACLE_HOME

$<RDBMS ORACLE_HOME>/bin/lsnrctl stopLISTENER

2. stop the listener from GRID_HOME

$<GRID_HOME>/bin/srvctl stop listener-n <node name>

$<GRID_HOME>/bin/srvctl stop scan_listener -i <scan#>

eg:

$<GRID_HOME>/bin/srvctl stop listener-n racnode1

$<GRID_HOME>/bin/srvctl stop scan_listener -i 1

If above command fails to stop the tnslsnrprocess, please use "kill -9 <pid of tnslsnr>" to stop theLISTENER and LISTENER_SCAN1 process.

3. remove any manually added LISTENER definitionfrom listener.ora if it exists

4. restart the LISTENER andLISTENER_SCAN1  from GRID_HOME

$<GRID_HOME>/bin/srvctl startlistener -n <node name>

$<GRID_HOME>/bin/srvctl start scan_listener -i <scan#>

5. check crsctl stat res -t output, theyboth should show ONLINE status now.

 

 

结合前面的说明,我们这里最大的可能是重复存在的监听进程。所以我们这里只需要kill 一组监听即可。 然后监听就正常了。

 

这里,我们kill 掉如下2个进程:

grid     14380     1  0  2012 ?        00:02:56 /oracle/11.2.0.3/grid/bin/tnslsnr LISTENER -inherit

grid     15075     1  0  2012 ?        00:03:07 /oracle/11.2.0.3/grid/bin/tnslsnr LISTENER_SCAN1 –inherit

 

kill之后,监听恢复正常。

 

[grid@dave-db1 ~]$ lsnrctl status


LSNRCTL for Linux: Version 11.2.0.3.0 - Production on 22-FEB-2013 19:11:06


Copyright (c) 1991, 2011, Oracle.  All rights reserved.


Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=IPC)(KEY=LISTENER)))

STATUS of the LISTENER

------------------------

Alias                     LISTENER

Version                   TNSLSNR for Linux: Version 11.2.0.3.0 - Production

Start Date                22-FEB-2013 19:05:06

Uptime                    0 days 0 hr. 6 min. 0 sec

Trace Level               off

Security                  ON: Local OS Authentication

SNMP                      OFF

Listener Parameter File   /oracle/11.2.0.3/grid/network/admin/listener.ora

Listener Log File         /oracle/app/grid/diag/tnslsnr/dave-db1/listener/alert/log.xml

Listening Endpoints Summary...

  (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(KEY=LISTENER)))

  (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=127.0.0.1)(PORT=1521)))

  (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=192.168.106.13)(PORT=1521)))

Services Summary...

Service "+ASM" has 1 instance(s).

  Instance "+ASM2", status READY, has 1 handler(s) for this service...

Service "DAVE" has 1 instance(s).

  Instance "dave1", status READY, has 1 handler(s) for this service...

Service "oradb" has 1 instance(s).

  Instance "dave1", status READY, has 1 handler(s) for this service...

Service "oradbXDB" has 1 instance(s).

  Instance "dave1", status READY, has 1 handler(s) for this service...

The command completed successfully

 

 

 

[grid@dave-db1 ~]$ crsctl stat resource -t

--------------------------------------------------------------------------------

NAME           TARGET  STATE        SERVER                   STATE_DETAILS       

--------------------------------------------------------------------------------

Local Resources

--------------------------------------------------------------------------------

ora.DATA.dg

               ONLINE  ONLINE       dave-db1                                   

               ONLINE  ONLINE       dave-db2                                   

ora.LISTENER.lsnr

               ONLINE  ONLINE       dave-db1                                   

               ONLINE  ONLINE       dave-db2                                   

ora.asm

               ONLINE  ONLINE       dave-db1               Started             

               ONLINE  ONLINE       dave-db2               Started             

ora.gsd

               OFFLINE OFFLINE      dave-db1                                   

               OFFLINE OFFLINE      dave-db2                                   

ora.net1.network

               ONLINE  ONLINE       dave-db1                                   

               ONLINE  ONLINE       dave-db2                                   

ora.ons

               ONLINE  ONLINE       dave-db1                                   

               ONLINE  ONLINE       dave-db2                                   

ora.registry.acfs

               ONLINE  ONLINE       dave-db1                                   

               ONLINE  ONLINE       dave-db2                                   

--------------------------------------------------------------------------------

Cluster Resources

--------------------------------------------------------------------------------

ora.LISTENER_SCAN1.lsnr

      1        ONLINE  ONLINE       dave-db1                                   

ora.cvu

      1        ONLINE  ONLINE       dave-db1                                   

ora.dave-db1.vip

      1        ONLINE  ONLINE       dave-db1                                   

ora.dave-db2.vip

      1        ONLINE  ONLINE       dave-db2                                   

ora.oc4j

      1        ONLINE  ONLINE       dave-db1                                   

ora.oradb.db

      1        ONLINE  ONLINE       dave-db2               Open                

      2        ONLINE  ONLINE       dave-db1               Open                

ora.scan1.vip

      1        ONLINE  ONLINE       dave-db1                              

 

 

[grid@dave-db1 ~]$ lsnrctl status LISTENER_SCAN1


LSNRCTL for Linux: Version 11.2.0.3.0 - Production on 22-FEB-2013 19:12:39


Copyright (c) 1991, 2011, Oracle.  All rights reserved.


Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=IPC)(KEY=LISTENER_SCAN1)))

STATUS of the LISTENER

------------------------

Alias                     LISTENER_SCAN1

Version                   TNSLSNR for Linux: Version 11.2.0.3.0 - Production

Start Date                22-FEB-2013 17:48:19

Uptime                    0 days 1 hr. 24 min. 20 sec

Trace Level               off

Security                  ON: Local OS Authentication

SNMP                      OFF

Listener Parameter File   /oracle/11.2.0.3/grid/network/admin/listener.ora

Listener Log File         /oracle/11.2.0.3/grid/log/diag/tnslsnr/dave-db1/listener_scan1/alert/log.xml

Listening Endpoints Summary...

  (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(KEY=LISTENER_SCAN1)))

  (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=192.168.106.15)(PORT=1521)))

Services Summary...

Service "DAVE" has 2 instance(s).

  Instance "dave1", status READY, has 1 handler(s) for this service...

  Instance "dave2", status READY, has 1 handler(s) for this service...

Service "oradb" has 2 instance(s).

  Instance "dave1", status READY, has 1 handler(s) for this service...

  Instance "dave2", status READY, has 1 handler(s) for this service...

Service "oradbXDB" has 2 instance(s).

  Instance "dave1", status READY, has 1 handler(s) for this service...

  Instance "dave2", status READY, has 1 handler(s) for this service...

The command completed successfully