20180626  2018-06-26 11:18【严重】主机名:zjhz-bjpaasb3,实例名:bjpaasb3,等待事件:library cache lock,平均每秒个数:1257,数据库存

当天数据库的短信告警,突然一下出现这么多library cache lock,于是登入数据库,查看数据库的日志。

等待事件 ORA-3136  SQLNET.INBOUND_CONNECT_TIMEOUT 导致高library cache_sql

 

在找到日志之后,再百度找到了盖总的博客,出现的问题都是一样的,以下就是盖总的博客。

 

最近一台新上线的Oracle10gR2数据库在警告日志文件中(alert.log)持续出现如下错误:

Tue Jul 18 23:09:22 2006
WARNING: inbound connection timed out (ORA-3136)
Tue Jul 18 23:09:23 2006
WARNING: inbound connection timed out (ORA-3136)
Tue Jul 18 23:09:25 2006
WARNING: inbound connection timed out (ORA-3136)
Tue Jul 18 23:09:30 2006
WARNING: inbound connection timed out (ORA-3136)
Tue Jul 18 23:12:15 2006
WARNING: inbound connection timed out (ORA-3136)

同时在sqlnet.log中记录了如下错误:

Fatal NI connect error 12170.
VERSION INFORMATION:
TNS for Linux: Version 10.2.0.2.0 - Production
Oracle Bequeath NT Protocol Adapter for Linux: Version 10.2.0.2.0 - Production
TCP/IP NT Protocol Adapter for Linux: Version 10.2.0.2.0 - Production
Time: 19-JUL-2006 11:25:26
Tracing not turned on.
Tns error struct:
ns main err code: 12535
TNS-12535: TNS:operation timed out
ns secondary err code: 12606
nt main err code: 0
nt secondary err code: 0
nt OS err code: 0
Client address: (ADDRESS=(PROTOCOL=tcp)(HOST=192.168.1.123)(PORT=58147))


这是和网络连接相关的一个错误,Metalink上给出了如下的解决方案:

1.set INBOUND_CONNECT_TIMEOUT_<listenername>=0 in listener.ora 
2. set SQLNET.INBOUND_CONNECT_TIMEOUT = 0 in sqlnet.ora of server.
3. stop and start both listener and database.
4. Now try to connect to DB and observe the behaviour

这里重起数据库和Listener我认为是没有必要的,我们reload一下Listner应该就可以了.

[oracle@order admin]$ lsnrctl
LSNRCTL for Linux: Version 10.2.0.2.0 - Production on 19-JUL-2006 15:26:33
Copyright (c) 1991, 2005, Oracle. All rights reserved.
Welcome to LSNRCTL, type "help" for information.
LSNRCTL> reload
Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=172.16.9.11)(PORT=1521)))
The command completed successfully
LSNRCTL> services
Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=172.16.9.11)(PORT=1521)))
Services Summary...
Service "order" has 2 instance(s).
Instance "order", status UNKNOWN, has 1 handler(s) for this service...
Handler(s):
"DEDICATED" established:0 refused:0
LOCAL SERVER
Instance "order", status READY, has 1 handler(s) for this service...
Handler(s):
"DEDICATED" established:0 refused:0 state:ready
LOCAL SERVER
The command completed successfullyLSNRCTL> show inbound_connect_timeout
Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=172.16.9.11)(PORT=1521)))
LISTENER parameter "inbound_connect_timeout" set to 0
The command completed successfully
LSNRCTL> exit

修改之后,观察了一段时间,目前正常.

关于SQLNET.INBOUND_CONNECT_TIMEOUT参数,Oracle建议修改该参数,以避免denial-of-service攻击.

引用一段Oracle文档说明如下:

 

SQLNET.INBOUND_CONNECT_TIMEOUT Purpose
Use the SQLNET.INBOUND_CONNECT_TIMEOUT parameter to specify the time, in seconds, for a client to connect with the database server and provide the necessary authentication information.

If the client fails to establish a connection and complete authentication in the time specified, then the database server terminates the connection. In addition, the database server logs the IP address of the client and an ORA-12170: TNS:Connect timeout occurred error message to the sqlnet.log file. The client receives either an ORA-12547: TNS:lost contact or an ORA-12637: Packet receive failed error message.

Without this parameter, a client connection to the database server can stay open indefinitely without authentication. Connections without authentication can introduce possible denial-of-service attacks, whereby malicious clients attempt to flood database servers with connect requests that consume resources.

To protect both the database server and the listener, Oracle Corporation recommends setting this parameter in combination with the INBOUND_CONNECT_TIMEOUT_listener_name parameter in the listener.ora file. When specifying values for these parameters, consider the following recommendations:

Set both parameters to an initial low value. 
Set the value of the INBOUND_CONNECT_TIMEOUT_listener_name parameter to a lower value than the SQLNET.INBOUND_CONNECT_TIMEOUT parameter. 
For example, you can set INBOUND_CONNECT_TIMEOUT_listener_name to 2 seconds and INBOUND_CONNECT_TIMEOUT parameter to 3 seconds. If clients are unable to complete connections within the specified time due to system or network delays that are normal for the particular environment, then increment the time as needed.

See Also: 
Oracle9i Net Services Administrator's Guide for information about configuring these parameters

Default
None

Example
SQLNET.INBOUND_CONNECT_TIMEOUT=3

 

 

下面自己在测试环境上还原了这个错误

   关于sqlnet.ora的参数SQLNET.INBOUND_CONNECT_TIMEOUT,它表示等待用户认证超时的时间,单位是秒,缺省值是60秒,如果用户认证超时了,服务器日志alert.log显示出错信息"WARNING: inbound connection timed out (ORA-3136)",sqlnet.log里面出现TNS-12535: TNS:operation timed out错误信息。

 

查看inbound_connect_timeout值

查看SQLNET.INBOUND_CONNECT_TIMEOUT的设置值,一般进入$ORACLE_HOME/network/admin下,查看sqlnet.ora参数文件即可。

LSNRCTL> show
The following operations are available after show
An asterisk (*) denotes a modifier or extended command:

rawmode displaymode
rules trc_file
trc_directory trc_level
log_file log_directory
log_status current_listener
inbound_connect_timeout startup_waittime
snmp_visible save_config_on_stop
dynamic_registration enable_global_dynamic_endpoint
oracle_home pid
connection_rate_limit valid_node_checking_registration
registration_invited_nodes registration_excluded_nodes

LSNRCTL> show inbound_connect_timeout
Connecting to (ADDRESS=(PROTOCOL=tcp)(HOST=)(PORT=1521))
LISTENER parameter "inbound_connect_timeout" set to 60
The command completed successfully

 

设置SQLNET.INBOUND_CONNECT_TIMEOUT值

 

我们首先设置SQLNET.INBOUND_CONNECT_TIMEOUT为30秒,这个参数修改后立即生效,不需要做任何其它操作。

 

[oracle@Database1 trace]$ vi sqlnet.ora
# sqlnet.ora Network Configuration File: /u01/app/oracle/product/10.2.0/db_1/network/admin/sqlnet.ora
# Generated by Oracle configuration tools.

NAMES.DIRECTORY_PATH= (TNSNAMES, EZCONNECT)
SQLNET.INBOUND_CONNECT_TIMEOUT=30

 

 现在模拟用户登入超时

[oracle@Database1 ~]$ sqlplus /@ORADB

SQL*Plus: Release 11.2.0.4.0 Production on Thu May 24 09:30:25 2018

Copyright (c) 1982, 2013, Oracle. All rights reserved.

ERROR:
ORA-01017: invalid username/password; logon denied


Enter user-name: ...........................................这里等待60秒

在oracle后台日志观察到
[oracle@Database1 trace]$ tail -f alert_oradb.log
Thu May 24 09:31:25 2018


***********************************************************************

Fatal NI connect error 12170.

VERSION INFORMATION:
TNS for Linux: Version 11.2.0.4.0 - Production
Oracle Bequeath NT Protocol Adapter for Linux: Version 11.2.0.4.0 - Production
TCP/IP NT Protocol Adapter for Linux: Version 11.2.0.4.0 - Production
Time: 24-MAY-2018 09:31:25
Tracing not turned on.
Tns error struct:
ns main err code: 12535

TNS-12535: TNS:operation timed out
ns secondary err code: 12606
nt main err code: 0
nt secondary err code: 0
nt OS err code: 0
Client address: (ADDRESS=(PROTOCOL=tcp)(HOST=127.0.0.1)(PORT=17263))
WARNING: inbound connection timed out (ORA-3136)

 

告警日志里面你会看到WARNING: inbound connection timed out (ORA-3136)错误。

 

总结:

 Oracle 11g 密码延迟认证是11g带来的新特性

在 Oracle 11g 中,为了提升安全性,Oracle 引入了『密码延迟验证』的新特性。这个特性的作用是,如果用户输入了错误的密码尝试登录,那么随着登录错误次数的增加,每次登录前验证的时间也会增加,以此减缓可能对于数据库重复的口令尝试攻击。

 

但是对于正常的系统,由于口令的更改,可能存在某些被遗漏的客户端,不断重复尝试,从而引起数据库内部长时间的 Library Cache Lock的等待,这种情形非常常见。

 

如果遇到这一类问题,可以通过Event 28401关闭这个特性,从而消除此类影响,以下命令将修改设置在参数文件中:

ALTER SYSTEM SET EVENT =
'28401 TRACE NAME CONTEXT FOREVER, LEVEL 1' SCOPE = SPFILE;

 

出现这类问题非常典型的AWR报告呈现如下,首先在 TOP 5 中,你可能看到显著的 Library Cache Lock 的等待,以下范例来自11.2.0.3.0版本的真实情况:

等待事件 ORA-3136  SQLNET.INBOUND_CONNECT_TIMEOUT 导致高library cache_sql_02

在这类情况下,时间模型 - Time Model 中会显示如下指标,其中 connection management call elapsed time 占据了主要的DB Time,这个等待直接表明是在建立数据库连接时产生的:

等待事件 ORA-3136  SQLNET.INBOUND_CONNECT_TIMEOUT 导致高library cache_数据库_03

这类问题,在Oracle的11g中是常见和确定的,在MOS上可以找到相应的记录:High 'library cache lock' Wait Time Due to Invalid Login Attempts(1309738.1)此外Oracle 11g开启了密码大小写验证,如果从Oracle 10g升级过来,需要特别的当心这个变化,通过初始化参数SEC_CASE_SENSITIVE_LOGON 可以来控制这个特性。