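At a customer site running telecom-related 7x24 business, the alert log on node 2 of a provincial database kept reporting ORA-7445 errors and producing trace dump (cdmp) files and .trc files. The alert log on node 1 showed no 7445 errors; only core files were being generated there. The CRS logs, syslog, and other logs showed no obvious warnings. Nothing restarted and the business was not affected; the errors simply kept coming. Because the Oracle directory has only 40 GB of space and the accumulating trace and core files could eventually fill it and hang the database instance, the customer decided to resolve the error anyway.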

The customer's host runs HP-UX B.11.31 ia64 with Oracle Release 10.2.0.5.0.

Node 2 alert log:

Thu Nov 24 18:27:38 EAT 2016

Errors in file /oracle/admin/essbj/udump/essbj2_ora_17889.trc:

ORA-07445: exception encountered: core dump [kghsrch()+128] [SIGSEGV] [Address not mapped to object] [0xFFFFFFFFFFFFFFF0] [] []

Thu Nov 24 18:27:40 EAT 2016

Trace dumping is performing id=[cdmp_20161124182740]

Thu Nov 24 18:28:29 EAT 2016

Errors in file /oracle/admin/essbj/udump/essbj2_ora_18076.trc:

ORA-07445: exception encountered: core dump [kghsrch()+128] [SIGSEGV] [Address not mapped to object] [0xFFFFFFFFFFFFFFF0] [] []

Thu Nov 24 18:29:03 EAT 2016

Errors in file /oracle/admin/essbj/udump/essbj2_ora_19496.trc:

ORA-07445: exception encountered: core dump [kghsrch()+128] [SIGSEGV] [Address not mapped to object] [0xFFFFFFFFFFFFFFF0] [] []

Thu Nov 24 18:43:22 EAT 2016

Errors in file /oracle/admin/essbj/udump/essbj2_ora_12403.trc:

ORA-07445: exception encountered: core dump [kghsrch()+128] [SIGSEGV] [Address not mapped to object] [0xFFFFFFFFFFFFFFF0] [] []

Thu Nov 24 18:44:37 EAT 2016

Errors in file /oracle/admin/essbj/udump/essbj2_ora_14293.trc:

ORA-07445: exception encountered: core dump [kghsrch()+128] [SIGSEGV] [Address not mapped to object] [0xFFFFFFFFFFFFFFF0] [] []

Thu Nov 24 18:46:41 EAT 2016

Errors in file /oracle/admin/essbj/udump/essbj2_ora_17336.trc:

ORA-07445: exception encountered: core dump [kghsrch()+128] [SIGSEGV] [Address not mapped to object] [0xFFFFFFFFFFFFFFF0] [] []

Thu Nov 24 18:50:20 EAT 2016

Errors in file /oracle/admin/essbj/udump/essbj2_ora_20339.trc:

ORA-07445: exception encountered: core dump [kghsrch()+128] [SIGSEGV] [Address not mapped to object] [0xFFFFFFFFFFFFFFF0] [] []

Thu Nov 24 18:50:22 EAT 2016

Trace dumping is performing id=[cdmp_20161124185022]

Thu Nov 24 18:53:06 EAT 2016

Errors in file /oracle/admin/essbj/udump/essbj2_ora_23424.trc:

ORA-07445: exception encountered: core dump [kghsrch()+128] [SIGSEGV] [Address not mapped to object] [0xFFFFFFFFFFFFFFF0] [] []

Thu Nov 24 19:06:40 EAT 2016

Errors in file /oracle/admin/essbj/udump/essbj2_ora_18689.trc:

ORA-07445: exception encountered: core dump [kghsrch()+128] [SIGSEGV] [Address not mapped to object] [0xFFFFFFFFFFFFFFF0] [] []

Thu Nov 24 19:38:31 EAT 2016

Errors in file /oracle/admin/essbj/udump/essbj2_ora_6449.trc:

ORA-07445: exception encountered: core dump [kghsrch()+128] [SIGSEGV] [Address not mapped to object] [0xFFFFFFFFFFFFFFF0] [] []

Thu Nov 24 19:38:33 EAT 2016

Trace dumping is performing id=[cdmp_20161124193833]

Thu Nov 24 19:39:49 EAT 2016

Errors in file /oracle/admin/essbj/udump/essbj2_ora_10964.trc:

ORA-07445: exception encountered: core dump [kghsrch()+128] [SIGSEGV] [Address not mapped to object] [0xFFFFFFFFFFFFFFF0] [] []

Thu Nov 24 19:39:53 EAT 2016

Errors in file /oracle/admin/essbj/udump/essbj2_ora_9299.trc:

ORA-07445: exception encountered: core dump [kghsrch()+128] [SIGSEGV] [Address not mapped to object] [0xFFFFFFFFFFFFFFF0] [] []

Thu Nov 24 19:41:19 EAT 2016

Errors in file /oracle/admin/essbj/udump/essbj2_ora_12404.trc:

ORA-07445: exception encountered: core dump [kghsrch()+128] [SIGSEGV] [Address not mapped to object] [0xFFFFFFFFFFFFFFF0] [] []

Thu Nov 24 19:42:47 EAT 2016

Errors in file /oracle/admin/essbj/udump/essbj2_ora_13863.trc:

ORA-07445: exception encountered: core dump [kghsrch()+128] [SIGSEGV] [Address not mapped to object] [0xFFFFFFFFFFFFFFF0] [] []
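
To see how often the error fires and which trace files to open first, an excerpt like the one above can be summarized with a short script. The following is a minimal sketch in Python and was not part of the original investigation. The function name `summarize_7445` and the `SAMPLE` text are illustrative assumptions, and the patterns rely only on the line format visible in the excerpt above.

```python
import re
from collections import Counter

# Patterns based on the alert log lines shown above (plain-text alert log).
TRACE_RE = re.compile(r"^Errors in file (\S+\.trc):")
ORA7445_RE = re.compile(r"^ORA-07445: exception encountered: core dump \[([^\]]+)\]")


def summarize_7445(alert_text):
    """Return (count per failing function, trace files tied to an ORA-07445)."""
    counts = Counter()
    trace_files = []
    last_trace = None
    for raw in alert_text.splitlines():
        line = raw.strip()
        if not line:
            continue  # the excerpt is double-spaced; skip empty lines
        m = TRACE_RE.match(line)
        if m:
            last_trace = m.group(1)
            continue
        m = ORA7445_RE.match(line)
        if m:
            counts[m.group(1)] += 1
            if last_trace is not None:
                trace_files.append(last_trace)
                last_trace = None  # pair each trace file with one error only
        elif not line.startswith("ORA-"):
            last_trace = None  # a new entry started; forget the old trace file
    return counts, trace_files


# Two entries copied from the node 2 excerpt, used here as sample input.
SAMPLE = """\
Thu Nov 24 18:27:38 EAT 2016
Errors in file /oracle/admin/essbj/udump/essbj2_ora_17889.trc:
ORA-07445: exception encountered: core dump [kghsrch()+128] [SIGSEGV] [Address not mapped to object] [0xFFFFFFFFFFFFFFF0] [] []
Thu Nov 24 18:27:40 EAT 2016
Trace dumping is performing id=[cdmp_20161124182740]
Thu Nov 24 18:28:29 EAT 2016
Errors in file /oracle/admin/essbj/udump/essbj2_ora_18076.trc:
ORA-07445: exception encountered: core dump [kghsrch()+128] [SIGSEGV] [Address not mapped to object] [0xFFFFFFFFFFFFFFF0] [] []
"""

counts, traces = summarize_7445(SAMPLE)
print(counts)
print(traces)
```

Grouping by the first bracketed argument makes it easy to confirm that every occurrence fails at the same place (`kghsrch()+128` throughout this excerpt) rather than at several unrelated call sites.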


Node 1 alert log:

Fri Nov 25 11:16:05 EAT 2016

Trace dumping is performing id=[cdmp_20161125111604]

Fri Nov 25 12:00:05 EAT 2016

Thread 1 advanced to log sequence 34397 (LGWR switch)

  Current log# 8 seq# 34397 mem# 0: /vgbj03/oradata/essbj/vgbj03_1_rd81.log

  Current log# 8 seq# 34397 mem# 1: /vgbj04/oradata/essbj/vgbj04_1_rd82.log

Fri Nov 25 12:00:51 EAT 2016

Errors in file /oracle/admin/essbj/bdump/essbj1_j000_11783.trc:

ORA-12012: error on auto execute of job 281

ORA-00031: session marked for kill

ORA-06512: at "SYS.KILL_LONG_SQL", line 67

ORA-06512: at line 1

Fri Nov 25 12:17:15 EAT 2016

Trace dumping is performing id=[cdmp_20161125121714]

Fri Nov 25 12:37:21 EAT 2016

Errors in file /oracle/admin/essbj/bdump/essbj1_j000_5909.trc:

ORA-12012: error on auto execute of job 281

ORA-00031: session marked for kill

ORA-06512: at "SYS.KILL_LONG_SQL", line 67

ORA-06512: at line 1

Fri Nov 25 12:50:53 EAT 2016

Trace dumping is performing id=[cdmp_20161125125053]

Fri Nov 25 13:33:53 EAT 2016

Trace dumping is performing id=[cdmp_20161125133352]


I examined the .trc files generated on node 2. They point to a single SQL statement:

ksedmp: internal or fatal error

ORA-07445: exception encountered: core dump [kghsrch()+128] [SIGSEGV] [Address not mapped to object] [0xFFFFFFFFFFFFFFF0] [] []

Current SQL statement for this session:

SELECT OCCUPY_STAFF_ID, OCCUPY_DEPART_ID     FROM TF_R_TEMPOCCUPY                      WHERE RES_NO = :1                       AND RES_TYPE_CODE = :2         AND RSRV_TAG1 = :3          


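The plan was to run this SQL on its own while watching the alert log with tail -f, to see whether the statement itself triggers the 7445 bug. Because the SQL uses bind variables, we first need to retrieve the captured bind values and then run the statement with them.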

alter session set nls_date_format = 'yyyy-mm-dd,hh24:mi:ss';
set linesize 400
col sql_id format a20
col name format a20
col datatype_string format a14
col value_string format a20
select sql_id, name, datatype_string, last_captured, value_string
from v$sql_bind_capture where sql_id='&sql_id' order by last_captured, position;
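
Once the bind values have been captured, they have to be written into the statement by hand. The following Python sketch shows one way to do that substitution. It is an illustration only: the helper names `to_literal` and `substitute_binds` are made up here, the values in `EXAMPLE_BINDS` are hypothetical (the real captured values are not given in this case), and the NAME column is assumed to hold the placeholder with its leading colon (for example `:1`).

```python
import re

# Datatypes whose values must be quoted as SQL string literals.
# Assumption: only the common character types are covered here.
STRING_TYPES = ("VARCHAR2", "CHAR", "NVARCHAR2", "NCHAR")


def to_literal(datatype_string, value_string):
    """Render one captured bind value as a SQL literal."""
    if value_string is None:
        # Value was not captured; note that '= NULL' never matches in SQL.
        return "NULL"
    if datatype_string.upper().startswith(STRING_TYPES):
        return "'" + value_string.replace("'", "''") + "'"
    return value_string  # e.g. NUMBER: use the captured text as-is


def substitute_binds(sql_text, binds):
    """Replace :1, :2, ... with literals; binds maps name -> (datatype, value)."""
    def repl(match):
        name = match.group(0)
        if name not in binds:
            return name  # leave unknown placeholders untouched
        return to_literal(*binds[name])
    # Simple pattern: does not protect colons inside quoted strings.
    return re.sub(r":\w+", repl, sql_text)


SQL_TEXT = (
    "SELECT OCCUPY_STAFF_ID, OCCUPY_DEPART_ID FROM TF_R_TEMPOCCUPY "
    "WHERE RES_NO = :1 AND RES_TYPE_CODE = :2 AND RSRV_TAG1 = :3"
)

# Hypothetical values for illustration only, not taken from the real system.
EXAMPLE_BINDS = {
    ":1": ("VARCHAR2(32)", "RES0001"),
    ":2": ("VARCHAR2(32)", "0"),
    ":3": ("NUMBER", "1"),
}

print(substitute_binds(SQL_TEXT, EXAMPLE_BINDS))
```

A statement with literals is parsed as a different cursor from the original bind-variable statement, so running it is only an approximation of what the application session does.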


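After substituting the bind values and running the SQL on its own, no error occurred. The MOS documents closest to the contents of the trace file are: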

SMON Terminates With ORA-7445 [Kghsrch()+128] (Doc ID 1189894.1)
Bug 10036960 : ORA-07445[KGHSRCH] AND ORA-07445[KGHLKREMF] FOLLOWED BY INSTANCE CRASH.
ORA-07445 [kghsrch] Associated With Session Kills (Doc ID 787914.1)
Instance Termination with ORA-07445 [kghsrch()+144], ORA-00600 [kghfrh:ds] (Doc ID 2128933.1)
However, the database has so far only reported errors and the instance has not gone down, so an exact match with the bugs in these MOS documents can be ruled out. In addition, the database is already at 10.2.0.5, which is the version the bug descriptions list as the fix version.


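1. The 07445 problem seen on this system only partly matches the bug information provided by Oracle;
2. The solutions offered do not fit the current database environment (for example, they recommend upgrading 10gR2 to 10.2.0.4 or 10.2.0.5, but the production database is already at 10.2.0.5);
3. The database has not experienced the instance crash described in the bugs, and the business continues to run normally.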

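The exception is therefore possibly related to an unexpected state in the in-memory structures of the objects the business accesses. Because it keeps generating trace files and cdump files, we recommended that the customer restart the instance raising the errors (instance 2), resume the business, and then check whether the 7445 errors come back. Once the customer provided a maintenance window, the instance was restarted.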

Tue Nov 29 21:38:37 EAT 2016

ALTER DATABASE OPEN

Block change tracking file is current.

Picked broadcast on commit scheme to generate SCNs

Tue Nov 29 21:38:38 EAT 2016

Sending CIC to internal enable redo thread

Tue Nov 29 21:38:38 EAT 2016

LGWR: STARTING ARCH PROCESSES

ARC0 started with pid=33, OS id=14097

Tue Nov 29 21:38:38 EAT 2016

ARC0: Archival started

ARC1: Archival started

LGWR: STARTING ARCH PROCESSES COMPLETE

ARC1 started with pid=34, OS id=14099

Tue Nov 29 21:38:38 EAT 2016

Thread 2 opened at log sequence 32257

  Current log# 12 seq# 32257 mem# 0: /vgbj03/oradata/essbj/vgbj03_1_rd121.log

  Current log# 12 seq# 32257 mem# 1: /vgbj04/oradata/essbj/vgbj04_1_rd122.log

Tue Nov 29 21:38:38 EAT 2016

ARC0: Becoming the 'no FAL' ARCH

ARC0: Becoming the 'no SRL' ARCH

Tue Nov 29 21:38:38 EAT 2016

Successful open of redo thread 2

Tue Nov 29 21:38:38 EAT 2016

ARC1: Becoming the heartbeat ARCH

Tue Nov 29 21:38:38 EAT 2016

Starting background process CTWR

CTWR started with pid=35, OS id=14101

Block change tracking service is active.

Tue Nov 29 21:38:38 EAT 2016

SMON: enabling cache recovery

Tue Nov 29 21:38:39 EAT 2016

Successfully onlined Undo Tablespace 4.

Tue Nov 29 21:38:39 EAT 2016

SMON: enabling tx recovery

Tue Nov 29 21:38:39 EAT 2016

Database Characterset is ZHS16GBK

Opening with internal Resource Manager plan

replication_dependency_tracking turned off (no async multimaster replication found)

Starting background process QMNC

QMNC started with pid=36, OS id=14127

Tue Nov 29 21:38:43 EAT 2016

Completed: ALTER DATABASE OPEN

Tue Nov 29 21:38:52 EAT 2016

ALTER SYSTEM SET service_names='essbj_db' SCOPE=MEMORY SID='essbj2';

Tue Nov 29 21:48:47 EAT 2016

ALTER SYSTEM SET service_names='essbj_db','essbj' SCOPE=MEMORY SID='essbj2';

Wed Nov 30 00:16:10 EAT 2016

ALTER SYSTEM ARCHIVE LOG

Wed Nov 30 00:16:10 EAT 2016

Thread 2 advanced to log sequence 32258 (LGWR switch)

  Current log# 4 seq# 32258 mem# 0: /vgbj04/oradata/essbj/vgbj04_1_rd41.log

  Current log# 4 seq# 32258 mem# 1: /vgbj03/oradata/essbj/vgbj03_1_rd42.log

Wed Nov 30 00:16:15 EAT 2016

ALTER SYSTEM ARCHIVE LOG

Wed Nov 30 00:16:17 EAT 2016

Thread 2 advanced to log sequence 32259 (LGWR switch)

  Current log# 3 seq# 32259 mem# 0: /vgbj04/oradata/essbj/vgbj04_1_rd31.log

  Current log# 3 seq# 32259 mem# 1: /vgbj03/oradata/essbj/vgbj03_1_rd32.log

Wed Nov 30 02:00:27 EAT 2016

Thread 2 advanced to log sequence 32260 (LGWR switch)

  Current log# 5 seq# 32260 mem# 0: /vgbj03/oradata/essbj/vgbj03_1_rd51.log

  Current log# 5 seq# 32260 mem# 1: /vgbj04/oradata/essbj/vgbj04_1_rd52.log

Wed Nov 30 06:13:11 EAT 2016

Thread 2 advanced to log sequence 32261 (LGWR switch)

  Current log# 10 seq# 32261 mem# 0: /vgbj03/oradata/essbj/vgbj03_1_rd101.log

  Current log# 10 seq# 32261 mem# 1: /vgbj04/oradata/essbj/vgbj04_1_rd102.log

No further related errors appeared.
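
The post-restart check can also be done mechanically rather than by reading the alert log by eye. The following Python sketch lists the ORA-07445 lines logged at or after a given time. The function names `parse_alert_timestamp` and `errors_since` are illustrative assumptions, and the parser only handles the timestamp format seen in the excerpts in this article.

```python
from datetime import datetime


def parse_alert_timestamp(line):
    """Parse 'Tue Nov 29 21:38:37 EAT 2016'; return None for other lines."""
    parts = line.split()
    if len(parts) != 6:
        return None
    # Drop the time zone token (parts[4]); strptime's %Z does not
    # reliably accept arbitrary zone abbreviations such as EAT.
    text = " ".join(parts[:4] + parts[5:])
    try:
        return datetime.strptime(text, "%a %b %d %H:%M:%S %Y")
    except ValueError:
        return None


def errors_since(alert_text, since, code="ORA-07445"):
    """Return (timestamp, line) pairs for `code` logged at or after `since`."""
    current = None
    hits = []
    for raw in alert_text.splitlines():
        line = raw.strip()
        ts = parse_alert_timestamp(line)
        if ts is not None:
            current = ts  # later lines belong to this timestamp
        elif line.startswith(code) and current is not None and current >= since:
            hits.append((current, line))
    return hits


# Lines copied from the excerpts in this article, used as sample input.
SAMPLE_LOG = """\
Thu Nov 24 19:42:47 EAT 2016
Errors in file /oracle/admin/essbj/udump/essbj2_ora_13863.trc:
ORA-07445: exception encountered: core dump [kghsrch()+128] [SIGSEGV] [Address not mapped to object] [0xFFFFFFFFFFFFFFF0] [] []
Tue Nov 29 21:38:37 EAT 2016
ALTER DATABASE OPEN
Block change tracking file is current.
Tue Nov 29 21:38:43 EAT 2016
Completed: ALTER DATABASE OPEN
"""

restart_time = datetime(2016, 11, 29, 21, 38, 37)
print(len(errors_since(SAMPLE_LOG, restart_time)))
```

Passing the time of `ALTER DATABASE OPEN` as `since` limits the result to errors raised by the restarted instance, so an empty list corresponds to the outcome described above.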