Alert-log errors reported through SMS alert filtering for a customer's reporting database, an Oracle 10.2.0.5.0 RAC system on HP-UX B11.31 IA64:
ORA-00604: error occurred at recursive SQL level 1
ORA-04031:unable to allocate 4120 bytes of shared memory ("shared pool","select f.file#, f.block#, f....","Typecheck","kgghteInit")
I asked the customer for details on the SGA, the shared pool, and host resources:
SQL> show parameter sga
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
lock_sga boolean TRUE
pre_page_sga boolean FALSE
sga_max_size big integer 60G
sga_target big integer 0
SQL> show parameter pool
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
buffer_pool_keep string
buffer_pool_recycle string
global_context_pool_size string
java_pool_size big integer 512M
large_pool_size big integer 512M
olap_page_pool_size big integer 0
shared_pool_reserved_size big integer 644245094
shared_pool_size big integer 12G
streams_pool_size big integer 416M
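The SGA is 60G in total and the shared pool is 12G. With that rough picture in mind, I sent the customer the ORA-04031 diagnostic script 4031_OK-ForAll.sql to collect the current memory usage. The script is as follows: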
/**********************************************************************
* File: 4031.sql
* Date: 2012/01/1
*
* Modifications:
* 2012/02/12 Changed v1
*********************************************************************/
spool spinfo.txt
SET PAGESIZE 1024
SET LINESIZE 2000
set echo off;
set feedback off;
set heading on;
set trimout on;
set trimspool on;
COL BYTES FORMAT 999999999999999
COL CURRENT_SIZE FORMAT 999999999999999
/* Script Run TimeStamp */
set serveroutput on;
exec dbms_output.put_line('Script Run TimeStamp');
select to_char(sysdate, 'dd-MON-yyyy hh24:mi:ss') "Script Run TimeStamp" from dual;
set serveroutput on;
exec dbms_output.put_line('Instance Startup Time');
/*Instance Startup time */
select to_char(startup_time, 'dd-MON-yyyy hh24:mi:ss') "Instance Startup Time" from v$instance;
/* shared pool related hidden parameter */
set serveroutput on;
exec dbms_output.put_line('shared pool related hidden parameter ');
col name format a40
col value format a80;
select nam.ksppinm NAME,val.KSPPSTVL VALUE from x$ksppi nam,x$ksppsv val where nam.indx = val.indx and nam.ksppinm like '%shared%' order by 1;
/* SUB Pool Number */
set serveroutput on;
exec dbms_output.put_line('SUB Pool Number ');
col 'Parameter' format a40
col 'Session Value' format a40;
col 'Instance Value' format a40;
select a.ksppinm "Parameter",
b.ksppstvl "Session Value",
c.ksppstvl "Instance Value"
from sys.x$ksppi a, sys.x$ksppcv b, sys.x$ksppsv c
where a.indx = b.indx and a.indx = c.indx
and a.ksppinm like '%_kghdsidx_count%';
/* Each Subpool Size */
set serveroutput on;
exec dbms_output.put_line('Each Subpool Size');
select ksmchidx poolnumber , sum(ksmchsiz) poolsize
from x$ksmsp
group by ksmchidx ;
/* Reserved Shared Pool 4031 information */
set serveroutput on;
exec dbms_output.put_line('Reserved Shared Pool 4031 information');
select REQUEST_FAILURES, LAST_FAILURE_SIZE from V$SHARED_POOL_RESERVED;
/* Reserved Shared Pool request and free-space information */
set serveroutput on;
exec dbms_output.put_line('Reserved Shared Pool request and free-space information');
select REQUESTS, REQUEST_MISSES, free_space, avg_free_size, free_count, max_free_size from V$SHARED_POOL_RESERVED;
/* Current SGA Buffer & Pool sizes */
set serveroutput on;
exec dbms_output.put_line('Current SGA Buffer Pool sizes');
select component, current_size from v$sga_dynamic_components;
/* Shared Pool Memory Allocations by Size */
set serveroutput on;
exec dbms_output.put_line('Shared Pool Memory Allocations by Size');
select name, bytes from v$sgastat
where pool = 'shared pool' and (bytes > 999999 or name = 'free memory')
order by bytes desc;
set serveroutput on;
exec dbms_output.put_line('show component of shared pool which is bigger than 10MB');
select name, round((bytes/1024/1024),0) "more than 10" from v$sgastat where pool='shared pool' and bytes > 10000000 order by bytes desc;
select sum(bytes) "SHARED POOL TOTAL SIZE" from v$sgastat where pool='shared pool';
/* Total Free of Shared Pool */
set serveroutput on;
exec dbms_output.put_line('Total Free(not Free) of Shared Pool ');
COL 'Total Shared Pool Usage' FORMAT 999999999999999
select sum(bytes)/1024/1024 "Free MB in Shared Pool" from v$sgastat where pool = 'shared pool' and name = 'free memory';
select sum(bytes)/1024/1024 "Not Free MB Shared Pool" from v$sgastat where pool = 'shared pool' and name != 'free memory';
/* current KGLH* usage */
set serveroutput on;
exec dbms_output.put_line('current KGLH* usage');
select name, bytes from v$sgastat where pool = 'shared pool' and name in ('KGLHD','KGLH0');
/* History KGLH* usage */
set serveroutput on;
exec dbms_output.put_line('History KGLHD usage');
select bytes/1024/1024 , s.snap_id, begin_interval_time START_TIME
from dba_hist_sgastat g, dba_hist_snapshot s
where name='KGLHD'
and pool='shared pool'
and trunc(begin_interval_time) >= to_date('30-12-2011', 'dd-mm-yyyy')
and s.snap_id = g.snap_id
and s.dbid = g.dbid
and s.instance_number = g.instance_number
order by 2;
set serveroutput on;
exec dbms_output.put_line('History KGLH0 usage');
select bytes/1024/1024 , s.snap_id, begin_interval_time START_TIME
from dba_hist_sgastat g, dba_hist_snapshot s
where name='KGLH0'
and pool='shared pool'
and trunc(begin_interval_time) >= to_date('30-12-2011', 'dd-mm-yyyy')
and s.snap_id = g.snap_id
and s.dbid = g.dbid
and s.instance_number = g.instance_number
order by 2;
/* History of Shared pool allocations in a specified Day */
set serveroutput on;
exec dbms_output.put_line('History of Shared pool allocations in a specified Day');
col name format a30
select n,
max(decode(to_char(begin_interval_time, 'hh24'), 1,bytes, null)) "1",
max(decode(to_char(begin_interval_time, 'hh24'), 2,bytes, null)) "2",
max(decode(to_char(begin_interval_time, 'hh24'), 3,bytes, null)) "3",
max(decode(to_char(begin_interval_time, 'hh24'), 4,bytes, null)) "4",
max(decode(to_char(begin_interval_time, 'hh24'), 5,bytes, null)) "5",
max(decode(to_char(begin_interval_time, 'hh24'), 6,bytes, null)) "6",
max(decode(to_char(begin_interval_time, 'hh24'), 7,bytes, null)) "7",
max(decode(to_char(begin_interval_time, 'hh24'), 8,bytes, null)) "8",
max(decode(to_char(begin_interval_time, 'hh24'), 9,bytes, null)) "9",
max(decode(to_char(begin_interval_time, 'hh24'), 10,bytes, null)) "10",
max(decode(to_char(begin_interval_time, 'hh24'), 11,bytes, null)) "11",
max(decode(to_char(begin_interval_time, 'hh24'), 12,bytes, null)) "12",
max(decode(to_char(begin_interval_time, 'hh24'), 13,bytes, null)) "13",
max(decode(to_char(begin_interval_time, 'hh24'), 14,bytes, null)) "14",
max(decode(to_char(begin_interval_time, 'hh24'), 15,bytes, null)) "15",
max(decode(to_char(begin_interval_time, 'hh24'), 16,bytes, null)) "16",
max(decode(to_char(begin_interval_time, 'hh24'), 17,bytes, null)) "17",
max(decode(to_char(begin_interval_time, 'hh24'), 18,bytes, null)) "18",
max(decode(to_char(begin_interval_time, 'hh24'), 19,bytes, null)) "19",
max(decode(to_char(begin_interval_time, 'hh24'), 20,bytes, null)) "20",
max(decode(to_char(begin_interval_time, 'hh24'), 21,bytes, null)) "21",
max(decode(to_char(begin_interval_time, 'hh24'), 22,bytes, null)) "22",
max(decode(to_char(begin_interval_time, 'hh24'), 23,bytes, null)) "23",
max(decode(to_char(begin_interval_time, 'hh24'), 0,bytes, null)) "0" -- hh24 yields 00-23, never 24
from (select '"'||name||'"' n, begin_interval_time, bytes from dba_hist_sgastat a, dba_hist_snapshot b
where pool='shared pool' and a.snap_id=b.snap_id
and a.dbid=b.dbid and a.instance_number=b.instance_number
and trunc(begin_interval_time) = trunc(sysdate-1))
group by n;
/* Each Subpool summary usage for free memory, may be slow, it depends on customer database workload */
set serveroutput on;
exec dbms_output.put_line('Each Subpool summary usage for free memory');
col subpool format a20
col name format a40
SELECT
subpool
, name
, SUM(bytes)
, ROUND(SUM(bytes)/1048576,2) MB
FROM (
SELECT
'shared pool ('||DECODE(TO_CHAR(ksmdsidx),'0','0 - Unused',ksmdsidx)||'):' subpool
, ksmssnam name
, ksmsslen bytes
FROM
x$ksmss
WHERE
ksmsslen > 0
AND LOWER(ksmssnam) LIKE LOWER('%free memory%')
)
GROUP BY
subpool
, name
ORDER BY
subpool ASC
, SUM(bytes) DESC ;
/* Memory fragment and chunk allocation like 0-1K,1-2K, may be slow, it depends on customer database workload */
set serveroutput on;
exec dbms_output.put_line('Memory fragment and chunk allocation like 0-1K,1-2K');
col SubPool format 999
col mb format 999,999
col name heading "Name"
/* TRUNC (not ROUND) keeps e.g. a 1500-byte chunk in the '1-2K' bucket */
SELECT ksmchidx "SubPool",
'sga heap(' || ksmchidx || ',0)' sga_heap,
ksmchcom chunkcomment,
DECODE(TRUNC(ksmchsiz / 1000),
0, '0-1K', 1, '1-2K', 2, '2-3K', 3, '3-4K', 4, '4-5K',
5, '5-6K', 6, '6-7K', 7, '7-8K', 8, '8-9K', 9, '9-10K',
'> 10K') "size",
COUNT(*),
ksmchcls status,
SUM(ksmchsiz) BYTES
FROM x$ksmsp
WHERE ksmchcom = 'free memory'
GROUP BY ksmchidx,
ksmchcls,
'sga heap(' || ksmchidx || ',0)',
ksmchcom,
DECODE(TRUNC(ksmchsiz / 1000),
0, '0-1K', 1, '1-2K', 2, '2-3K', 3, '3-4K', 4, '4-5K',
5, '5-6K', 6, '6-7K', 7, '7-8K', 8, '8-9K', 9, '9-10K',
'> 10K');
select to_char(sysdate,'YYYY-MM-DD HH24:MI:SS') "Script END TimeStamp" from dual;
spool off;
The output after running the script was:
NAME VALUE
---------------------------------------- --------------------------------------------------------------------------------
__shared_pool_size 12884901888
_all_shared_dblinks
_dm_max_shared_pool_pct 1
_enable_shared_pool_durations FALSE
_io_shared_pool_size 4194304
_shared_pool_max_size 0
_shared_pool_minsize_on FALSE
_shared_pool_reserved_min_alloc 4400
_shared_pool_reserved_pct 5
_shared_server_spare_param1
_shared_server_spare_param2
_shared_server_spare_param3
_skgxp_shared_port 0
hi_shared_memory_address 0
max_shared_servers
shared_memory_address 0
shared_pool_reserved_size 644245094
shared_pool_size 12884901888
shared_server_sessions
shared_servers 0
ERROR:
ORA-04031: unable to allocate 48 bytes of shared memory ("shared pool","BEGIN DBMS_OUTPUT.ENABLE(NUL...","parameters","kglpda")
REQUEST_FAILURES LAST_FAILURE_SIZE
---------------- -----------------
5679 4200
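Memory requests in the shared pool larger than _SHARED_POOL_RESERVED_MIN_ALLOC are served from the shared pool reserved area. The reserved area maintains its own free list and LRU list, does not keep recreatable chunks on its LRU list, and is unaffected when memory is freed in the ordinary shared pool. Here REQUEST_FAILURES is 5679 and LAST_FAILURE_SIZE (the size of the last failed request) is 4200, below the _SHARED_POOL_RESERVED_MIN_ALLOC value of 4400. REQUEST_FAILURES > 0 with LAST_FAILURE_SIZE < _SHARED_POOL_RESERVED_MIN_ALLOC indicates either a lack of contiguous memory in the shared pool or heavy hard parsing in the database. The latter is usually a bind variable problem, which shows up as an excessively high version count. We continued the investigation along these lines;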
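The two conditions described above can be checked directly. What follows is only a minimal sketch, assuming a SYS connection (the x$ tables are not visible otherwise). The cutoff of 20 child cursors and the ROWNUM limit are arbitrary illustrative values, not figures from the original investigation.

```sql
-- Compare the last failed request size with the reserved-area threshold.
-- Reuses the x$ksppi / x$ksppsv join from the diagnostic script.
SELECT r.request_failures,
       r.last_failure_size,
       TO_NUMBER(p.ksppstvl) AS reserved_min_alloc
FROM   v$shared_pool_reserved r,
       (SELECT val.ksppstvl
        FROM   x$ksppi nam, x$ksppsv val
        WHERE  nam.indx = val.indx
        AND    nam.ksppinm = '_shared_pool_reserved_min_alloc') p;

-- List the statements with the most child cursors (high version count).
-- The cutoff of 20 is an assumption chosen only for illustration.
SELECT *
FROM  (SELECT sql_id, version_count, loaded_versions, sharable_mem
       FROM   v$sqlarea
       WHERE  version_count > 20
       ORDER  BY version_count DESC)
WHERE  ROWNUM <= 10;
```

With the values captured in this case (REQUEST_FAILURES 5679, LAST_FAILURE_SIZE 4200, threshold 4400), the first query would report a failed request smaller than the threshold.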
The version-count check pointed to two SQL statements (7ng34ruy5awxq and 2g9nykfyk0a95) with a major impact;
SQL> select sql_id,child_number,BIND_MISMATCH from v$sql_shared_cursor where sql_id='2g9nykfyk0a95' and BIND_MISMATCH='Y' and rownum<10;
SQL_ID CHILD_NUMBER B
------------- ------------ -
2g9nykfyk0a95 4 Y
2g9nykfyk0a95 5 Y
2g9nykfyk0a95 30 Y
2g9nykfyk0a95 0 Y
2g9nykfyk0a95 54 Y
2g9nykfyk0a95 23 Y
2g9nykfyk0a95 27 Y
2g9nykfyk0a95 35 Y
2g9nykfyk0a95 46 Y
9 rows selected.
SQL> select count(*) from v$sql_shared_cursor where sql_id='2g9nykfyk0a95' and BIND_MISMATCH='Y' ;
COUNT(*)
----------
23
SQL> select position,LAST_CAPTURED,datatype_string from v$sql_bind_capture where sql_id='2g9nykfyk0a95' and rownum<50;
POSITION LAST_CAPTURE DATATYPE_STRING
---------- ------------ ------------------------------
1 VARCHAR2(128)
2 VARCHAR2(128)
3 VARCHAR2(128)
4 VARCHAR2(32)
5 VARCHAR2(128)
6 VARCHAR2(32)
7 TIMESTAMP
8 TIMESTAMP
9 VARCHAR2(32)
10 VARCHAR2(32)
11 VARCHAR2(32)
12 VARCHAR2(128)
13 VARCHAR2(128)
14 VARCHAR2(32)
15 VARCHAR2(32)
16 VARCHAR2(32)
17 VARCHAR2(32)
18 VARCHAR2(32)
19 VARCHAR2(32)
20 VARCHAR2(32)
21 VARCHAR2(32)
22 VARCHAR2(32)
23 VARCHAR2(32)
24 VARCHAR2(32)
25 VARCHAR2(32)
26 VARCHAR2(32)
27 VARCHAR2(32)
28 VARCHAR2(32)
29 VARCHAR2(32)
30 VARCHAR2(32)
31 NUMBER
1 VARCHAR2(128)
2 VARCHAR2(128)
3 VARCHAR2(128)
4 VARCHAR2(32)
5 VARCHAR2(128)
6 VARCHAR2(32)
7 TIMESTAMP
8 TIMESTAMP
9 VARCHAR2(32)
10 VARCHAR2(32)
11 VARCHAR2(32)
12 VARCHAR2(128)
13 VARCHAR2(128)
14 VARCHAR2(32)
15 VARCHAR2(32)
16 VARCHAR2(32)
17 VARCHAR2(2000)
18 VARCHAR2(32)
49 rows selected.
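In the capture above, position 17 appears once as VARCHAR2(32) and once as VARCHAR2(2000), which would be consistent with child cursors being created because bind lengths differ. The query below is a sketch that lists only the bind positions whose declared type varies between child cursors. It assumes the cursor is still in the library cache and that bind capture metadata is populated for it.

```sql
-- Bind positions of one statement that were declared with more than one type/length.
SELECT position,
       COUNT(DISTINCT datatype_string) AS distinct_types,
       MIN(max_length)                 AS min_len,
       MAX(max_length)                 AS max_len
FROM   v$sql_bind_capture
WHERE  sql_id = '2g9nykfyk0a95'
GROUP  BY position
HAVING COUNT(DISTINCT datatype_string) > 1
ORDER  BY position;
```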
After disabling the application involved, we continued the analysis to see whether anything else was contributing to the ORA-04031;
NAME BYTES
---------------------------------------- ----------------
obj stat memo 6235601184
free memory 1919164576
object level 1148667072
gcs resources 980982312
gcs shadows 426063424
sql area 401151376
db_block_hash_buckets 188743680
kglsim object batch 179096400
kglsim heap 173694528
CCursor 139486384
Cursor Stats 131995544
ges resource 110674136
library cache 101986224
PCursor 88613936
ges enqueues 87640800
sql area:PLSQL 78693424
ASH buffers 52428800
trace buffer 40927232
KQR L PO 36581592
Checkpoint queue 32776192
state objects 30602616
event statistics per sess 26095360
FileOpenBlock 15936504
ges big msg buffers 15936168
sessions 15163528
KCL name table 12582912
kgllk hash table 10231808
KGLS heap 10113784
simulator hash buckets 8404992
dbwriter coalesce buffer 8392704
gcs res hash bucket 8388608
ges reserved msg buffers 8240008
Heap0: KGL 7905976
object queue 7894320
row cache 7511248
transaction 6885376
KQR L SO 5958168
enqueue 5886080
parameter table block 5331280
procs: ksunfy 5120000
FileIdentificatonBlock 4571216
call 4535640
KCB Table Scan Buffer 4194816
kglsim hash table bkts 4194304
KSFD SGA I/O b 4190328
buffer handles 3600008
DML lock 3541016
KQR M SO 3300680
gcs affinity 3241728
ges process array 3181272
ges resource hash table 2883584
PL/SQL DIANA 2771128
trace buf hdr xtend 2736864
PL/SQL MPCODE 2626400
ges regular msg buffers 2622008
KTI SGA freea 2498560
KGSK scheduler 2358624
ktlbk state objects 2108880
object queue hash buckets 2101248
enqueue resources 1953128
replication session stats 1939520
SGA - SWRF Metric CHBs 1857960
db_files 1777912
kks stbkt 1572864
KEWS sesstat values 1432600
Wait History 1322400
pso tbs: ksunfy 1300000
Sort Segment 1272848
osp allocation 1195984
mvobj part des 1110240
KSXR receive buffers 1036000
SUBPOOL NAME SUM(BYTES) MB
-------------------- ---------------------------------------- ---------- ----------
shared pool (1): free memory 259671360 247.64
shared pool (2): free memory 252015608 240.34
shared pool (3): free memory 277114712 264.28
shared pool (4): free memory 275504440 262.74
shared pool (5): free memory 281692368 268.64
shared pool (6): free memory 268028160 255.61
shared pool (7): free memory 304926192 290.8
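At this point obj stat memo ranked first, consuming about 5.8 GB (roughly 5,947 MB) of memory. Across the seven subpools, the shared pool had only about 1,830 MB (roughly 1.79 GB) of free space left.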
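The per-subpool free-memory rows can be collapsed into a single total instead of being added by hand. This is a minimal sketch, assuming a SYS connection and reusing the x$ksmss columns that the diagnostic script already queries. Adding the MB column of the seven rows shown gives about 1,830 MB.

```sql
-- Total free memory across the in-use shared pool subpools (subpool 0 is unused).
SELECT COUNT(DISTINCT ksmdsidx)          AS subpools,
       SUM(ksmsslen)                     AS free_bytes,
       ROUND(SUM(ksmsslen) / 1048576, 2) AS free_mb
FROM   x$ksmss
WHERE  ksmssnam = 'free memory'
AND    ksmdsidx > 0
AND    ksmsslen > 0;
```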
SQL> SELECT * FROM
2 (SELECT NAME, BYTES/(1024*1024) MB
3 FROM V$SGASTAT
4 WHERE POOL = 'shared pool'
5 ORDER BY BYTES DESC)
6 WHERE ROWNUM <= 10;
NAME MB
-------------------------- ----------
obj stat memo 5955.40194
free memory 2503.24937
object level 1097.0509
gcs resources 935.537636
gcs shadows 406.325745
db_block_hash_buckets 180
kglsim object batch 170.799637
kglsim heap 165.64801
Cursor Stats 125.88076
ges resource 104.992348
SQL> select * from v$sgastat where name = 'obj stat memo';
POOL NAME BYTES
------------ -------------------------- ----------
shared pool obj stat memo 6244703856
SQL> select * from v$sgastat where name = 'obj stat memo';
POOL NAME BYTES
------------ -------------------------- ----------
shared pool obj stat memo 6245586216
SQL> /
POOL NAME BYTES
------------ -------------------------- ----------
shared pool obj stat memo 6245930952
Moreover, obj stat memo kept growing and was never released. We tried flushing the shared pool, but the allocation was unaffected:
SQL> alter system flush shared_pool;
SQL> select * from v$sgastat where name = 'obj stat memo';
POOL NAME BYTES
------------ -------------------------- ----------
shared pool obj stat memo 6343766208
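Repeating the v$sgastat query by hand shows growth over a few minutes. A longer trend could be read from AWR, which the diagnostic script already queries. The following is a sketch only: the seven-day window is an arbitrary assumption, and it presumes AWR snapshots are being collected on this system.

```sql
-- Size of 'obj stat memo' at each AWR snapshot, per RAC instance.
SELECT s.snap_id,
       s.instance_number,
       s.begin_interval_time,
       ROUND(g.bytes / 1048576) AS obj_stat_memo_mb
FROM   dba_hist_sgastat g, dba_hist_snapshot s
WHERE  g.name = 'obj stat memo'
AND    g.pool = 'shared pool'
AND    s.snap_id = g.snap_id
AND    s.dbid = g.dbid
AND    s.instance_number = g.instance_number
AND    s.begin_interval_time >= SYSDATE - 7   -- assumed window
ORDER  BY s.instance_number, s.snap_id;
```

A steadily rising obj_stat_memo_mb with no drops between instance restarts would match the leak pattern described here.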
With no solution in sight, we searched My Oracle Support and found a note whose description matches our symptoms:
ORA-04031 With Leak in "OBJ STAT MEMO" Allocations Seen in V$SGASTAT on 10.2.0.5 (Doc ID 1350050.1)
CAUSE
On 10.2.0.5 an architectural change was made to switch off the publishing of "obj stat del channel" messages by default. This can lead to excessive growth of "obj stat memo" memory allocation.
SOLUTION
On 10.2.0.5, and only for 10.2.0.5, we have introduced the hidden parameter :
_disable_objstat_del_broadcast
If you are seeing ORA-04031 related to the symptoms reported then this parameter can be set to FALSE and by doing so we will no longer see the growth of "obj stat memo" that potentially leads to ORA-04031.
This parameter has been instructed by development to be used as the solution to ORA-04031 with the symptoms reported. There is no patch fix and no patch fix will be made. The hidden parameter will not cause any problems to the database and it must not be accidentally left within the init/spfile when/if the database is upgraded as startup would fail with :
Set the hidden parameter to FALSE:
ALTER SYSTEM SET "_disable_objstat_del_broadcast"=FALSE SCOPE=BOTH;
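Setting _disable_objstat_del_broadcast to FALSE does not adversely affect the database, so the change is safe to make. Note, however, that if the database is later upgraded, the parameter must first be removed from the parameter file; otherwise the database will fail to start.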
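After the change, the value in effect can be confirmed with the same x$ksppi / x$ksppsv join the diagnostic script uses. This is a sketch assuming a SYS connection. The commented RESET statement is one possible way to remove the entry from the spfile before an upgrade, not a step taken in the original case.

```sql
-- Current instance value of the hidden parameter (run as SYS).
SELECT nam.ksppinm  AS name,
       val.ksppstvl AS value
FROM   x$ksppi nam, x$ksppsv val
WHERE  nam.indx = val.indx
AND    nam.ksppinm = '_disable_objstat_del_broadcast';

-- Before an upgrade, the entry could be removed from the spfile, for example:
-- ALTER SYSTEM RESET "_disable_objstat_del_broadcast" SCOPE=SPFILE SID='*';
```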
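Because the memory still had not been released, we restarted the database instance and then monitored the allocation, which had been released:
SQL> select * from v$sgastat where name = 'obj stat memo';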
POOL NAME BYTES
------------ -------------------------- ----------
shared pool obj stat memo 102600
SQL> /
POOL NAME BYTES
------------ -------------------------- ----------
shared pool obj stat memo 143640
Summary: 1. Mainly to share the diagnostic script.
2. To share the approach taken to troubleshoot the fault.