客户报表库,HP-ux B11.31 IA64 FOR Oracle 10.2.0.5.0 RAC系统短信过滤alert日志告警

ORA-00604: error occurred at recursive SQL level 1

ORA-04031:unable to allocate 4120 bytes of shared memory ("shared pool","select f.file#, f.block#, f....","Typecheck","kgghteInit")

询问客户SGA、share pool、主机资源等相关情况

SQL> show parameter sga


NAME                                 TYPE        VALUE

------------------------------------ ----------- ------------------------------

lock_sga                             boolean     TRUE

pre_page_sga                         boolean     FALSE

sga_max_size                         big integer 60G

sga_target                           big integer 0

SQL> show parameter pool


NAME                                 TYPE        VALUE

------------------------------------ ----------- ------------------------------

buffer_pool_keep                     string

buffer_pool_recycle                  string

global_context_pool_size             string

java_pool_size                       big integer 512M

large_pool_size                      big integer 512M

olap_page_pool_size                  big integer 0

shared_pool_reserved_size            big integer 644245094

shared_pool_size                     big integer 12G

streams_pool_size                    big integer 416M

SGA总大小60G,share pool总大小12G。有了这个直观的感受,随即将4031收集相关信息的脚本4031_OK-ForAll.sql发给客户,收集现在内存使用情况,脚本内容如下:

/**********************************************************************

 * File: 4031.sql


 * Date: 2012/01/1

 *

 * Modifications:

 * 2012/02/12 Changed v1

*********************************************************************/

 

spool spinfo.txt

SET PAGESIZE 1024

SET LINESIZE 2000

set echo off;

set feedback off;

set heading on;

set trimout on;

set trimspool on;

COL BYTES FORMAT 999999999999999

COL CURRENT_SIZE FORMAT 999999999999999



/* Script Run TimeStamp */

set serveroutput on; 

exec dbms_output.put_line('Script Run TimeStamp');

select to_char(sysdate, 'dd-MON-yyyy hh24:mi:ss') "Script Run TimeStamp" from dual;


set serveroutput on; 

exec dbms_output.put_line('Instance Startup Time');



/*Instance Startup time */

select to_char(startup_time, 'dd-MON-yyyy hh24:mi:ss') "Instance Startup Time" from v$instance;



/* shared pool related hidden parameter */

set serveroutput on; 

exec dbms_output.put_line('shared pool related hidden parameter ');


col name format a40

col value format a80;

select nam.ksppinm NAME,val.KSPPSTVL VALUE from x$ksppi nam,x$ksppsv val where nam.indx = val.indx and nam.ksppinm like '%shared%' order by 1;





/* SUB Pool Number */

set serveroutput on; 

exec dbms_output.put_line('SUB Pool Number ');

col 'Parameter' format a40

col 'Session Value' format a40;

col 'Instance Value' format a40;

select a.ksppinm "Parameter",

b.ksppstvl "Session Value",

c.ksppstvl "Instance Value"

from sys.x$ksppi a, sys.x$ksppcv b, sys.x$ksppsv c

where a.indx = b.indx and a.indx = c.indx

and a.ksppinm like '%_kghdsidx_count%';




/* Each Subpool Size */

set serveroutput on; 

exec dbms_output.put_line('Each Subpool Size');

select ksmchidx poolnumer , sum(ksmchsiz)  poolsize

from x$ksmsp

group by ksmchidx ;




/*  Researved Shared Pool 4031 information */

set serveroutput on; 

exec dbms_output.put_line('Researved Shared Pool 4031 information');

select REQUEST_FAILURES, LAST_FAILURE_SIZE from V$SHARED_POOL_RESERVED;



/* Reaserved Shared Pool Reserved 4031 information */

set serveroutput on; 

exec dbms_output.put_line('Reaserved Shared Pool 4031 information');

select REQUESTS, REQUEST_MISSES, free_space, avg_free_size, free_count, max_free_size from V$SHARED_POOL_RESERVED;



/* Current SGA Buffer & Pool sizes */

set serveroutput on; 

exec dbms_output.put_line('Current SGA Buffer Pool sizes');

select component, current_size from v$sga_dynamic_components;




/* Shared Pool Memory Allocations by Size */

set serveroutput on; 

exec dbms_output.put_line('Shared Pool Memory Allocations by Size');

select name, bytes from v$sgastat

where pool = 'shared pool' and (bytes > 999999 or name = 'free memory')

order by bytes desc;


set serveroutput on; 

exec dbms_output.put_line('show component of shared pool which is bigger than 10MB');


select name, round((bytes/1024/1024),0) "more than 10" from v$sgastat where pool='shared pool' and bytes > 10000000 order by bytes desc;


select sum(bytes) "SHARED POOL TOTAL SIZE" from v$sgastat where pool='shared pool';




/* Total Free of Shared Pool */

set serveroutput on; 

exec dbms_output.put_line('Total Free(not Free) of Shared Pool ');


COL 'Total Shared Pool Usage' FORMAT 999999999999999


select sum(bytes)/1024/1024 "Free MB in Shared Pool" from v$sgastat where pool = 'shared pool' and name = 'free memory';

select sum(bytes) "Not Free MB Shared Pool" from v$sgastat where pool = 'shared pool' and name != 'free memory';



/* current KGLH* usage */

set serveroutput on; 

exec dbms_output.put_line('current KGLH* usage');


select name, bytes from v$sgastat where pool = 'shared pool' and name in ('KGLHD','KGHL0');




/* Hisotry KGLH* usage */

set serveroutput on; 

exec dbms_output.put_line('Hisotry KGLH* usage');

select bytes/1024/1024 , s.snap_id, begin_interval_time START_TIME

  from dba_hist_sgastat g, dba_hist_snapshot s

  where name='KGLHD'

  and pool='shared pool'

  and trunc(begin_interval_time) >= '30-DEC-2011'

  and s.snap_id = g.snap_id

  order by 2;



set serveroutput on; 

exec dbms_output.put_line('Hisotry KGLH0* usage');

select bytes/1024/1024 , s.snap_id, begin_interval_time START_TIME

  from dba_hist_sgastat g, dba_hist_snapshot s

  where name='KGLH0'

  and pool='shared pool'

  and trunc(begin_interval_time) >= '30-DEC-2011'

  and s.snap_id = g.snap_id

  order by 2;

  



/*  History of Shared pool allocations in a speciifed Day*/

set serveroutput on; 

exec dbms_output.put_line('history of Shared pool allocations in a speciifed Day');

col name format a30

select n,

  max(decode(to_char(begin_interval_time, 'hh24'), 1,bytes, null)) "1",

  max(decode(to_char(begin_interval_time, 'hh24'), 2,bytes, null)) "2",

  max(decode(to_char(begin_interval_time, 'hh24'), 3,bytes, null)) "3",

  max(decode(to_char(begin_interval_time, 'hh24'), 4,bytes, null)) "4",

  max(decode(to_char(begin_interval_time, 'hh24'), 5,bytes, null)) "5",

  max(decode(to_char(begin_interval_time, 'hh24'), 6,bytes, null)) "6",

  max(decode(to_char(begin_interval_time, 'hh24'), 7,bytes, null)) "7",

  max(decode(to_char(begin_interval_time, 'hh24'), 8,bytes, null)) "8",

  max(decode(to_char(begin_interval_time, 'hh24'), 9,bytes, null)) "9",

  max(decode(to_char(begin_interval_time, 'hh24'), 10,bytes, null)) "10",

  max(decode(to_char(begin_interval_time, 'hh24'), 11,bytes, null)) "11",

  max(decode(to_char(begin_interval_time, 'hh24'), 12,bytes, null)) "12",

  max(decode(to_char(begin_interval_time, 'hh24'), 13,bytes, null)) "13",

  max(decode(to_char(begin_interval_time, 'hh24'), 14,bytes, null)) "14",

  max(decode(to_char(begin_interval_time, 'hh24'), 15,bytes, null)) "15",

  max(decode(to_char(begin_interval_time, 'hh24'), 16,bytes, null)) "16",

  max(decode(to_char(begin_interval_time, 'hh24'), 17,bytes, null)) "17",

  max(decode(to_char(begin_interval_time, 'hh24'), 18,bytes, null)) "18",

  max(decode(to_char(begin_interval_time, 'hh24'), 19,bytes, null)) "19",

  max(decode(to_char(begin_interval_time, 'hh24'), 20,bytes, null)) "20",

  max(decode(to_char(begin_interval_time, 'hh24'), 21,bytes, null)) "21",

  max(decode(to_char(begin_interval_time, 'hh24'), 22,bytes, null)) "22",

  max(decode(to_char(begin_interval_time, 'hh24'), 23,bytes, null)) "23",

  max(decode(to_char(begin_interval_time, 'hh24'), 24,bytes, null)) "24"

from (select '"'||name||'"' n, begin_interval_time, bytes from dba_hist_sgastat a, dba_hist_snapshot b

where pool='shared pool' and a.snap_id=b.snap_id

and to_char(begin_interval_time,'hh24:mi') between '01:00' and '24:00'

and to_char(begin_interval_time,'dd-mon')  = to_char(sysdate-1, 'dd-mon'))

group by n;




/* Each Subpool sumary usage for free memory , may slow ,it depends on custoemr database workload   */

set serveroutput on; 

exec dbms_output.put_line('Each Subpool sumary usage for free memory');

col subpool format a20

col name format a40

SELECT

        subpool

      , name

      , SUM(bytes)

      , ROUND(SUM(bytes)/1048576,2) MB

    FROM (

        SELECT

            'shared pool ('||DECODE(TO_CHAR(ksmdsidx),'0','0 - Unused',ksmdsidx)||'):'      subpool

          , ksmssnam      name

         , ksmsslen      bytes

       FROM

           x$ksmss

       WHERE

           ksmsslen > 0

       AND LOWER(ksmssnam) LIKE LOWER('%free memory%')

   )

   GROUP BY

       subpool

     , name

   ORDER BY

       subpool    ASC

     , SUM(bytes) DESC ;

     



/* Memory  fragment and  chunk allocation like  0-1K,1-2K, may slow ,it depends on custoemr database workload */


set serveroutput on; 

exec dbms_output.put_line('Memory  fragment and  chunk allocation like  0-1K,1-2K');


col SubPool format 999

col mb format 999,999

col name heading "Name"


SELECT ksmchidx "SubPool",

       'sga heap(' || ksmchidx || ',0)' sga_heap,

       ksmchcom chunkcomment,

       DECODE(ROUND(ksmchsiz / 1000),

              0,

              '0-1K',

              1,

              '1-2K',

              2,

              '2-3K',

              3,

              '3-4K',

              4,

              '4-5K',

              5,

              '5-6k',

              6,

              '6-7k',

              7,

              '7-8k',

              8,

              '8-9k',

              9,

              '9-10k',

              '> 10K'

              ) "size",

       COUNT(*),

       ksmchcls status,

       SUM(ksmchsiz) BYTES

  FROM x$ksmsp

 WHERE ksmchcom = 'free memory'

 GROUP BY ksmchidx,

          ksmchcls,

          'sga heap(' || ksmchidx || ',0)',

          ksmchcom,

          ksmchcls,

          DECODE(ROUND(ksmchsiz / 1000),

                 0,

                 '0-1K',

                 1,

                 '1-2K',

                 2,

                 '2-3K',

                 3,

                 '3-4K',

                 4,

                 '4-5K',

                 5,

                 '5-6k',

                 6,

                 '6-7k',

                 7,

                 '7-8k',

                 8,

                 '8-9k',

                 9,

                 '9-10k',

                 '> 10K');





select to_char(sysdate,'YYYY-MM-DD HH24:MI:SS') "Script END TimeStamp"  from dual;



spool off;


执行后结果为:

NAME                                     VALUE

---------------------------------------- --------------------------------------------------------------------------------

__shared_pool_size                       12884901888

_all_shared_dblinks

_dm_max_shared_pool_pct                  1

_enable_shared_pool_durations            FALSE

_io_shared_pool_size                     4194304

_shared_pool_max_size                    0

_shared_pool_minsize_on                  FALSE

_shared_pool_reserved_min_alloc          4400

_shared_pool_reserved_pct                5

_shared_server_spare_param1

_shared_server_spare_param2

_shared_server_spare_param3

_skgxp_shared_port                       0

hi_shared_memory_address                 0

max_shared_servers

shared_memory_address                    0

shared_pool_reserved_size                644245094

shared_pool_size                         12884901888

shared_server_sessions

shared_servers                           0

ERROR:

ORA-04031: unable to allocate 48 bytes of shared memory ("shared pool","BEGIN DBMS_OUTPUT.ENABLE(NUL...","parameters","kglpda")

REQUEST_FAILURES LAST_FAILURE_SIZE

---------------- -----------------

            5679              4200

shared pool中内存大于_SHARED_POOL_RESERVED_MIN_ALLOC 将放入shared pool保留池,保留池维护一个单独的freelist,lru,并且不会在lru列表存recreatable类型chunks,普通shared pool的释放与shared pool保留池无关。

REQUEST_FAILURES>0且LAST_FAILURE_SIZE(最后请求内存大小)<_SHARED_POOL_RESERVED_MIN_ALLOC,表示在 shared pool中缺少连续内存,或者库里面有大量的硬解析造成的。一般是绑定变量问题。也就是version count过高。我们顺着这个思路往下继续查看;


ORA-04031:unable to allocate 4120 bytes of shared memory ‘obj stat mem’故障解决_ORA

也就是说两个sql(7ng34ruy5awxq、2g9nykfyk0a95)有重大影响;

 

SQL> select sql_id,child_number,BIND_MISMATCH from v$sql_shared_cursor where sql_id='2g9nykfyk0a95' and BIND_MISMATCH='Y' and rownum<10;

SQL_ID        CHILD_NUMBER B
------------- ------------ -
2g9nykfyk0a95            4 Y
2g9nykfyk0a95            5 Y
2g9nykfyk0a95           30 Y
2g9nykfyk0a95            0 Y
2g9nykfyk0a95           54 Y
2g9nykfyk0a95           23 Y
2g9nykfyk0a95           27 Y
2g9nykfyk0a95           35 Y
2g9nykfyk0a95           46 Y

9 rows selected.

SQL> select count(*) from v$sql_shared_cursor where sql_id='2g9nykfyk0a95' and BIND_MISMATCH='Y' ;

  COUNT(*)
----------
        23

SQL>select position,LAST_CAPTURED,datatype_string from v$sql_bind_capture where sql_id='2g9nykfyk0a95' and rownum<50


  POSITION LAST_CAPTURE DATATYPE_STRING

---------- ------------ ------------------------------

         1              VARCHAR2(128)

         2              VARCHAR2(128)

         3              VARCHAR2(128)

         4              VARCHAR2(32)

         5              VARCHAR2(128)

         6              VARCHAR2(32)

         7              TIMESTAMP

         8              TIMESTAMP

         9              VARCHAR2(32)

        10              VARCHAR2(32)

        11              VARCHAR2(32)

        12              VARCHAR2(128)

        13              VARCHAR2(128)

        14              VARCHAR2(32)

        15              VARCHAR2(32)

        16              VARCHAR2(32)

        17              VARCHAR2(32)

        18              VARCHAR2(32)

        19              VARCHAR2(32)

        20              VARCHAR2(32)

        21              VARCHAR2(32)

        22              VARCHAR2(32)

        23              VARCHAR2(32)

        24              VARCHAR2(32)

        25              VARCHAR2(32)

        26              VARCHAR2(32)

        27              VARCHAR2(32)

        28              VARCHAR2(32)

        29              VARCHAR2(32)

        30              VARCHAR2(32)

        31              NUMBER

         1              VARCHAR2(128)

         2              VARCHAR2(128)

         3              VARCHAR2(128)

         4              VARCHAR2(32)

         5              VARCHAR2(128)

         6              VARCHAR2(32)

         7              TIMESTAMP

         8              TIMESTAMP

         9              VARCHAR2(32)

        10              VARCHAR2(32)

        11              VARCHAR2(32)

        12              VARCHAR2(128)

        13              VARCHAR2(128)

        14              VARCHAR2(32)

        15              VARCHAR2(32)

        16              VARCHAR2(32)

        17              VARCHAR2(2000)

        18              VARCHAR2(32)


49 rows selected.

禁用相关应用后,接着往下分析,看看还有没有别的地方引起的该4031问题;

NAME                                                BYTES

---------------------------------------- ----------------

obj stat memo                                  6235601184

free memory                                    1919164576

object level                                   1148667072

gcs resources                                   980982312

gcs shadows                                     426063424

sql area                                        401151376

db_block_hash_buckets                           188743680

kglsim object batch                             179096400

kglsim heap                                     173694528

CCursor                                         139486384

Cursor Stats                                    131995544

ges resource                                    110674136

library cache                                   101986224

PCursor                                          88613936

ges enqueues                                     87640800

sql area:PLSQL                                   78693424

ASH buffers                                      52428800

trace buffer                                     40927232

KQR L PO                                         36581592

Checkpoint queue                                 32776192

state objects                                    30602616

event statistics per sess                        26095360

FileOpenBlock                                    15936504

ges big msg buffers                              15936168

sessions                                         15163528

KCL name table                                   12582912

kgllk hash table                                 10231808

KGLS heap                                        10113784

simulator hash buckets                            8404992

dbwriter coalesce buffer                          8392704

gcs res hash bucket                               8388608

ges reserved msg buffers                          8240008

Heap0: KGL                                        7905976

object queue                                      7894320

row cache                                         7511248

transaction                                       6885376

KQR L SO                                          5958168

enqueue                                           5886080

parameter table block                             5331280

procs: ksunfy                                     5120000

FileIdentificatonBlock                            4571216

call                                              4535640

KCB Table Scan Buffer                             4194816

kglsim hash table bkts                            4194304

KSFD SGA I/O b                                    4190328

buffer handles                                    3600008

DML lock                                          3541016

KQR M SO                                          3300680

gcs affinity                                      3241728

ges process array                                 3181272

ges resource hash table                           2883584

PL/SQL DIANA                                      2771128

trace buf hdr xtend                               2736864

PL/SQL MPCODE                                     2626400

ges regular msg buffers                           2622008

KTI SGA freea                                     2498560

KGSK scheduler                                    2358624

ktlbk state objects                               2108880

object queue hash buckets                         2101248

enqueue resources                                 1953128

replication session stats                         1939520

SGA - SWRF Metric CHBs                            1857960

db_files                                          1777912

kks stbkt                                         1572864

KEWS sesstat values                               1432600

Wait History                                      1322400

pso tbs: ksunfy                                   1300000

Sort Segment                                      1272848

osp allocation                                    1195984

mvobj part des                                    1110240

KSXR receive buffers                              1036000


SUBPOOL              NAME                                     SUM(BYTES)         MB

-------------------- ---------------------------------------- ---------- ----------

shared pool (1):     free memory                               259671360     247.64

shared pool (2):     free memory                               252015608     240.34

shared pool (3):     free memory                               277114712     264.28

shared pool (4):     free memory                               275504440     262.74

shared pool (5):     free memory                               281692368     268.64

shared pool (6):     free memory                               268028160     255.61

shared pool (7):     free memory                               304926192      290.8


期间我们看见obj stat memo排名第一,消耗内存资源5.9G。七个子池,大概share pool剩余空间1.83GB

SQL> SELECT * FROM

  2  (SELECT NAME, BYTES/(1024*1024) MB

  3  FROM V$SGASTAT

  4  WHERE POOL = 'shared pool'

  5  ORDER BY BYTES DESC)

  6  WHERE ROWNUM <= 10;


NAME                               MB

-------------------------- ----------

obj stat memo              5955.40194

free memory                2503.24937

object level                1097.0509

gcs resources              935.537636

gcs shadows                406.325745

db_block_hash_buckets             180

kglsim object batch        170.799637

kglsim heap                 165.64801

Cursor Stats                125.88076

ges resource               104.992348

SQL> 

SQL> 

SQL> 

SQL>  select * from v$sgastat where name = 'obj stat memo';


POOL         NAME                            BYTES

------------ -------------------------- ----------

shared pool  obj stat memo              6244703856

SQL> 

SQL> 

SQL> jselect * from v$sgastat where name = 'obj stat memo';

SP2-0734: unknown command beginning "jselect * ..." - rest of line ignored.

SQL>  select * from v$sgastat where name = 'obj stat memo';


POOL         NAME                            BYTES

------------ -------------------------- ----------

shared pool  obj stat memo              6245586216

SQL> /


POOL         NAME                            BYTES

------------ -------------------------- ----------

shared pool  obj stat memo              6245930952

且obj stat memo一直在增长,无法释放。尝试使用flush share_pool但是obj无视

SQL> alter system flush shared_pool;

SQL> select * from v$sgastat where name = 'obj stat memo';


POOL         NAME                            BYTES

------------ -------------------------- ----------

shared pool  obj stat memo              6343766208


无解只好求助support进行搜索,找到一篇

ORA-04031 With Leak in "OBJ STAT MEMO" Allocations Seen in V$SGASTAT on 10.2.0.5 (文档 ID 1350050.1)描述一致;

CAUSE

On 10.2.0.5 an architectural change was made to switch off the publishing of "obj stat del channel" messages by default.  This can lead to excessive growth of "obj stat memo" memory allocation.


SOLUTION

On 10.2.0.5, and only for 10.2.0.5, we have introduced the hidden parameter :

_disable_objstat_del_broadcast

If you are seeing ORA-04031 related to the symptoms reported then this parameter can be set to FALSE and by doing so we will no longer see the growth of "obj stat memo" that potentially leads to ORA-04031.


This parameter has been instructed by development to be used as the solution to ORA-04031 with the symptoms reported. There is no patch fix and no patch fix will be made.  The hidden parameter will not cause any problems to the database and it must not be accidentally left within the init/spfile when/if the database is upgraded as startup would fail with :

将原隐患参数设置false;

ALTER SYSTEM SET "_disable_objstat_del_broadcast"=FALSE SCOPE=BOTH;

_disable_objstat_del_broadcast为false并不会对数据库造成影响,可以修改,但是请注意假如之后数据库进行升级,需要在参数文件中取消该参数,否则数据库无法正常启动。

因为一直没有释放,将数据库实例重启后,监控该资源情况,得到释放;

SQL>select * from v$sgastat where name = 'obj stat memo';


POOL         NAME                            BYTES

------------ -------------------------- ----------

shared pool  obj stat memo                  102600


SQL> /


POOL         NAME                            BYTES

------------ -------------------------- ----------

shared pool  obj stat memo                  143640


总结:1,主要是分享一下相关脚本。

    2,分享一下解决故障的思路问题。