cache buffer chain latch

转载

xjsunjie 2011-07-06 14:39:44 博主文章分类：数据库与大数据

文章标签 休闲 buffer chain cache latch 文章分类 Oracle 数据库

cache buffers chains锁存器的争用原因一：低效率的SQL语句

在某些环境中，应用程序打开执行相同的低效率SQL语句的多个并发会话，这些SQL语句都设法得到相同的数据集。较小的逻辑读意味着较少的latch get操作，从而减少锁存器争用并改善性能。

每次执行都带有高 BUFFER_GETS(逻辑读取)的SQL语句是主要的原因。

>> cache buffers chains锁存器的争用原因二：热块

当多个会话重复访问一个或多个由同一个子cache buffers chains锁存器保护的块时，热块就会产生。当多个会话争用cache buffers chains锁存器时，找出是否有热块的最好的方法是检查latch free等待事件的P1RAW参数值。

XXXX数据库优化调整建议

运行现象和故障描述

10月22日中午之后，XXXX数据库系统性能急剧恶化，出现大量的并发性冲突，表现为等待大量的latch free，latch: cache buffer chain以及latch: library cache 。操作系统CPU消耗达到100%，系统性能响应极为缓慢。

下面是运行期间的AWR报告：

Per Second

Per Transaction

Redo size:

114,459.47

6,101.89

Logical reads:

488,049.46

26,018.15

Block changes:

444.92

23.72

Physical reads:

2,637.96

140.63

Physical writes:

507.44

27.05

User calls:

616.09

32.84

Parses:

388.48

20.71

Hard parses:

14.67

0.78

Sorts:

154.18

8.22

Logons:

5.88

0.31

Executes:

651.57

34.74

Transactions:

18.76

% Blocks changed per Read:

0.09

Recursive Call %:

69.88

Rollback per transaction %:

2.05

Rows per Sort:

126.60

Instance Efficiency Percentages (Target 100%)

Buffer Nowait %:

100.00

Redo NoWait %:

100.00

Buffer Hit %:

99.55

In-memory Sort %:

100.00

Library Hit %:

98.80

Soft Parse %:

96.22

Execute to Parse %:

40.38

Latch Hit %:

88.19

Parse CPU to Parse Elapsd %:

7.39

% Non-Parse CPU:

97.93

Shared Pool Statistics

Begin

End

Memory Usage %:

85.74

88.10

% SQL with executions>1:

42.86

44.45

% Memory for SQL w/exec>1:

70.86

78.28

Top 5 Timed Events

Event

Waits

Time(s)

Avg Wait(ms)

% Total Call Time

Wait Class

latch: cache buffers chains

385,694

58,616

152

25.5

Concurrency

latch free

151,833

32,221

212

14.0

Other

CPU time

13,749

6.0

latch: library cache

45,698

3,226

1.4

Concurrency

direct path write temp

238,321

2,299

1.0

User I/O

从latch信息可以看到：

cache buffers chains 2,498,511,076 13.36 0.00 58671 21,510,689 0.40

cache buffers lru chain 378,882 0.55 0.08 9 22,568,763 1.07

library cache 34,822,373 0.59 0.22 3226 194,268 1.79

row cache objects 147,765,275 2.52 0.00 84 2,277 0.04

simulator lru latch 46,830,334 1.20 0.27 32216 2,168,754 0.16

这些相关的latch冲突比较高，尤其对于cache buffer chain和simulator lru latch冲突几乎频繁发生，并且获取时间很长，通过大量spin消耗CPU时间。

以下表格消耗大量CPU并且引发LATCH冲突：

Segments by Logical Reads

Total Logical Reads: 1,758,696,980
Captured Segments account for 94.8% of Total

Owner

Tablespace Name

Object Name

Subobject Name

Obj. Type

Logical Reads

%Total

NBAIC

GSNEW

PK_BM_GLWG

INDEX

996,334,272

56.65

NBAIC

PARTITION_TBS

HK_JD

HK_JD_P12

TABLE PARTITION

270,731,328

15.39

NBAIC

GSNEW

HK_SSJH_HK

TABLE

159,548,544

9.07

WLJJJG

GSNEW

WL_SSJG

TABLE

58,100,944

3.30

NBAIC

GSNEW

SYS_SPLC

TABLE

19,291,264

1.10

以下SQL语句消耗大量CPU并且引发LATCH FREE，消耗CPU资源。

另外从SQL信息可以看到引发大量cache buffer chain以及simulator lru latch冲突几乎都由相同的SQL语句引起，这些语句极度消耗CPU资源，从而导致系统响应缓慢。

Select * from (select row_.*, rownum rownum_ from (select a.xh, a.jhmc, a.kssj, a.jssj, a.fbr, a.fbsj, a.zrr, a.jhnr, a.gxdw, (select count(b.nbxh) from hk_ssjh_hk b, hk_jd m, bm_glwg n where a.xh=b.ssjhxh and b.wcbj=’0′ and b.nbxh=m.nbxh and m.glwg = n.code(+) and m.gxdw like ‘330281%’ ) jhhk, (select count(b.nbxh) from hk_ssjh_hk b, hk_jd m, bm_glwg n where a.xh=b.ssjhxh and b.wcbj=’1′ and b.nbxh=m.nbxh and m.glwg = n.code(+) and m.gxdw like ‘330281%’ ) ywchk, (select count(wzxh) from hk_ssjh_wz c, hk_wzjy m where a.xh=c.ssjhxh and c.wcbj=’0′ and c.wzxh = m.xh and m.gxdw like ‘330281%’ ) jhwz, (select count(wzxh) from hk_ssjh_wz c, hk_wzjy m where a.xh=c.ssjhxh and c.wcbj=’1′ and c.wzxh = m.xh and m.gxdw like ‘330281%’ ) ywcwz from hk_ssjh a where 1 = 1 and a.gxdw like ‘330281%’ and a.gxdw = ‘33028103′ order by a.fbsj desc )row_ where rownum<=20) where rownum_>0

类似的语句几乎消耗了系统80%以上的CPU资源，这些语句同样引起了大量buffer cache冲突。

该语句写法存在问题，存在着大量过多的cache访问，频繁进行latch锁定，从而导致系统故障。

故障诊断：

从以上现象和数据中，SQL语句是引起性能问题的最主要原因。高消耗语句的高并发导致系统资源极度紧张，从而引发一些Oracle在高负载下面的Oracle bug，使业务系统性能更加缓慢。

Oracle在10.2.0.3下面的相关bug：

Latch Contention

6455161+

Higher CPU / Higher “cache buffer chains” latch gets / Higher “consistent gets” after truncate/Rebuild

4691237

High “library cache” latch gets from SQL using objects in PLSQL

5749075

High Requests on dc_rollback_segments. latch / US enqueue contention

5918642

Heavy latch contention with DB_CACHE_ADVICE on

6333663

Shared pool latch contention due to fragmentation of large pool

6356566

Memory leak / high CPU selecting from V$SQL_PLAN (affects statspack)

解决方案：

（一）、简单控制该类语句并发规模

该方法简单可靠，可以迅速实施。

（二）、系统优化方案和步骤

（1）、修改参数

关闭缓存建议：

Alter system set db_cache_advice=off scope=both

关闭内存自动调整：

Alter system set db_cache_size=3g scope=both

Alter system set shared_pool_size=2g

Alter system set sga_target=0 scope=both

调整dbwr参数

Alter system set db_writer_processes = 8 scope=both

修改一些内部参数：

Alter system set “_cursor_plan_unparse_enabled” = false scope=both

alter system set "_smu_debug_mode"=33554432 scope=both;

以上参数都可以在线修改。

（2）、检查是否存在truncate或者rebuild的索引

select owner||'.'||Object_name||'.'||subobject_name

from dba_objects

where object_id<>data_object_id

and object_type like 'INDEX%'

order by owner,object_name,subobject_name;

如果存在这些索引，删除索引并且重新创建这些索引。

（3）、修正SQL语句

主要该语句引起，需要选择另外一种写法。

（4）、调度时间应用Oracle patchset。

在合适的时间段给Oracle应用patchset。

总结：

虽然oracle 10.2.0.3中存在类似的bug不少，但是引发的根本在SQL语句效率低下引起。

上一篇：LVM逻辑卷管理

下一篇：oracle审计导致的系统性能故障一例

提问和评论都可以，用心的回复会被更多人看到评论

发布评论

相关文章

官方博客	全部文章	热门标签	班级博客
了解我们	网站地图	意见反馈

鸿蒙开发者社区	51CTO学堂
51CTO	软考资讯

cache buffer chain latch

cache buffer chain latch

51CTO博客