gv$active_session_history会每秒钟将数据库所有节点的Active Session采样一次,而dba_hist_active_sess_history则会将gv$active_session_history里的数据每10秒采样一次并持久化保存。
基于这个特征,我们可以通过分析dba_hist_active_sess_history的Session采样情况,来定位问题发生的准确时间范围,并且可以观察每个采样点的top event和top holder。
一些容易误解含义和需要关注的列说明
SEQ#
sid和serial#相同的event并不代表是同一个event等待,有时候查看等待事件需要关注这个列,这列值相同代表相同的event上的等待。
SESSION_STATE
waiting代表等待,此时结合event才可以说明等待在这个等待事件上,否则则是on cpu。这个问题在v$session上会经常出现,v$session.state不是waiting时,event代表上个已经经历过的等待,需要注意区分。
WAIT_TIME
当等于0时,表示当前在等待。当这个值是非0值时,代表上个event的等待时间。
TIME_WAITED
结合session_state, 当session_states值是waiting时,代表花在等待上的时间,单位是microseconds,如果采样的数据不止一行,则在最后一行统计累积时间。
BLOCKING_SESSION、BLOCKING_INST_ID
阻塞者会话和阻塞者实例,对于有些等待这个值是不准的,还是要根据具体的P1和P2值的含义去计算
TOP_LEVEL_SQL_ID
top level调用,有时候可以根据这个列查出是什么存储过程中调用的SQL。
找到ASH活动会话最高的SAMPLE_ID
select sample_id,sample_time,count(*) from v$active_session_history awhere sample_time between to_date('202303231334','yyyymmddhh24miss') and to_date('202303231336','yyyymmddhh24miss') group by sample_id,sample_time order by 3 desc;
查看TOP 等待和TOP SQL
select event,count(*) from v$active_session_history awhere sample_id in (235564310) group by event order by 2 desc;
select sql_id,count(*) from v$active_session_history awhere sample_id in (235564310) group by sql_id--,sample_time--,sql_exec_start order by 2 desc;
和昨天对比,TOP SQL执行量是否突变
select trunc(sample_time,'mi'),sql_id,count(*) from v$active_session_historywhere (sample_time between to_date('202303221300','yyyymmddhh24mi') and to_date('202303221340','yyyymmddhh24mi') or sample_time between to_date('202303231300','yyyymmddhh24mi') and to_date('202303231340','yyyymmddhh24mi'))and sql_id='90xxxxxxxxxx'group by sql_id,trunc(sample_time,'mi')order by 1;
看执行时间是否异常,是不是绑定变量的影响【查看一条SQL慢是否和绑定变量有关系】
隔五分钟执行两次,记录结果
select sysdate,sql_id,elapsed_time,buffer_gets,executions from v$sqlstats a where sql_id='90xxxxxxxxxx';
SYSDATE SQL_ID ELAPSED_TIME BUFFER_GETS EXECUTIONS
2023/3/28 10:31:18 90xxxxxxxxxx 768988772 113364669 1054100
2023/3/28 10:36:34 90xxxxxxxxxx 793996070 117662867 1092692
select (1092692-1054100) as excutions,round((117662867-113364669)/(1092692-1054100)) as per_gets,round((793996070-768988772)/(1092692-1054100)) as per_time from dual;EXCUTIONS PER_GETS PER_TIME 38592 111 648
和ASH的对比,如果和ASH慢的时间相差很大,说明是和绑定变量有关系
select * from (select sql_id,max(sample_time),max(sample_time - SQL_EXEC_START) as duration,sql_exec_id,sql_plan_hash_value,sql_opname from v$active_session_history awhere sample_time between to_date('202304171441','yyyymmddhh24miss') and to_date('202304171459','yyyymmddhh24miss') and sql_id in ('90xxxxxxxxxx') group by sql_id,sql_exec_id,sql_plan_hash_value,sql_opname ) where duration is not null order by 2
创建ASH临时分析表
依据gv$active_session_history
create table m_ash_0728 tablespace users as
select *
from gv$active_session_history
where sample_time between
to_timestamp('2023/07/28 13:00', 'yyyy/mm/dd hh24:mi') and
to_timestamp('2023/07/28 15:00', 'yyyy/mm/dd hh24:mi');
create table m_sqltext tablespace users as select * from gv$sql;
依据dba_hist_active_sess_history
create table m_ash_0728 tablespace users as
select *
from dba_hist_active_sess_history
where sample_time between
to_timestamp('2023/07/28 13:00', 'yyyy/mm/dd hh24:mi') and
to_timestamp('2023/07/28 15:00', 'yyyy/mm/dd hh24:mi');
create table m_sqltext tablespace users as select * from dba_hist_sqltext;
导出
exp userid=\'/ as sysdba\' file=ashdata.dmp tables='m_ash_0728','m_sqltext'
导入
imp \'/ as sysdba\' file=ashdata.dmp log=ashdata.log full=y
常用确认问题流程(干货)
1.查看ASH时间范围
set line 160 pages 999
col min_time for a25
col max_time for a25
alter session set nls_timestamp_format='yyyy-mm-dd hh24:mi:ss.ff';
select instance_number,
min(sample_time) min_time,
max(sample_time) max_time,
count(*) session_count
from (select *
from dba_hist_active_sess_history
where sample_time between
to_timestamp('2023/06/17 09:00', 'yyyy/mm/dd hh24:mi') and
to_timestamp('2023/06/17 12:00', 'yyyy/mm/dd hh24:mi'))
group by instance_number
order by instance_number;
2.确认问题发生的精确时间范围
set line 160 pages 999
col sample_time for a25
alter session set nls_timestamp_format='yyyy-mm-dd hh24:mi:ss.ff';
select instance_number, sample_id, sample_time, count(*) session_count
from (select *
from dba_hist_active_sess_history
where sample_time between
to_timestamp('2023/06/17 09:00', 'yyyy/mm/dd hh24:mi') and
to_timestamp('2023/06/17 12:00', 'yyyy/mm/dd hh24:mi')) t
group by instance_number, sample_id, sample_time
order by instance_number, sample_time;
3.确定每个采样点的 top n event
Select t.sample_id,
t.sample_time,
t.instance_number,
t.event,
t.session_state,
t.c session_count
From (Select t.*,
rank() over(Partition By instance_number, sample_time Order By c Desc) r
From (Select /*+ parallel(8) */
t.*,
Count(*) over(Partition By instance_number, sample_time, event) c,
row_number() over(Partition By instance_number, sample_time, event Order By 1) r1
From (select *
from dba_hist_active_sess_history
where sample_time between
to_timestamp('2023/06/17 09:00',
'yyyy/mm/dd hh24:mi') and
to_timestamp('2023/06/17 12:00',
'yyyy/mm/dd hh24:mi')) t
Where sample_time >
to_timestamp('2023/06/17 01:25:01',
'yyyy-mm-dd hh24:mi:ss')
And sample_time <
to_timestamp('2023/06/17 12:42:05',
'yyyy-mm-dd hh24:mi:ss')) t
Where r1 = 1) t
Where r < 5
Order By instance_number, sample_time, r;
- 观察每个采样点的等待链
Select /*+parallel(8)*/
Level lv,
connect_by_isleaf isleaf,
connect_by_iscycle iscycle,
t.sample_id,
t.sample_time,
t.instance_number,
t.session_id,
t.sql_id,
t.session_type,
t.event,
t.session_state,
t.blocking_inst_id,
t.blocking_session,
t.blocking_session_status
From (select *
from dba_hist_active_sess_history
where sample_time between
to_timestamp('2023/06/17 09:00', 'yyyy/mm/dd hh24:mi') and
to_timestamp('2023/06/17 12:00', 'yyyy/mm/dd hh24:mi')) t
Where sample_time >
to_timestamp('2023-06-17 06.02.08', 'yyyy-mm-dd hh24:mi:ss')
And sample_time <
to_timestamp('2023-06-17 22.32.31', 'yyyy-mm-dd hh24:mi:ss')
Start With blocking_session Is Not Null
Connect By nocycle
Prior sample_time = sample_time
/*and ((prior sample_time) - sample_time between interval '-1'
second and interval '1' second)*/
And Prior blocking_inst_id = instance_number
And Prior blocking_session = session_id
And Prior blocking_session_serial# = session_serial#
Order Siblings By sample_time;
- 基于上一步的原理来找出每个采样点的最终top holder
Select t.lv,
t.iscycle,
t.sample_id,
t.sample_time,
t.instance_number,
t.session_id,
t.sql_id,
t.session_type,
t.event,
t.seq#,
t.session_state,
t.blocking_inst_id,
t.blocking_session,
t.blocking_session_status,
t.c blocking_session_count
From (Select t.*,
row_number() over(Partition By instance_number, sample_time Order By c Desc) r
From (Select t.*,
Count(*) over(Partition By instance_number, sample_time, session_id) c,
row_number() over(Partition By instance_number, sample_time, session_id Order By 1) r1
From (Select /*+ parallel(8) */
Level lv,
connect_by_isleaf isleaf,
connect_by_iscycle iscycle,
t.*
From (select *
from dba_hist_active_sess_history
where sample_time between
to_timestamp('2023/06/17 09:00',
'yyyy/mm/dd hh24:mi') and
to_timestamp('2023/06/17 12:00',
'yyyy/mm/dd hh24:mi')) t
Where sample_time >
to_timestamp('2023-06-17 02.02.08',
'yyyy-mm-dd hh24:mi:ss')
And sample_time <
to_timestamp('2023-06-17 22.32.31',
'yyyy-mm-dd hh24:mi:ss')
Start With blocking_session Is Not Null
Connect By nocycle
Prior sample_time = sample_time
/*and ((prior sample_time) - sample_time between interval '-1'
second and interval '1' second)*/
And Prior blocking_inst_id = instance_number
And Prior blocking_session = session_id
And Prior
blocking_session_serial# = session_serial#) t
Where t.isleaf = 1) t
Where r1 = 1) t
Where r < 5
Order By sample_time, r;
- 整体一个时间段的等待链
With ash As
(Select /*+ materialize*/
*
From (select *
from dba_hist_active_sess_history
where sample_time between
to_timestamp('2023/06/17 09:00', 'yyyy/mm/dd hh24:mi') and
to_timestamp('2023/06/17 12:00', 'yyyy/mm/dd hh24:mi'))
Where sample_time >
to_timestamp('2023-06-17 02:02:00', 'yyyy-mm-dd hh24:mi:ss')
And sample_time <
to_timestamp('2023-07-22 22:35:00', 'yyyy-mm-dd hh24:mi:ss')),
chains As
(Select instance_number,
session_id,
Level lvl,
sys_connect_by_path('inst ' || instance_number || ' ' || sql_id || ' ' ||
event,
' -> ') path,
connect_by_isleaf isleaf
From ash
Start With event = 'log file sync'
Connect By nocycle(Prior blocking_session = session_id
And Prior blocking_session_serial# = session_serial#
And Prior blocking_inst_id = instance_number
And Prior sample_id = sample_id
/*And ((Prior sample_time) - sample_time Between Interval '-1'
Second And Interval '1' Second)*/))
Select instance_number,
lpad(round(ratio_to_report(Count(*)) over() * 100) || '%', 5, ' ') "%This",
Count(*) samples,
path
From chains
Where isleaf = 1
Group By instance_number, path
Order By samples desc;
- 举例
基于ash查询enq: TX - row lock contention(类似队列锁性质的都可以用这个方法)
- 查看snap_id
select snap_id,
to_char(begin_interval_time, 'yyyymmdd hh24:mi:ss') begin_interval_time
from wrm$_snapshot
where begin_interval_time between to_date('20230601 19', 'yyyymmdd hh24') and
to_date('20230630 20', 'yyyymmdd hh24')
and instance_number = 1
order by 2;
- 根据snap_id, 结果按时间排序
select to_char(sample_time, 'yyyymmdd hh24:mi:ss'), event, count(*)
from dba_hist_active_sess_history
where snap_id = 188
and wait_class <> 'Idle'
and event = 'enq: TX - row lock contention'
and instance_number = 2
group by to_char(sample_time, 'yyyymmdd hh24:mi:ss'), event
order by 1;
- 代入上一步的时间点,查询blocking_session并按count排序
select blocking_session, count(*), event
from dba_hist_active_sess_history
where snap_id = 188
and to_char(sample_time, 'yyyymmdd hh24:mi:ss') = '20230601 12:00:06'
and wait_class <> 'Idle'
and event = 'enq: TX - row lock contention'
group by blocking_session, event
order by 2;
- 代入查到的sid,看在snap范围的blocking_session经历的event
select session_id,
user_id,
sql_id,
blocking_session,
event,
to_char(sample_time, 'yyyymmdd hh24:mi:ss')
from dba_hist_active_sess_history
where snap_id = 188
and to_char(sample_time, 'yyyymmdd hh24:mi:ss') <= '20230601 12:00:06'
and to_char(sample_time, 'yyyymmdd hh24:mi:ss') >= '20230630 11:00:26'
and wait_class <> 'Idle'
and instance_number = 1
and session_id = 2622;
- 发生阻塞的主要SQL
with tmp as
(select /*+ materialize */
distinct sql_id sqlid
from dba_hist_active_sess_history
where snap_id = 188
and to_char(sample_time, 'yyyymmdd hh24:mi:ss') <= '20230601 08:43:45'
and to_char(sample_time, 'yyyymmdd hh24:mi:ss') >= '20230630 08:04:39'
and wait_class <> 'Idle'
and instance_number = 1
and event = 'enq: TX - row lock contention')
select distinct parsing_schema_name, sql_id, sql_text
from v$sql
where sql_id in (select sqlid from tmp);
ash 案例一:
问题概述
数据库行锁未提交导致大量会话阻塞,手动kill会话之后,恢复正常。
问题分析
从ash裸数据分析,可以发现从10:11开始到11:19会话开始逐渐积压
INST_ID TO_CHAR(SAMPLE_TIME,'YYYYMMDD-HH24MI') COUNT(*)
---------- --------------------------------------- ----------
2 20240319-1000 14
2 20240319-1001 26
2 20240319-1002 21
2 20240319-1003 36
2 20240319-1004 28
2 20240319-1005 14
2 20240319-1006 13
2 20240319-1007 32
2 20240319-1008 39
2 20240319-1009 60
2 20240319-1010 98
2 20240319-1011 141
2 20240319-1012 206
2 20240319-1013 123
2 20240319-1014 121
2 20240319-1015 95
2 20240319-1016 194
2 20240319-1017 224
2 20240319-1018 255
2 20240319-1019 352
2 20240319-1020 352
2 20240319-1021 372
2 20240319-1022 382
2 20240319-1023 424
2 20240319-1024 454
2 20240319-1025 431
2 20240319-1026 433
2 20240319-1027 426
2 20240319-1028 427
…………
2 20240319-1048 972
2 20240319-1049 980
2 20240319-1050 967
2 20240319-1051 995
2 20240319-1052 1012
2 20240319-1053 1034
2 20240319-1054 1036
2 20240319-1055 1027
2 20240319-1056 1040
2 20240319-1057 1025
2 20240319-1058 1033
2 20240319-1059 1009
2 20240319-1100 1032
2 20240319-1101 1047
2 20240319-1102 1026
2 20240319-1103 1038
2 20240319-1104 1033
2 20240319-1105 1029
2 20240319-1106 1055
2 20240319-1107 1034
2 20240319-1108 1063
2 20240319-1109 1096
2 20240319-1110 1179
2 20240319-1111 1163
2 20240319-1112 1136
2 20240319-1113 1111
2 20240319-1114 1089
2 20240319-1115 1087
2 20240319-1116 1089
2 20240319-1117 1088
2 20240319-1118 1092
2 20240319-1119 1095
…………
111 rows selected.
由于故障时间内发生在2节点,所以只获取2节点的ash信息,数据库节点2会话从10.11分左右开始激增,11.20左右恢复正常。
查询异常时间段内的等待事件
Enter Search before hours (i.e. 2012-01-01 23:00:00) : 2024-03-19 11:11:00
Enter Search jiange hours (i.e. 123| default 1)) : 1
EVENT TOTAL WAIT_CLASS
-------------------------------------------------- ---------- --------------------
enq: TX - row lock contention 19173 Application
CPU 4003 CPU
Disk file operations I/O 1139 User_IO
enq: TM - contention 1127 Application
BCPU 782 BCPU
row cache lock 60 Concurrency
log file parallel write 57 System_IO
RMAN backup & recovery I/O 54 System_IO
log file sync 38 Commit
DLM cross inst call completion 33 Other
KJC: Wait for msg sends to complete 27 Other
PGA memory operation 25 Other
reliable message 23 Other
oracle thread bootstrap 22 Other
control file sequential read 20 System_IO
JS kgl get object wait 19 Administrative
gc current grant busy 17 Cluster
process diagnostic dump 15 Other
ASM file metadata operation 15 Ot``` language
```her
gc current block busy 15 Cluster
IMR slave acknowledgement msg 14 Other
library cache lock 12 Concurrency
os thread creation 12 Other
rdbms ipc message 11 Idle
db file parallel write 7 System_IO
row cache mutex 6 Concurrency
异常时间段内的等待事件主要为enq: TX - row lock contention,该等待事件通常与表行锁相关。
从ash裸数据查看tx锁阻塞链信息,可以发现在不同时间段内均是由1981会话导致;
查询故障时间段内阻塞链信息
SESSION_ID SQL_ID SAMPLE_TIME CHAIN FINAL_PROGRAM FINAL_BLK FINAL_EVT EVENT
---------- --------------- -------------------- ---------------------------------------- ---------------------------------------- --------- ------------------------------ ----------------------------------------
LVL2_EVENT
------------------------------
248 93xzn8vuqqfut 20240319 102641 ->1981->5->248 oracle@vwatdlorap08 (J000) 1981 ON CPU enq: TX - row lock contention
389 93xzn8vuqqfut 20240319 102641 ->1981->5->389 oracle@vwatdlorap08 (J000) 1981 ON CPU enq: TX - row lock contention
1842 5d6xw0m7hman7 20240319 102641 ->1981->5->1842 oracle@vwatdlorap08 (J000) 1981 ON CPU enq: TX - row lock contention
1848 26xd7u55hmn31 20240319 102641 ->1981->5->1848 oracle@vwatdlorap08 (J000) 1981 ON CPU enq: TX - row lock contention
2670 bgy4dk52kb5bx 20240319 102641 ->1981->5->2670 oracle@vwatdlorap08 (J000) 1981 ON CPU enq: TX - row lock contention
2747 gcvd0vn606km4 20240319 102641 ->1981->5->2747 oracle@vwatdlorap08 (J000) 1981 ON CPU enq: TX - row lock contention
1614 20240319 102641 ->1614 oracle@vwatdlorap09 (MZ00) 1614 ON CPU
1981 dm9pz48khh8c5 20240319 102642 ->1981 oracle@vwatdlorap08 (J000) 1981 ON CPU
5 0magwj2bd6vf5 20240319 102642 ->1981->5 oracle@vwatdlorap08 (J000) 1981 ON CPU enq: TX - row lock contention
enq: TX - row lock contention
248 93xzn8vuqqfut 20240319 102642 ->1981->5->248 oracle@vwatdlorap08 (J000) 1981 ON CPU enq: TX - row lock contention
389 93xzn8vuqqfut 20240319 102642 ->1981->5->389 oracle@vwatdlorap08 (J000) 1981 ON CPU enq: TX - row lock contention
1842 5d6xw0m7hman7 20240319 102642 ->1981->5->1842 oracle@vwatdlorap08 (J000) 1981 ON CPU enq: TX - row lock contention
1848 26xd7u55hmn31 20240319 102642 ->1981->5->1848 oracle@vwatdlorap08 (J000) 1981 ON CPU enq: TX - row lock contention
2670 bgy4dk52kb5bx 20240319 102642 ->1981->5->2670 oracle@vwatdlorap08 (J000) 1981 ON CPU enq: TX - row lock contention
2747 gcvd0vn606km4 20240319 102642 ->1981->5->2747 oracle@vwatdlorap08 (J000) 1981 ON CPU enq: TX - row lock contention
…………
SESSION_ID SQL_ID SAMPLE_TIME CHAIN FINAL_PROGRAM FINAL_BLK FINAL_EVT EVENT
---------- --------------- -------------------- ---------------------------------------- ---------------------------------------- --------- ------------------------------ ------------------------------
LVL2_EVENT
------------------------------
1611 fyd88tk53smq8 20240319 105932 ->1981->5->1611 oracle@vwatdlorap08 (J000) 1981 ON CPU enq: TX - row lock contention
1842 5d6xw0m7hman7 20240319 105932 ->1981->5->1842 oracle@vwatdlorap08 (J000) 1981 ON CPU enq: TX - row lock contention
1848 26xd7u55hmn31 20240319 105932 ->1981->5->1848 oracle@vwatdlorap08 (J000) 1981 ON CPU enq: TX - row lock contention
SESSION_ID SQL_ID SAMPLE_TIME CHAIN FINAL_PROGRAM FINAL_BLK FINAL_EVT EVENT
---------- --------------- -------------------- ---------------------------------------- ---------------------------------------- --------- ------------------------------ ------------------------------
LVL2_EVENT
------------------------------
2369 93xzn8vuqqfut 20240319 105932 ->1981->5->2369 oracle@vwatdlorap08 (J000) 1981 ON CPU enq: TX - row lock contention
2531 93xzn8vuqqfut 20240319 105932 ->1981->5->2531 oracle@vwatdlorap08 (J000) 1981 ON CPU enq: TX - row lock contention
2670 bgy4dk52kb5bx 20240319 105932 ->1981->5->2670 oracle@vwatdlorap08 (J000) 1981 ON CPU enq: TX - row lock contention
查看对应故障时间段内的采样时间点的阻塞链表信息,截取部分时间段的信息。
查看对应被阻塞会话信息,可以看到从10:11开始,1981会话已经开始影响update操作无法正常提交
TIME SESSION_ID SESSION_SERIAL# SQL_OPNAME USER_ID SQL_ID SQL_EXEC_START EVENT TIME_WAITED CURRENT_OBJ# CURRENT_ROW# BLOCKING_SESSION SERIAL BLOCKING_INST_ID
-------------------- ---------- --------------- ---------- ---------- --------------- ------------------- ---------------------------------------- ----------- ------------ ------------ ---------------- ---------- ----------------
20240319-101116 5 28227 UPDATE 117 0magwj2bd6vf5 2024-03-19 10:11:04 enq: TX - row lock contention 0 90535 38 1981 51726 1
20240319-101117 5 28227 UPDATE 117 0magwj2bd6vf5 2024-03-19 10:11:04 enq: TX - row lock contention 0 90535 38 1981 51726 1
20240319-101118 5 28227 UPDATE 117 0magwj2bd6vf5 2024-03-19 10:11:04 enq: TX - row lock contention 0 90535 38 1981 51726 1
20240319-101119 5 28227 UPDATE 117 0magwj2bd6vf5 2024-03-19 10:11:04 enq: TX - row lock contention 0 90535 38 1981 51726 1
20240319-101120 5 28227 UPDATE 117 0magwj2bd6vf5 2024-03-19 10:11:04 enq: TX - row lock contention 0 90535 38 1981 51726 1
20240319-101121 5 28227 UPDATE 117 0magwj2bd6vf5 2024-03-19 10:11:04 enq: TX - row lock contention 0 90535 38 1981 51726 1
20240319-101122 5 28227 UPDATE 117 0magwj2bd6vf5 2024-03-19 10:11:04 enq: TX - row lock contention 0 90535 38 1981 51726 1
20240319-101123 5 28227 UPDATE 117 0magwj2bd6vf5 2024-03-19 10:11:04 enq: TX - row lock contention 0 90535 38 1981 51726 1
20240319-101124 5 28227 UPDATE 117 0magwj2bd6vf5 2024-03-19 10:11:04 enq: TX - row lock contention 0 90535 38 1981 51726 1
20240319-101125 5 28227 UPDATE 117 0magwj2bd6vf5 2024-03-19 10:11:04 enq: TX - row lock contention 0 90535 38 1981 51726 1
20240319-101126 5 28227 UPDATE 117 0magwj2bd6vf5 2024-03-19 10:11:04 enq: TX - row lock contention 0 90535 38 1981 51726 1
.
.
IME SESSION_ID SESSION_SERIAL# SQL_OPNAME USER_ID SQL_ID SQL_EXEC_START EVENT TIME_WAITED CURRENT_OBJ# CURRENT_ROW# BLOCKING_SESSION SERIAL BLOCKING_INST_ID
-------------------- ---------- --------------- ---------- ---------- --------------- ------------------- ---------------------------------------- ----------- ------------ ------------ ---------------- ---------- ----------------
20240319-111955 5 28227 UPDATE 117 0magwj2bd6vf5 2024-03-19 10:11:04 enq: TX - row lock contention 0 90535 38 1981 51726 1
20240319-111956 5 28227 UPDATE 117 0magwj2bd6vf5 2024-03-19 10:11:04 enq: TX - row lock contention 0 90535 38 1981 51726 1
20240319-111957 5 28227 UPDATE 117 0magwj2bd6vf5 2024-03-19 10:11:04 enq: TX - row lock contention 0 90535 38 1981 51726 1
20240319-111958 5 28227 UPDATE 117 0magwj2bd6vf5 2024-03-19 10:11:04 enq: TX - row lock contention 0 90535 38 1981 51726 1
20240319-111959 5 28227 UPDATE 117 0magwj2bd6vf5 2024-03-19 10:11:04 enq: TX - row lock contention 0 90535 38 1981 51726 1
从ash中查询1981会话的相关信息,发现该会话是由job调用,且正在执行递归sql,其中可以发现TOP_LEVEL_SQL_ID 均为0gsgppzm9tgr8。
TIME SESSION_ID SESSION_SERIAL# PROGRAM EVENT USER_ID SQL_ID TOP_LEVEL_SQL_I SQL_EXEC_START TIME_WAITED CURRENT_OBJ# CURRENT_ROW# BLOCKING_SESSION
--------------- ---------- --------------- ------------------------------ ------------------------------ ---------- --------------- --------------- ------------------- ----------- ------------ ------------ ----------------
SERIAL BLOCKING_INST_ID
---------- ----------------
20240319-0930 1981 51726 oracle@test (J000) latch: cache buffers chains 0 gu9y4xpnyra1n 0gsgppzm9tgr8 2024-03-19 09:29:59 8703 91404 0 1608
11652 1
20240319-0942 1981 51726 oracle@ (J000) gc cr block 2-way 0 dm9pz48khh8c5 0gsgppzm9tgr8 2024-03-19 09:42:55 354 96678 0
通过查看视图v$sql信息,可以发现查询0gsgppzm9tgr8其实是在调用对象JD_OEE_CL_MIS,并且2024-03-19 09:22:56发起。
select sql_id, sql_text from v$sql where sql_id = '0gsgppzm9tgr8';
SQL_ID
---------------
SQL_TEXT
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
0gsgppzm9tgr8
DECLARE job BINARY_INTEGER := :job; next_date TIMESTAMP WITH TIME ZONE := :mydate; broken BOOLEAN := FALSE; job_name VARCHAR2(128) := :job_name; job_subname VARCHAR2(128) := :job_subname; job_owner VARCHAR2(128) := :job_owner
; job_start TIMESTAMP WITH TIME ZONE := :job_start; job_scheduled_start TIMESTAMP WITH TIME ZONE := :job_scheduled_start; window_start TIMESTAMP WITH TIME ZONE := :window_start; window_end TIMESTAMP WITH TIME ZONE := :window_e
nd; chain_id VARCHAR2(14) := :chainid; credential_owner VARCHAR2(128) := :credown; credential_name VARCHAR2(128) := :crednam; destination_owner VARCHAR2(128) := :destown; destination_name VARCHAR2(128) := :destnam; job_des
t_id varchar2(14) := :jdestid; log_id number := :log_id; BEGIN JD_OEE_CL_MIS; :mydate := next_date; IF broken THEN :b := 1; ELSE :b := 0; END IF; END;
SELECT to_char(sample_time,'yyyymmdd-hh24mi') time,
2 SESSION_ID,
SESSION_SERIAL#,
3 4 SQL_OPNAME,
5 event,
USER_ID,
SQL_ID,TOP_LEVEL_SQL_ID,
SQL_EXEC_START,
6 7 8 9 TIME_WAITED,
CURRENT_OBJ#,
CURRENT_ROW#,
BLOCKING_SESSION,
10 11 12 13 BLOCKING_SESSION_SERIAL# serial, BLOCKING_INST_ID
FROM gv$active_session_history
WHERE sql_id = '0gsgppzm9tgr8'
14 15 16 ORDER BY 1,2,8;
TIME SESSION_ID SESSION_SERIAL# SQL_OPNAME EVENT USER_ID SQL_ID TOP_LEVEL_SQL_ID SQL_EXEC_START TIME_WAITED CURRENT_OBJ# CURRENT_ROW#
-------------------- ---------- --------------- ---------- ---------------------------------------- ---------- --------------- --------------------------------------- ------------------- ----------- ------------ ------------
BLOCKING_SESSION SERIAL BLOCKING_INST_ID
---------------- ---------- ----------------
20240319-0927 1981 51726 PL/SQL EXE 0 0gsgppzm9tgr8 0gsgppzm9tgr8 2024-03-19 09:22:56 0 91404 0
CUTE
解决方案
手工提交,或者kill该会话,核查存储过程及程序代码逻辑,保证数据提交顺序,避免不必要的行锁等待。
ash 案例二:
经过ash及osw信息,在08.02.28-08.03.28这一分钟内,出现了大量log file sync等待事件,查看osw主机资源可以发现该时间段内,io出现了大量读写延迟,建议核查链路以及磁盘是否正常。
1、查看ash裸数据发现,在08:02-08:03内出现了大量log file sycn等待事件,enq: TX - row lock contention等待事件从ash采样时间段内没有出现明显积压;
> sql
SELECT inst_id,
to_char(sample_time,'yyyymmdd-hh24mi'),
count(*),
event
FROM ASH_20240322
WHERE sample_time >=to_date('20240322-0720','yyyymmdd-hh24mi')
AND sample_time <=to_date('20240322-0830','yyyymmdd-hh24mi')
and inst_id = 1
and event in ('enq: TX - row lock contention','log file sync')
GROUP BY inst_id,
to_char(sample_time,'yyyymmdd-hh24mi'),
event HAVING count(*) > 2
ORDER BY 4, 2;
INST_ID TO_CHAR(SAMPLE_TIME,'YYYYMMDD-HH24MI') COUNT(*) EVENT
---------- --------------------------------------- ---------- ------------------------------
1 20240322-0730 422 enq: TX - row lock contention
1 20240322-0731 396 enq: TX - row lock contention
1 20240322-0732 366 enq: TX - row lock contention
1 20240322-0733 352 enq: TX - row lock contention
1 20240322-0734 444 enq: TX - row lock contention
1 20240322-0735 459 enq: TX - row lock contention
1 20240322-0736 391 enq: TX - row lock contention
1 20240322-0737 376 enq: TX - row lock contention
1 20240322-0738 55 enq: TX - row lock contention
1 20240322-0739 319 enq: TX - row lock contention
1 20240322-0740 205 enq: TX - row lock contention
1 20240322-0741 52 enq: TX - row lock contention
1 20240322-0742 128 enq: TX - row lock contention
1 20240322-0743 60 enq: TX - row lock contention
1 20240322-0744 185 enq: TX - row lock contention
1 20240322-0745 312 enq: TX - row lock contention
1 20240322-0746 413 enq: TX - row lock contention
1 20240322-0747 413 enq: TX - row lock contention
1 20240322-0748 372 enq: TX - row lock contention
1 20240322-0749 278 enq: TX - row lock contention
1 20240322-0750 222 enq: TX - row lock contention
1 20240322-0751 324 enq: TX - row lock contention
1 20240322-0752 54 enq: TX - row lock contention
1 20240322-0753 347 enq: TX - row lock contention
1 20240322-0754 68 enq: TX - row lock contention
1 20240322-0755 325 enq: TX - row lock contention
1 20240322-0756 67 enq: TX - row lock contention
1 20240322-0757 279 enq: TX - row lock contention
1 20240322-0758 438 enq: TX - row lock contention
1 20240322-0759 436 enq: TX - row lock contention
1 20240322-0800 425 enq: TX - row lock contention
1 20240322-0801 438 enq: TX - row lock contention
1 20240322-0802 311 enq: TX - row lock contention
1 20240322-0803 579 enq: TX - row lock contention
1 20240322-0804 370 enq: TX - row lock contention
1 20240322-0805 404 enq: TX - row lock contention
1 20240322-0806 444 enq: TX - row lock contention
1 20240322-0807 466 enq: TX - row lock contention
1 20240322-0808 444 enq: TX - row lock contention
1 20240322-0809 238 enq: TX - row lock contention
1 20240322-0810 404 enq: TX - row lock contention
1 20240322-0811 471 enq: TX - row lock contention
1 20240322-0812 427 enq: TX - row lock contention
1 20240322-0813 371 enq: TX - row lock contention
1 20240322-0814 391 enq: TX - row lock contention
1 20240322-0815 194 enq: TX - row lock contention
1 20240322-0816 56 enq: TX - row lock contention
INST_ID TO_CHAR(SAMPLE_TIME,'YYYYMMDD-HH24MI') COUNT(*) EVENT
---------- --------------------------------------- ---------- ------------------------------
1 20240322-0817 346 enq: TX - row lock contention
1 20240322-0818 402 enq: TX - row lock contention
1 20240322-0819 343 enq: TX - row lock contention
1 20240322-0730 12 log file sync
1 20240322-0731 70 log file sync
1 20240322-0732 12 log file sync
1 20240322-0733 15 log file sync
1 20240322-0734 19 log file sync
1 20240322-0735 18 log file sync
1 20240322-0736 15 log file sync
1 20240322-0737 15 log file sync
1 20240322-0738 9 log file sync
1 20240322-0739 15 log file sync
1 20240322-0740 11 log file sync
1 20240322-0741 13 log file sync
1 20240322-0742 16 log file sync
1 20240322-0743 17 log file sync
1 20240322-0744 19 log file sync
1 20240322-0745 8 log file sync
1 20240322-0746 11 log file sync
1 20240322-0747 19 log file sync
1 20240322-0748 14 log file sync
1 20240322-0749 14 log file sync
1 20240322-0750 19 log file sync
1 20240322-0751 14 log file sync
1 20240322-0752 25 log file sync
1 20240322-0753 9 log file sync
1 20240322-0754 11 log file sync
1 20240322-0755 13 log file sync
1 20240322-0756 15 log file sync
1 20240322-0757 15 log file sync
1 20240322-0758 19 log file sync
1 20240322-0759 7 log file sync
1 20240322-0800 17 log file sync
1 20240322-0801 23 log file sync
1 20240322-0802 5155 log file sync
1 20240322-0803 3028 log file sync
1 20240322-0804 12 log file sync
1 20240322-0805 7 log file sync
1 20240322-0806 13 log file sync
1 20240322-0807 14 log file sync
1 20240322-0808 7 log file sync
1 20240322-0809 10 log file sync
1 20240322-0810 12 log file sync
1 20240322-0811 7 log file sync
1 20240322-0812 8 log file sync
1 20240322-0813 6 log file sync
1 20240322-0814 17 log file sync
1 20240322-0815 14 log file sync
1 20240322-0816 9 log file sync
1 20240322-0817 12 log file sync
1 20240322-0818 11 log file sync
1 20240322-0819 18 log file sync
2、截取08:02-08:03故障时间内部分详细等待事件,我们可以发现log file sync均是由于1节点 1765后台lgw进程log file parallel write阻塞;(其中lv代表阻塞层级,isleaf=1表示阻塞源头)
> sql
select
level lv,
connect_by_isleaf isleaf,
connect_by_iscycle iscycle,
t.sample_time,
t.inst_id,
t.session_id,
t.sql_id,
t.session_type,
t.event,
t.session_state,
t.blocking_inst_id blind,
t.blocking_session blsid,
t.blocking_session_status bss
from ASH_20240322 t
where sample_time >
to_timestamp('2024-03-22 08:01:00',
'yyyy-mm-dd hh24:mi:ss')
and sample_time <
to_timestamp('2024-03-22 08:05:00',
'yyyy-mm-dd hh24:mi:ss')
start with blocking_session is not null
connect by nocycle
prior sample_time = sample_time
/*and ((prior sample_time) - sample_time between interval '-1'
second and interval '1' second)*/
and prior blocking_inst_id = inst_id
and prior blocking_session = session_id
and prior blocking_session_serial# = session_serial#
order siblings by sample_time;
LV ISLEAF ISCYCLE SAMPLE_TIME INST_ID SESSION_ID SQL_ID SESSION_TYPE EVENT SESSION_STATE BLIND BLSID BSS
---------- ---------- ---------- ------------------------- ---------- ---------- --------------- --------------- ------------------------------ --------------- ------ ------ ---------------
2 1 0 22-MAR-24 08.03.19.818 AM 1 1765 BACKGROUND log file parallel write WAITING NO HOLDER
1 0 0 22-MAR-24 08.03.19.818 AM 1 2113 FOREGROUND log file sync WAITING 1 1765 VALID
2 1 0 22-MAR-24 08.03.19.818 AM 1 1765 BACKGROUND log file parallel write WAITING NO HOLDER
1 0 0 22-MAR-24 08.03.19.818 AM 1 2072 FOREGROUND log file sync WAITING 1 1765 VALID
2 1 0 22-MAR-24 08.03.19.818 AM 1 1765 BACKGROUND log file parallel write WAITING NO HOLDER
1 0 0 22-MAR-24 08.03.19.818 AM 1 2066 FOREGROUND log file sync WAITING 1 1765 VALID
2 1 0 22-MAR-24 08.03.19.818 AM 1 1765 BACKGROUND log file parallel write WAITING NO HOLDER
1 0 0 22-MAR-24 08.03.19.818 AM 1 2033 FOREGROUND log file sync WAITING 1 1765 VALID
2 1 0 22-MAR-24 08.03.19.818 AM 1 1765 BACKGROUND log file parallel write WAITING NO HOLDER
1 0 0 22-MAR-24 08.03.19.818 AM 1 2032 FOREGROUND log file sync WAITING 1 1765 VALID
2 1 0 22-MAR-24 08.03.19.818 AM 1 1765 BACKGROUND log file parallel write WAITING NO HOLDER
1 0 0 22-MAR-24 08.03.19.818 AM 1 2023 FOREGROUND log file sync WAITING 1 1765 VALID
2 1 0 22-MAR-24 08.03.19.818 AM 1 1765 BACKGROUND log file parallel write WAITING NO HOLDER
1 0 0 22-MAR-24 08.03.19.818 AM 1 2022 FOREGROUND log file sync WAITING 1 1765 VALID
2 1 0 22-MAR-24 08.03.19.818 AM 1 1765 BACKGROUND log file parallel write WAITING NO HOLDER
1 0 0 22-MAR-24 08.03.19.818 AM 1 1978 FOREGROUND log file sync WAITING 1 1765 VALID
2 1 0 22-MAR-24 08.03.19.818 AM 1 1765 BACKGROUND log file parallel write WAITING NO HOLDER
1 0 0 22-MAR-24 08.03.19.818 AM 1 1976 FOREGROUND log file sync WAITING 1 1765 VALID
2 1 0 22-MAR-24 08.03.19.818 AM 1 1765 BACKGROUND log file parallel write WAITING NO HOLDER
1 0 0 22-MAR-24 08.03.19.818 AM 1 1950 FOREGROUND log file sync WAITING 1 1765 VALID
2 1 0 22-MAR-24 08.03.19.818 AM 1 1765 BACKGROUND log file parallel write WAITING NO HOLDER
1 0 0 22-MAR-24 08.03.19.818 AM 1 1940 FOREGROUND log file sync WAITING 1 1765 VALID
2 1 0 22-MAR-24 08.03.19.818 AM 1 1765 BACKGROUND log file parallel write WAITING NO HOLDER
1 0 0 22-MAR-24 08.03.19.818 AM 1 1939 FOREGROUND log file sync WAITING 1 1765 VALID
2 1 0 22-MAR-24 08.03.19.818 AM 1 1765 BACKGROUND log file parallel write WAITING NO HOLDER
1 0 0 22-MAR-24 08.03.19.818 AM 1 1936 FOREGROUND log file sync WAITING 1 1765 VALID
2 1 0 22-MAR-24 08.03.19.818 AM 1 1765 BACKGROUND log file parallel write WAITING NO HOLDER
1 0 0 22-MAR-24 08.03.19.818 AM 1 1907 FOREGROUND log file sync WAITING 1 1765 VALID
2 1 0 22-MAR-24 08.03.19.818 AM 1 1765 BACKGROUND log file parallel write WAITING NO HOLDER
1 0 0 22-MAR-24 08.03.19.818 AM 1 1898 FOREGROUND log file sync WAITING 1 1765 VALID
2 1 0 22-MAR-24 08.03.19.818 AM 1 1765 BACKGROUND log file parallel write WAITING NO HOLDER
1 0 0 22-MAR-24 08.03.19.818 AM 1 1897 FOREGROUND log file sync WAITING 1 1765 VALID
2 1 0 22-MAR-24 08.03.19.818 AM 1 1765 BACKGROUND log file parallel write WAITING NO HOLDER
1 0 0 22-MAR-24 08.03.19.818 AM 1 1869 FOREGROUND log file sync WAITING 1 1765 VALID
2 1 0 22-MAR-24 08.03.19.818 AM 1 1765 BACKGROUND log file parallel write WAITING NO HOLDER
1 0 0 22-MAR-24 08.03.19.818 AM 1 1852 FOREGROUND log file sync WAITING 1 1765 VALID
2 1 0 22-MAR-24 08.03.19.818 AM 1 1765 BACKGROUND log file parallel write WAITING NO HOLDER
1 0 0 22-MAR-24 08.03.19.818 AM 1 1808 FOREGROUND log file sync WAITING 1 1765 VALID
2 1 0 22-MAR-24 08.03.19.818 AM 1 1765 BACKGROUND log file parallel write WAITING NO HOLDER
1 0 0 22-MAR-24 08.03.19.818 AM 1 1743 FOREGROUND log file sync WAITING 1 1765 VALID
2 1 0 22-MAR-24 08.03.19.818 AM 1 1765 BACKGROUND log file parallel write WAITING NO HOLDER
1 0 0 22-MAR-24 08.03.19.818 AM 1 1740 FOREGROUND log file sync WAITING 1 1765 VALID
2 1 0 22-MAR-24 08.03.19.818 AM 1 1765 BACKGROUND log file parallel write WAITING NO HOLDER
1 0 0 22-MAR-24 08.03.19.818 AM 1 1726 FOREGROUND log file sync WAITING 1 1765 VALID
2 1 0 22-MAR-24 08.03.19.818 AM 1 1765 BACKGROUND log file parallel write WAITING NO HOLDER
1 0 0 22-MAR-24 08.03.19.818 AM 1 1724 FOREGROUND log file sync WAITING 1 1765 VALID
2 1 0 22-MAR-24 08.03.19.818 AM 1 1765 BACKGROUND log file parallel write WAITING NO HOLDER
3、进一步查看08:02-08:03这段时间阻塞情况,可以发现在阻塞最为严重的时刻,每秒阻塞了200+的会话,阻塞相当严重;
SQL>
select t.lv,
t.iscycle,
t.sample_time,
t.inst_id,
t.session_id,
t.sql_id,
t.session_type,
t.event,
t.seq#,
t.session_state,
t.blocking_inst_id blind,
t.blocking_session blsid,
t.blocking_session_status bss,
t.c cnt
from (select t.*,
row_number() over(partition by inst_id, sample_time order by c desc) r
from (select t.*,
count(*) over(partition by inst_id, sample_time, session_id) c,
row_number() over(partition by inst_id, sample_time, session_id order by 1) r1
from (select
level lv,
connect_by_isleaf isleaf,
connect_by_iscycle iscycle,
t.*
from ASH_20240322 t
where sample_time >
to_timestamp('2024-03-22 07:55:00',
'yyyy-mm-dd hh24:mi:ss')
and sample_time <
to_timestamp('2024-03-22 08:10:00',
'yyyy-mm-dd hh24:mi:ss')
start with blocking_session is not null
connect by nocycle
prior sample_time = sample_time
/*and ((prior sample_time) - sample_time between interval '-1'
second and interval '1' second)*/
and prior blocking_inst_id = inst_id
and prior blocking_session = session_id
and prior
blocking_session_serial# = session_serial#) t
where t.isleaf = 1) t
where r1 = 1) t
where session_id = 1765 and r < 3
order by sample_time, r;
LV ISCYCLE SAMPLE_TIME INST_ID SESSION_ID SQL_ID SESSION_TYPE EVENT SEQ# SESSION_STATE BLIND BLSID BSS CNT
---------- ---------- ------------------------- ---------- ---------- --------------- --------------- ------------------------------ ---------- --------------- ------ ------ ------------------------- ----------
2 0 22-MAR-24 08.02.08.745 AM 1765 BACKGROUND log file parallel write 37197 WAITING NO HOLDER 3
2 0 22-MAR-24 08.02.20.757 AM 1765 BACKGROUND 2138 ON CPU NOT IN WAIT 1
2 0 22-MAR-24 08.02.21.758 AM 1765 BACKGROUND log file parallel write 4338 WAITING NO HOLDER 1
2 0 22-MAR-24 08.02.28.765 AM 1765 BACKGROUND log file parallel write 19057 WAITING NO HOLDER 10
2 0 22-MAR-24 08.02.29.766 AM 1765 BACKGROUND log file parallel write 19060 WAITING NO HOLDER 52
2 0 22-MAR-24 08.02.30.767 AM 1765 BACKGROUND log file parallel write 19064 WAITING NO HOLDER 57
2 0 22-MAR-24 08.02.31.768 AM 1 1765 BACKGROUND log file parallel write 19066 WAITING NO HOLDER 95
2 0 22-MAR-24 08.02.32.769 AM 1765 BACKGROUND log file parallel write 19069 WAITING NO HOLDER 88
2 0 22-MAR-24 08.02.33.770 AM 1765 BACKGROUND log file parallel write 19073 WAITING NO HOLDER 135
2 0 22-MAR-24 08.02.34.771 AM 1765 BACKGROUND log file parallel write 19074 WAITING NO HOLDER 184
2 0 22-MAR-24 08.02.35.772 AM 1765 BACKGROUND log file parallel write 19075 WAITING NO HOLDER 208
2 0 22-MAR-24 08.02.36.774 AM 1765 BACKGROUND log file parallel write 19076 WAITING NO HOLDER 197
2 0 22-MAR-24 08.02.37.775 AM 1765 BACKGROUND log file parallel write 19077 WAITING NO HOLDER 186
2 0 22-MAR-24 08.02.38.776 AM 1765 BACKGROUND log file parallel write 19080 WAITING NO HOLDER 161
2 0 22-MAR-24 08.02.39.777 AM 1765 BACKGROUND log file parallel write 19901 WAITING NO HOLDER 190
2 0 22-MAR-24 08.02.40.778 AM 1765 BACKGROUND log file parallel write 19901 WAITING NO HOLDER 196
2 0 22-MAR-24 08.02.41.779 AM 1765 BACKGROUND log file parallel write 19902 WAITING NO HOLDER 195
2 0 22-MAR-24 08.02.42.780 AM 1765 BACKGROUND log file parallel write 19902 WAITING NO HOLDER 198
2 0 22-MAR-24 08.02.43.781 AM 1765 BACKGROUND log file parallel write 19903 WAITING NO HOLDER 201
2 0 22-MAR-24 08.02.44.782 AM 1765 BACKGROUND log file parallel write 19903 WAITING NO HOLDER 204
2 0 22-MAR-24 08.02.45.783 AM 1765 BACKGROUND log file parallel write 19904 WAITING NO HOLDER 202
2 0 22-MAR-24 08.02.46.784 AM 1765 BACKGROUND log file parallel write 19905 WAITING NO HOLDER 199
2 0 22-MAR-24 08.02.47.785 AM 1765 BACKGROUND log file parallel write 19907 WAITING NO HOLDER 177
2 0 22-MAR-24 08.02.48.787 AM 1765 BACKGROUND log file parallel write 19909 WAITING NO HOLDER 166
2 0 22-MAR-24 08.02.49.788 AM 1765 BACKGROUND log file parallel write 19911 WAITING NO HOLDER 168
2 0 22-MAR-24 08.02.50.789 AM 1765 BACKGROUND log file parallel write 19913 WAITING NO HOLDER 172
2 0 22-MAR-24 08.02.51.790 AM 1765 BACKGROUND log file parallel write 19916 WAITING NO HOLDER 160
2 0 22-MAR-24 08.02.52.791 AM 1765 BACKGROUND log file parallel write 19918 WAITING NO HOLDER 26
2 0 22-MAR-24 08.02.53.792 AM 1765 BACKGROUND log file parallel write 19920 WAITING NO HOLDER 186
2 0 22-MAR-24 08.02.54.793 AM 1765 BACKGROUND log file parallel write 19923 WAITING NO HOLDER 188
2 0 22-MAR-24 08.02.55.794 AM 1765 BACKGROUND log file parallel write 19924 WAITING NO HOLDER 190
2 0 22-MAR-24 08.02.56.795 AM 1765 BACKGROUND log file parallel write 19927 WAITING NO HOLDER 192
2 0 22-MAR-24 08.02.57.796 AM 1765 BACKGROUND log file parallel write 19933 WAITING NO HOLDER 202
2 0 22-MAR-24 08.02.58.797 AM 1765 BACKGROUND log file parallel write 19956 WAITING NO HOLDER 194
2 0 22-MAR-24 08.02.59.798 AM 1765 BACKGROUND log file parallel write 19960 WAITING NO HOLDER 200
2 0 22-MAR-24 08.03.00.799 AM 1765 BACKGROUND log file parallel write 19962 WAITING NO HOLDER 201
2 0 22-MAR-24 08.03.01.800 AM 1765 BACKGROUND log file parallel write 19967 WAITING NO HOLDER 205
2 0 22-MAR-24 08.03.02.801 AM 1765 BACKGROUND log file parallel write 19968 WAITING NO HOLDER 212
2 0 22-MAR-24 08.03.03.802 AM 1765 BACKGROUND control file parallel write 19982 WAITING UNKNOWN 209
2 0 22-MAR-24 08.03.04.803 AM 1765 BACKGROUND log file parallel write 20263 WAITING NO HOLDER 228
2 0 22-MAR-24 08.03.05.804 AM 1765 BACKGROUND log file parallel write 20263 WAITING NO HOLDER 236
2 0 22-MAR-24 08.03.06.805 AM 1765 BACKGROUND log file parallel write 20265 WAITING NO HOLDER 212
2 0 22-MAR-24 08.03.07.806 AM 1765 BACKGROUND log file parallel write 20266 WAITING NO HOLDER 211
2 0 22-MAR-24 08.03.08.807 AM 1765 BACKGROUND log file parallel write 20267 WAITING NO HOLDER 216
2 0 22-MAR-24 08.03.09.808 AM 1765 BACKGROUND log file parallel write 20270 WAITING NO HOLDER 192
2 0 22-MAR-24 08.03.10.809 AM 1765 BACKGROUND log file parallel write 20274 WAITING NO HOLDER 189
2 0 22-MAR-24 08.03.11.810 AM 1765 BACKGROUND log file parallel write 20278 WAITING NO HOLDER 192
2 0 22-MAR-24 08.03.12.811 AM 1765 BACKGROUND log file parallel write 20379 WAITING NO HOLDER 174
2 0 22-MAR-24 08.03.13.812 AM 1765 BACKGROUND log file parallel write 20384 WAITING NO HOLDER 185
2 0 22-MAR-24 08.03.14.813 AM 1765 BACKGROUND log file parallel write 21577 WAITING NO HOLDER 51
2 0 22-MAR-24 08.03.15.814 AM 1765 BACKGROUND log file parallel write 21935 WAITING NO HOLDER 180
2 0 22-MAR-24 08.03.16.815 AM 1765 BACKGROUND log file parallel write 22109 WAITING NO HOLDER 129
2 0 22-MAR-24 08.03.17.816 AM 1765 BACKGROUND log file parallel write 22178 WAITING NO HOLDER 11
2 0 22-MAR-24 08.03.18.817 AM 1 1765 BACKGROUND log file parallel write 22434 WAITING NO HOLDER 153
2 0 22-MAR-24 08.03.19.818 AM 1765 BACKGROUND log file parallel write 22666 WAITING NO HOLDER 135
2 0 22-MAR-24 08.03.20.819 AM 1765 BACKGROUND log file parallel write 23447 WAITING NO HOLDER 7
2 0 22-MAR-24 08.03.22.821 AM 1765 BACKGROUND log file parallel write 29680 WAITING NO HOLDER 6
2 0 22-MAR-24 08.03.23.821 AM 1765 BACKGROUND log file parallel write 30602 WAITING NO HOLDER 6
2 0 22-MAR-24 08.03.24.822 AM 1765 BACKGROUND log file parallel write 31405 WAITING NO HOLDER 16
2 0 22-MAR-24 08.03.28.826 AM 1765 BACKGROUND log file parallel write 36412 WAITING NO HOLDER 34
2 0 22-MAR-24 08.03.48.845 AM 1765 BACKGROUND log file parallel write 12062 WAITING NO HOLDER 1
2 0 22-MAR-24 08.04.08.865 AM 1765 BACKGROUND log file parallel write 51481 WAITING NO HOLDER 1
4、接着我们分析osw信息,查看主机io资源,可以发现在故障时间内dm共享磁盘出现了大量读写延迟,部分磁盘甚至达到了磁盘性能上线,其他时间内均未发现;建议排查io链路及磁盘情况;
zzz ***Fri Mar 22 08:02:24 CST 2024
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
dm-0 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
dm-1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
dm-2 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
dm-3 0.00 0.00 1.98 1.98 0.99 127.23 64.75 0.00 0.25 0.00 0.50 0.25 0.10
dm-4 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
dm-5 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
dm-6 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
dm-7 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
dm-8 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
dm-9 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
dm-10 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
dm-11 0.00 260.40 1.98 568.32 31.68 9703.96 34.14 0.13 0.23 0.00 0.23 0.23 13.27
dm-12 0.00 0.00 1.98 0.99 0.99 0.50 1.00 0.00 0.00 0.00 0.00 0.00 0.00
dm-13 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
dm-14 0.00 0.00 1.98 1.98 0.99 127.23 64.75 0.00 0.25 0.50 0.00 0.25 0.10
dm-15 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
dm-16 0.00 260.40 0.00 568.32 0.00 9703.96 34.15 0.14 0.25 0.00 0.25 0.25 14.16
dm-17 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
dm-18 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
dm-19 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
dm-20 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
dm-21 0.00 0.00 0.00 0.99 0.00 3.96 8.00 0.00 0.00 0.00 0.00 0.00 0.00
dm-22 0.00 0.00 0.00 3.96 0.00 15.84 8.00 0.00 0.00 0.00 0.00 0.00 0.00
dm-23 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
dm-24 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
dm-25 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
dm-26 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
dm-27 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
dm-28 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
dm-29 0.00 0.00 0.00 0.99 0.00 31.68 64.00 0.00 0.00 0.00 0.00 0.00 0.00
dm-30 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
dm-31 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
dm-32 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
dm-33 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
dm-34 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
zzz ***Fri Mar 22 08:02:29 CST 2024
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
dm-0 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
dm-1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
dm-2 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
dm-3 0.00 0.00 2.00 2.00 1.00 148.50 74.75 0.12 63.50 0.00 127.00 30.00 12.00
dm-4 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
dm-5 0.00 0.00 1.00 2.00 8.00 136.00 96.00 0.28 93.00 24.00 127.50 93.00 27.90
dm-6 0.00 0.00 0.00 4.00 0.00 240.00 120.00 0.42 106.00 0.00 106.00 106.00 42.40
dm-7 0.00 0.00 0.00 5.00 0.00 280.00 112.00 0.28 103.40 0.00 103.40 28.80 14.40
dm-8 0.00 0.00 0.00 7.00 0.00 656.00 187.43 0.38 0.29 0.00 0.29 53.86 37.70
dm-9 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
dm-10 0.00 0.00 2.00 5.00 8.50 280.00 82.43 0.39 66.43 11.00 88.60 45.00 31.50
dm-11 0.00 13.00 3.00 4.00 80.00 2909.00 854.00 1.03 165.57 13.00 280.00 142.86 100.00
dm-12 0.00 0.00 2.00 2.00 1.00 4.50 2.75 0.21 64.75 0.50 129.00 53.50 21.40
dm-13 0.00 0.00 0.00 14.00 0.00 632.00 90.29 0.24 17.21 0.00 17.21 17.14 24.00
dm-14 0.00 0.00 2.00 1.00 1.00 0.50 1.00 0.15 51.00 0.00 153.00 51.00 15.30
dm-15 0.00 0.00 3.00 5.00 24.00 404.00 107.00 0.43 63.00 89.00 47.40 54.12 43.30
dm-16 0.00 13.00 0.00 5.00 0.00 3208.50 1283.40 0.26 52.80 0.00 52.80 51.60 25.80
dm-17 0.00 0.00 0.00 7.00 0.00 352.00 100.57 0.77 97.14 0.00 97.14 99.86 69.90
dm-18 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
dm-19 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
dm-20 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
dm-21 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
dm-22 0.00 0.00 0.00 4.00 0.00 16.00 8.00 0.00 0.00 0.00 0.00 0.00 0.00
dm-23 0.00 0.00 1.00 5.00 8.00 288.00 98.67 0.23 37.83 30.00 39.40 37.67 22.60
dm-24 0.00 0.00 0.00 1.00 0.00 104.00 208.00 0.18 183.00 0.00 183.00 183.00 18.30
dm-25 0.00 0.00 0.00 7.00 0.00 592.00 169.14 0.52 73.57 0.00 73.57 56.29 39.40
dm-26 0.00 0.00 0.00 3.00 0.00 64.00 42.67 0.24 0.00 0.00 0.00 79.67 23.90
dm-27 0.00 0.00 1.00 4.00 8.00 48.00 22.40 0.37 74.40 148.00 56.00 74.20 37.10
dm-28 0.00 0.00 0.00 7.00 0.00 472.00 134.86 0.68 117.14 0.00 117.14 52.71 36.90
dm-29 0.00 0.00 2.00 7.00 16.00 656.00 149.33 0.56 43.33 17.50 50.71 47.89 43.10
dm-30 0.00 0.00 0.00 11.00 0.00 904.00 164.36 1.03 61.91 0.00 61.91 55.27 60.80
dm-31 0.00 0.00 0.00 4.00 0.00 280.00 140.00 0.33 0.00 0.00 0.00 83.00 33.20
dm-32 0.00 0.00 0.00 7.00 0.00 424.00 121.14 0.52 74.14 0.00 74.14 74.00 51.80
dm-33 0.00 0.00 1.00 4.00 8.00 272.00 112.00 0.13 53.20 13.00 63.25 25.60 12.80
dm-34 0.00 0.00 1.00 6.00 8.00 296.00 86.86 0.47 74.29 516.00 0.67 66.14 46.30
…………
zzz ***Fri Mar 22 08:03:09 CST 2024
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
dm-0 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
dm-1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
dm-2 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
dm-3 0.00 0.00 4.00 4.00 2.00 168.50 42.62 0.17 4.50 9.00 0.00 16.75 13.40
dm-4 0.00 0.00 19.00 39.00 76.00 38916.00 1344.55 1.27 34.53 0.11 51.31 3.07 17.80
dm-5 0.00 0.00 0.00 8.00 0.00 600.00 150.00 0.53 65.75 0.00 65.75 63.75 51.00
dm-6 0.00 0.00 0.00 4.00 0.00 248.00 124.00 0.27 66.50 0.00 66.50 66.25 26.50
dm-7 0.00 0.00 0.00 6.00 0.00 208.00 69.33 0.64 89.83 0.00 89.83 76.33 45.80
dm-8 0.00 0.00 3.00 14.00 24.00 1048.00 126.12 0.47 27.53 5.00 32.36 15.18 25.80
dm-9 0.00 0.00 0.00 30.00 0.00 28680.00 1912.00 7.77 249.10 0.00 249.10 33.37 100.10
dm-10 0.00 0.00 0.00 11.00 0.00 496.00 90.18 0.35 40.64 0.00 40.64 31.82 35.00
dm-11 0.00 9.00 76.00 4.00 54424.00 1667.50 1402.29 0.70 4.19 1.68 51.75 6.21 49.70
dm-12 0.00 0.00 22.00 3.00 74.00 33.00 8.56 0.00 0.08 0.09 0.00 0.08 0.20
dm-13 0.00 0.00 0.00 8.00 0.00 268.00 67.00 0.68 96.88 0.00 96.88 85.00 68.00
dm-14 0.00 0.00 2.00 4.00 1.00 140.50 47.17 0.70 117.17 0.00 175.75 67.83 40.70
dm-15 0.00 0.00 0.00 13.00 0.00 756.00 116.31 0.03 9.54 0.00 9.54 1.92 2.50
dm-16 0.00 9.00 18.00 5.00 72.00 4063.00 359.57 1.00 44.00 0.11 202.00 43.43 99.90
dm-17 0.00 0.00 0.00 8.00 0.00 424.00 106.00 0.09 0.62 0.00 0.62 11.25 9.00
dm-18 0.00 0.00 0.00 16.00 0.00 401.50 50.19 0.00 0.00 0.00 0.00 0.00 0.00
dm-19 0.00 0.00 0.00 5.00 0.00 52.50 21.00 0.00 0.00 0.00 0.00 0.00 0.00
dm-20 0.00 0.00 0.00 5.00 0.00 292.50 117.00 0.00 0.00 0.00 0.00 0.00 0.00
dm-21 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
dm-22 0.00 0.00 0.00 42.00 0.00 514.00 24.48 0.00 0.00 0.00 0.00 0.00 0.00
dm-23 0.00 0.00 0.00 9.00 0.00 424.00 94.22 0.23 25.89 0.00 25.89 25.89 23.30
dm-24 0.00 0.00 0.00 5.00 0.00 528.00 211.20 0.66 131.00 0.00 131.00 99.00 49.50
dm-25 0.00 0.00 18.00 7.00 72.00 512.00 46.72 0.00 0.16 0.11 0.29 0.16 0.40
dm-26 0.00 0.00 0.00 7.00 0.00 184.00 52.57 0.42 69.29 0.00 69.29 40.43 28.30
dm-27 0.00 0.00 20.00 9.00 80.50 552.00 43.62 0.02 0.59 0.65 0.44 0.59 1.70
dm-28 0.00 0.00 0.00 9.00 0.00 688.00 152.89 0.86 87.00 0.00 87.00 85.89 77.30
dm-29 0.00 0.00 0.00 13.00 0.00 1088.00 167.38 0.70 49.15 0.00 49.15 32.46 42.20
dm-30 0.00 0.00 0.00 8.00 0.00 896.00 224.00 0.39 48.38 0.00 48.38 48.38 38.70
dm-31 0.00 0.00 0.00 8.00 0.00 592.00 148.00 0.13 25.62 0.00 25.62 16.75 13.40
dm-32 0.00 0.00 0.00 12.00 0.00 528.00 88.00 0.56 24.50 0.00 24.50 31.83 38.20
dm-33 0.00 0.00 0.00 9.00 0.00 200.00 44.44 0.68 75.33 0.00 75.33 75.33 67.80
dm-34 0.00 0.00 0.00 4.00 0.00 56.00 28.00 0.09 0.25 0.00 0.25 23.25 9.30