gv$active_session_history会每秒钟将数据库所有节点的Active Session采样一次,而dba_hist_active_sess_history则会将gv$active_session_history里的数据每10秒采样一次并持久化保存。

基于这个特征,我们可以通过分析dba_hist_active_sess_history的Session采样情况,来定位问题发生的准确时间范围,并且可以观察每个采样点的top event和top holder。


一些容易误解含义和需要关注的列说明

SEQ#

sid和serial#相同的event并不代表是同一个event等待,有时候查看等待事件需要关注这个列,这列值相同代表相同的event上的等待。

SESSION_STATE

waiting代表等待,此时结合event才可以说明等待在这个等待事件上,否则则是on cpu。这个问题在v$session上会经常出现,v$session.state不是waiting时,event代表上个已经经历过的等待,需要注意区分。

WAIT_TIME

当等于0时,表示当前在等待。当这个值是非0值时,代表上个event的等待时间。

TIME_WAITED

结合session_state, 当session_states值是waiting时,代表花在等待上的时间,单位是microseconds,如果采样的数据不止一行,则在最后一行统计累积时间。

BLOCKING_SESSION、BLOCKING_INST_ID

阻塞者会话和阻塞者实例,对于有些等待这个值是不准的,还是要根据具体的P1和P2值的含义去计算

TOP_LEVEL_SQL_ID

top level调用,有时候可以根据这个列查出是什么存储过程中调用的SQL。

找到ASH活动会话最高的SAMPLE_ID

select sample_id,sample_time,count(*)  from v$active_session_history awhere sample_time between to_date('202303231334','yyyymmddhh24miss') and to_date('202303231336','yyyymmddhh24miss') group by sample_id,sample_time order by 3 desc;

查看TOP 等待和TOP SQL

select event,count(*)  from v$active_session_history awhere sample_id in (235564310) group by event order by 2 desc;
select sql_id,count(*)  from v$active_session_history awhere sample_id in (235564310) group by sql_id--,sample_time--,sql_exec_start order by 2 desc;

和昨天对比,TOP SQL执行量是否突变

select trunc(sample_time,'mi'),sql_id,count(*) from v$active_session_historywhere (sample_time between to_date('202303221300','yyyymmddhh24mi') and to_date('202303221340','yyyymmddhh24mi') or sample_time between to_date('202303231300','yyyymmddhh24mi') and to_date('202303231340','yyyymmddhh24mi'))and sql_id='90xxxxxxxxxx'group by sql_id,trunc(sample_time,'mi')order by 1;

看执行时间是否异常,是不是绑定变量的影响【查看一条SQL慢是否和绑定变量有关系】

隔五分钟执行两次,记录结果

select sysdate,sql_id,elapsed_time,buffer_gets,executions from v$sqlstats a where sql_id='90xxxxxxxxxx';
SYSDATE                SQL_ID            ELAPSED_TIME    BUFFER_GETS      EXECUTIONS
2023/3/28 10:31:18    90xxxxxxxxxx       768988772      113364669         1054100
2023/3/28 10:36:34    90xxxxxxxxxx       793996070      117662867         1092692
select (1092692-1054100) as excutions,round((117662867-113364669)/(1092692-1054100)) as per_gets,round((793996070-768988772)/(1092692-1054100)) as per_time from dual;EXCUTIONS    PER_GETS    PER_TIME          38592             111                648

和ASH的对比,如果和ASH慢的时间相差很大,说明是和绑定变量有关系

select * from (select sql_id,max(sample_time),max(sample_time - SQL_EXEC_START) as duration,sql_exec_id,sql_plan_hash_value,sql_opname  from v$active_session_history awhere sample_time between to_date('202304171441','yyyymmddhh24miss') and to_date('202304171459','yyyymmddhh24miss')   and sql_id in ('90xxxxxxxxxx')  group by sql_id,sql_exec_id,sql_plan_hash_value,sql_opname  ) where duration is not null  order by 2


创建ASH临时分析表

依据gv$active_session_history

create table m_ash_0728 tablespace users as
select *
  from gv$active_session_history
 where sample_time between
       to_timestamp('2023/07/28 13:00', 'yyyy/mm/dd hh24:mi') and
       to_timestamp('2023/07/28 15:00', 'yyyy/mm/dd hh24:mi');

 create table m_sqltext tablespace users as select * from gv$sql;

依据dba_hist_active_sess_history

create table m_ash_0728 tablespace users as
select *
  from dba_hist_active_sess_history
 where sample_time between
       to_timestamp('2023/07/28 13:00', 'yyyy/mm/dd hh24:mi') and
       to_timestamp('2023/07/28 15:00', 'yyyy/mm/dd hh24:mi');

create table m_sqltext tablespace users as select * from dba_hist_sqltext;

导出

exp userid=\'/ as sysdba\' file=ashdata.dmp tables='m_ash_0728','m_sqltext'

导入

imp \'/ as sysdba\' file=ashdata.dmp log=ashdata.log full=y

常用确认问题流程(干货)

1.查看ASH时间范围
set line 160 pages 999
col min_time for a25
col max_time for a25
alter session set nls_timestamp_format='yyyy-mm-dd hh24:mi:ss.ff';

select instance_number,
       min(sample_time) min_time,
       max(sample_time) max_time,
       count(*) session_count
  from (select *
          from dba_hist_active_sess_history
         where sample_time between
               to_timestamp('2023/06/17 09:00', 'yyyy/mm/dd hh24:mi') and
               to_timestamp('2023/06/17 12:00', 'yyyy/mm/dd hh24:mi'))
 group by instance_number
 order by instance_number;
2.确认问题发生的精确时间范围
set line 160 pages 999
col sample_time for a25
alter session set nls_timestamp_format='yyyy-mm-dd hh24:mi:ss.ff';

select instance_number, sample_id, sample_time, count(*) session_count
  from (select *
          from dba_hist_active_sess_history
         where sample_time between
               to_timestamp('2023/06/17 09:00', 'yyyy/mm/dd hh24:mi') and
               to_timestamp('2023/06/17 12:00', 'yyyy/mm/dd hh24:mi')) t
 group by instance_number, sample_id, sample_time
 order by instance_number, sample_time;
3.确定每个采样点的 top n event
Select t.sample_id,
       t.sample_time,
       t.instance_number,
       t.event,
       t.session_state,
       t.c session_count
  From (Select t.*,
               rank() over(Partition By instance_number, sample_time Order By c Desc) r
          From (Select /*+ parallel(8) */
                 t.*,
                 Count(*) over(Partition By instance_number, sample_time, event) c,
                 row_number() over(Partition By instance_number, sample_time, event Order By 1) r1
                  From (select *
                          from dba_hist_active_sess_history
                         where sample_time between
                               to_timestamp('2023/06/17 09:00',
                                            'yyyy/mm/dd hh24:mi') and
                               to_timestamp('2023/06/17 12:00',
                                            'yyyy/mm/dd hh24:mi')) t
                 Where sample_time >
                       to_timestamp('2023/06/17 01:25:01',
                                    'yyyy-mm-dd hh24:mi:ss')
                      
                   And sample_time <
                      
                       to_timestamp('2023/06/17 12:42:05',
                                    'yyyy-mm-dd hh24:mi:ss')) t
         Where r1 = 1) t
 Where r < 5
 Order By instance_number, sample_time, r;

ASH_故障排查

  1. 观察每个采样点的等待链
Select /*+parallel(8)*/
 Level                     lv,
 connect_by_isleaf         isleaf,
 connect_by_iscycle        iscycle,
 t.sample_id,
 t.sample_time,
 t.instance_number,
 t.session_id,
 t.sql_id,
 t.session_type,
 t.event,
 t.session_state,
 t.blocking_inst_id,
 t.blocking_session,
 t.blocking_session_status
  From (select *
          from dba_hist_active_sess_history
         where sample_time between
               to_timestamp('2023/06/17 09:00', 'yyyy/mm/dd hh24:mi') and
               to_timestamp('2023/06/17 12:00', 'yyyy/mm/dd hh24:mi')) t
 Where sample_time >
       to_timestamp('2023-06-17 06.02.08', 'yyyy-mm-dd hh24:mi:ss')
   And sample_time <
       to_timestamp('2023-06-17 22.32.31', 'yyyy-mm-dd hh24:mi:ss')
 Start With blocking_session Is Not Null
Connect By nocycle
 Prior sample_time = sample_time
          /*and ((prior sample_time) - sample_time between interval '-1'
          second and interval '1' second)*/
       And Prior blocking_inst_id = instance_number
       And Prior blocking_session = session_id
       And Prior blocking_session_serial# = session_serial#
 Order Siblings By sample_time;
  1. 基于上一步的原理来找出每个采样点的最终top holder
Select t.lv,
       t.iscycle,
       t.sample_id,
       t.sample_time,
       t.instance_number,
       t.session_id,
       t.sql_id,
       t.session_type,
       t.event,
       t.seq#,
       t.session_state,
       t.blocking_inst_id,
       t.blocking_session,
       t.blocking_session_status,
       t.c blocking_session_count
  From (Select t.*,
               row_number() over(Partition By instance_number, sample_time Order By c Desc) r
          From (Select t.*,
                       Count(*) over(Partition By instance_number, sample_time, session_id) c,
                       row_number() over(Partition By instance_number, sample_time, session_id Order By 1) r1
                  From (Select /*+ parallel(8) */
                         Level              lv,
                         connect_by_isleaf  isleaf,
                         connect_by_iscycle iscycle,
                         t.*
                          From (select *
                                  from dba_hist_active_sess_history
                                 where sample_time between
                                       to_timestamp('2023/06/17 09:00',
                                                    'yyyy/mm/dd hh24:mi') and
                                       to_timestamp('2023/06/17 12:00',
                                                    'yyyy/mm/dd hh24:mi')) t
                         Where sample_time >
                               to_timestamp('2023-06-17 02.02.08',
                                            'yyyy-mm-dd hh24:mi:ss')
                           And sample_time <
                               to_timestamp('2023-06-17 22.32.31',
                                            'yyyy-mm-dd hh24:mi:ss')
                         Start With blocking_session Is Not Null
                        Connect By nocycle
                         Prior sample_time = sample_time
                                  /*and ((prior sample_time) - sample_time between interval '-1'
                                  second and interval '1' second)*/
                               And Prior blocking_inst_id = instance_number
                               And Prior blocking_session = session_id
                               And Prior
                                    blocking_session_serial# = session_serial#) t
                 Where t.isleaf = 1) t
         Where r1 = 1) t
 Where r < 5
 Order By sample_time, r;
  1. 整体一个时间段的等待链
With ash As
 (Select /*+ materialize*/
   *
    From (select *
            from dba_hist_active_sess_history
           where sample_time between
                 to_timestamp('2023/06/17 09:00', 'yyyy/mm/dd hh24:mi') and
                 to_timestamp('2023/06/17 12:00', 'yyyy/mm/dd hh24:mi'))
   Where sample_time >
         to_timestamp('2023-06-17 02:02:00', 'yyyy-mm-dd hh24:mi:ss')
     And sample_time <
         to_timestamp('2023-07-22 22:35:00', 'yyyy-mm-dd hh24:mi:ss')),
chains As
 (Select instance_number,
         session_id,
         Level lvl,
         sys_connect_by_path('inst ' || instance_number || ' ' || sql_id || ' ' ||
                             event,
                             ' -> ') path,
         connect_by_isleaf isleaf
    From ash
   Start With event = 'log file sync'
  Connect By nocycle(Prior blocking_session = session_id
                 And Prior blocking_session_serial# = session_serial#
                 And Prior blocking_inst_id = instance_number
                 And Prior sample_id = sample_id
                     /*And ((Prior sample_time) - sample_time Between Interval '-1'
                     Second And Interval '1' Second)*/))
Select instance_number,
       lpad(round(ratio_to_report(Count(*)) over() * 100) || '%', 5, ' ') "%This",
       Count(*) samples,
       path
  From chains
 Where isleaf = 1
 Group By instance_number, path
 Order By samples desc;
  • 举例
    基于ash查询enq: TX - row lock contention(类似队列锁性质的都可以用这个方法)
  1. 查看snap_id
select snap_id,
       to_char(begin_interval_time, 'yyyymmdd hh24:mi:ss') begin_interval_time
  from wrm$_snapshot
 where begin_interval_time between to_date('20230601 19', 'yyyymmdd hh24') and
       to_date('20230630 20', 'yyyymmdd hh24')
   and instance_number = 1
 order by 2;
  1. 根据snap_id, 结果按时间排序
select to_char(sample_time, 'yyyymmdd hh24:mi:ss'), event, count(*)
  from dba_hist_active_sess_history
 where snap_id = 188
   and wait_class <> 'Idle'
   and event = 'enq: TX - row lock contention'
   and instance_number = 2
 group by to_char(sample_time, 'yyyymmdd hh24:mi:ss'), event
 order by 1;
  1. 代入上一步的时间点,查询blocking_session并按count排序
select blocking_session, count(*), event
  from dba_hist_active_sess_history
 where snap_id = 188
   and to_char(sample_time, 'yyyymmdd hh24:mi:ss') = '20230601 12:00:06'
   and wait_class <> 'Idle'
   and event = 'enq: TX - row lock contention'
 group by blocking_session, event
 order by 2;
  1. 代入查到的sid,看在snap范围的blocking_session经历的event
select session_id,
       user_id,
       sql_id,
       blocking_session,
       event,
       to_char(sample_time, 'yyyymmdd hh24:mi:ss')
  from dba_hist_active_sess_history
 where snap_id = 188
   and to_char(sample_time, 'yyyymmdd hh24:mi:ss') <= '20230601 12:00:06'
   and to_char(sample_time, 'yyyymmdd hh24:mi:ss') >= '20230630 11:00:26'
   and wait_class <> 'Idle'
   and instance_number = 1
   and session_id = 2622;
  1. 发生阻塞的主要SQL
with tmp as
 (select /*+ materialize */
  distinct sql_id sqlid
    from dba_hist_active_sess_history
   where snap_id = 188
     and to_char(sample_time, 'yyyymmdd hh24:mi:ss') <= '20230601 08:43:45'
     and to_char(sample_time, 'yyyymmdd hh24:mi:ss') >= '20230630 08:04:39'
     and wait_class <> 'Idle'
     and instance_number = 1
     and event = 'enq: TX - row lock contention')
select distinct parsing_schema_name, sql_id, sql_text
  from v$sql
 where sql_id in (select sqlid from tmp);


ash 案例一:

问题概述

数据库行锁未提交导致大量会话阻塞,手动kill会话之后,恢复正常。

问题分析

从ash裸数据分析,可以发现从10:11开始到11:19会话开始逐渐积压

INST_ID TO_CHAR(SAMPLE_TIME,'YYYYMMDD-HH24MI')    COUNT(*)
---------- --------------------------------------- ----------
         2 20240319-1000                                   14
         2 20240319-1001                                   26
         2 20240319-1002                                   21
         2 20240319-1003                                   36
         2 20240319-1004                                   28
         2 20240319-1005                                   14
         2 20240319-1006                                   13
         2 20240319-1007                                   32
         2 20240319-1008                                   39
         2 20240319-1009                                   60
         2 20240319-1010                                   98
         2 20240319-1011                                  141
         2 20240319-1012                                  206
         2 20240319-1013                                  123
         2 20240319-1014                                  121
         2 20240319-1015                                   95
         2 20240319-1016                                  194
         2 20240319-1017                                  224
         2 20240319-1018                                  255
         2 20240319-1019                                  352
         2 20240319-1020                                  352
         2 20240319-1021                                  372
         2 20240319-1022                                  382
         2 20240319-1023                                  424
         2 20240319-1024                                  454
         2 20240319-1025                                  431
         2 20240319-1026                                  433
         2 20240319-1027                                  426
         2 20240319-1028                                  427
…………
         2 20240319-1048                                  972
         2 20240319-1049                                  980
         2 20240319-1050                                  967
         2 20240319-1051                                  995
         2 20240319-1052                                 1012
         2 20240319-1053                                 1034
         2 20240319-1054                                 1036
         2 20240319-1055                                 1027
         2 20240319-1056                                 1040
         2 20240319-1057                                 1025
         2 20240319-1058                                 1033
         2 20240319-1059                                 1009
         2 20240319-1100                                 1032
         2 20240319-1101                                 1047
         2 20240319-1102                                 1026
         2 20240319-1103                                 1038
         2 20240319-1104                                 1033
         2 20240319-1105                                 1029
         2 20240319-1106                                 1055
         2 20240319-1107                                 1034
         2 20240319-1108                                 1063
         2 20240319-1109                                 1096
         2 20240319-1110                                 1179
         2 20240319-1111                                 1163
         2 20240319-1112                                 1136
         2 20240319-1113                                 1111
         2 20240319-1114                                 1089
         2 20240319-1115                                 1087
         2 20240319-1116                                 1089
         2 20240319-1117                                 1088
         2 20240319-1118                                 1092
         2 20240319-1119                                 1095
…………
111 rows selected.

由于故障时间内发生在2节点,所以只获取2节点的ash信息,数据库节点2会话从10.11分左右开始激增,11.20左右恢复正常。

查询异常时间段内的等待事件

Enter Search before hours (i.e. 2012-01-01 23:00:00) : 2024-03-19 11:11:00
Enter Search jiange hours (i.e. 123| default 1)) : 1

EVENT                                                   TOTAL WAIT_CLASS
-------------------------------------------------- ---------- --------------------
enq: TX - row lock contention                           19173 Application
CPU                                                      4003 CPU
Disk file operations I/O                                 1139 User_IO
enq: TM - contention                                     1127 Application
BCPU                                                      782 BCPU
row cache lock                                             60 Concurrency
log file parallel write                                    57 System_IO
RMAN backup & recovery I/O                                 54 System_IO
log file sync                                              38 Commit
DLM cross inst call completion                             33 Other
KJC: Wait for msg sends to complete                        27 Other
PGA memory operation                                       25 Other
reliable message                                           23 Other
oracle thread bootstrap                                    22 Other
control file sequential read                               20 System_IO
JS kgl get object wait                                     19 Administrative
gc current grant busy                                      17 Cluster
process diagnostic dump                                    15 Other
ASM file metadata operation                                15 Ot``` language
```her
gc current block busy                                      15 Cluster
IMR slave acknowledgement msg                              14 Other
library cache lock                                         12 Concurrency
os thread creation                                         12 Other
rdbms ipc message                                          11 Idle
db file parallel write                                      7 System_IO
row cache mutex                                             6 Concurrency

异常时间段内的等待事件主要为enq: TX - row lock contention,该等待事件通常与表行锁相关。

从ash裸数据查看tx锁阻塞链信息,可以发现在不同时间段内均是由1981会话导致;
查询故障时间段内阻塞链信息

SESSION_ID SQL_ID          SAMPLE_TIME          CHAIN                                    FINAL_PROGRAM                            FINAL_BLK FINAL_EVT                      EVENT
---------- --------------- -------------------- ---------------------------------------- ---------------------------------------- --------- ------------------------------ ----------------------------------------
LVL2_EVENT
------------------------------
       248 93xzn8vuqqfut   20240319 102641      ->1981->5->248                           oracle@vwatdlorap08 (J000)                    1981 ON CPU                         enq: TX - row lock contention


       389 93xzn8vuqqfut   20240319 102641      ->1981->5->389                           oracle@vwatdlorap08 (J000)                    1981 ON CPU                         enq: TX - row lock contention


      1842 5d6xw0m7hman7   20240319 102641      ->1981->5->1842                          oracle@vwatdlorap08 (J000)                    1981 ON CPU                         enq: TX - row lock contention


      1848 26xd7u55hmn31   20240319 102641      ->1981->5->1848                          oracle@vwatdlorap08 (J000)                    1981 ON CPU                         enq: TX - row lock contention


      2670 bgy4dk52kb5bx   20240319 102641      ->1981->5->2670                          oracle@vwatdlorap08 (J000)                    1981 ON CPU                         enq: TX - row lock contention


      2747 gcvd0vn606km4   20240319 102641      ->1981->5->2747                          oracle@vwatdlorap08 (J000)                    1981 ON CPU                         enq: TX - row lock contention


      1614                 20240319 102641      ->1614                                   oracle@vwatdlorap09 (MZ00)                    1614 ON CPU


      1981 dm9pz48khh8c5   20240319 102642      ->1981                                   oracle@vwatdlorap08 (J000)                    1981 ON CPU


         5 0magwj2bd6vf5   20240319 102642      ->1981->5                                oracle@vwatdlorap08 (J000)                    1981 ON CPU                         enq: TX - row lock contention
enq: TX - row lock contention

       248 93xzn8vuqqfut   20240319 102642      ->1981->5->248                           oracle@vwatdlorap08 (J000)                    1981 ON CPU                         enq: TX - row lock contention


       389 93xzn8vuqqfut   20240319 102642      ->1981->5->389                           oracle@vwatdlorap08 (J000)                    1981 ON CPU                         enq: TX - row lock contention


      1842 5d6xw0m7hman7   20240319 102642      ->1981->5->1842                          oracle@vwatdlorap08 (J000)                    1981 ON CPU                         enq: TX - row lock contention


      1848 26xd7u55hmn31   20240319 102642      ->1981->5->1848                          oracle@vwatdlorap08 (J000)                    1981 ON CPU                         enq: TX - row lock contention


      2670 bgy4dk52kb5bx   20240319 102642      ->1981->5->2670                          oracle@vwatdlorap08 (J000)                    1981 ON CPU                         enq: TX - row lock contention


      2747 gcvd0vn606km4   20240319 102642      ->1981->5->2747                          oracle@vwatdlorap08 (J000)                    1981 ON CPU                         enq: TX - row lock contention
…………
SESSION_ID SQL_ID          SAMPLE_TIME          CHAIN                                    FINAL_PROGRAM                            FINAL_BLK FINAL_EVT                      EVENT
---------- --------------- -------------------- ---------------------------------------- ---------------------------------------- --------- ------------------------------ ------------------------------
LVL2_EVENT
------------------------------
      1611 fyd88tk53smq8   20240319 105932      ->1981->5->1611                          oracle@vwatdlorap08 (J000)                    1981 ON CPU                         enq: TX - row lock contention


      1842 5d6xw0m7hman7   20240319 105932      ->1981->5->1842                          oracle@vwatdlorap08 (J000)                    1981 ON CPU                         enq: TX - row lock contention


      1848 26xd7u55hmn31   20240319 105932      ->1981->5->1848                          oracle@vwatdlorap08 (J000)                    1981 ON CPU                         enq: TX - row lock contention



SESSION_ID SQL_ID          SAMPLE_TIME          CHAIN                                    FINAL_PROGRAM                            FINAL_BLK FINAL_EVT                      EVENT
---------- --------------- -------------------- ---------------------------------------- ---------------------------------------- --------- ------------------------------ ------------------------------
LVL2_EVENT
------------------------------
      2369 93xzn8vuqqfut   20240319 105932      ->1981->5->2369                          oracle@vwatdlorap08 (J000)                    1981 ON CPU                         enq: TX - row lock contention


      2531 93xzn8vuqqfut   20240319 105932      ->1981->5->2531                          oracle@vwatdlorap08 (J000)                    1981 ON CPU                         enq: TX - row lock contention


      2670 bgy4dk52kb5bx   20240319 105932      ->1981->5->2670                          oracle@vwatdlorap08 (J000)                    1981 ON CPU                         enq: TX - row lock contention

查看对应故障时间段内的采样时间点的阻塞链表信息,截取部分时间段的信息。
查看对应被阻塞会话信息,可以看到从10:11开始,1981会话已经开始影响update操作无法正常提交

TIME                 SESSION_ID SESSION_SERIAL# SQL_OPNAME    USER_ID SQL_ID          SQL_EXEC_START      EVENT                                    TIME_WAITED CURRENT_OBJ# CURRENT_ROW# BLOCKING_SESSION     SERIAL BLOCKING_INST_ID
-------------------- ---------- --------------- ---------- ---------- --------------- ------------------- ---------------------------------------- ----------- ------------ ------------ ---------------- ---------- ----------------
20240319-101116               5           28227 UPDATE            117 0magwj2bd6vf5   2024-03-19 10:11:04 enq: TX - row lock contention                      0        90535           38             1981      51726                1
20240319-101117               5           28227 UPDATE            117 0magwj2bd6vf5   2024-03-19 10:11:04 enq: TX - row lock contention                      0        90535           38             1981      51726                1
20240319-101118               5           28227 UPDATE            117 0magwj2bd6vf5   2024-03-19 10:11:04 enq: TX - row lock contention                      0        90535           38             1981      51726                1
20240319-101119               5           28227 UPDATE            117 0magwj2bd6vf5   2024-03-19 10:11:04 enq: TX - row lock contention                      0        90535           38             1981      51726                1
20240319-101120               5           28227 UPDATE            117 0magwj2bd6vf5   2024-03-19 10:11:04 enq: TX - row lock contention                      0        90535           38             1981      51726                1
20240319-101121               5           28227 UPDATE            117 0magwj2bd6vf5   2024-03-19 10:11:04 enq: TX - row lock contention                      0        90535           38             1981      51726                1
20240319-101122               5           28227 UPDATE            117 0magwj2bd6vf5   2024-03-19 10:11:04 enq: TX - row lock contention                      0        90535           38             1981      51726                1
20240319-101123               5           28227 UPDATE            117 0magwj2bd6vf5   2024-03-19 10:11:04 enq: TX - row lock contention                      0        90535           38             1981      51726                1
20240319-101124               5           28227 UPDATE            117 0magwj2bd6vf5   2024-03-19 10:11:04 enq: TX - row lock contention                      0        90535           38             1981      51726                1
20240319-101125               5           28227 UPDATE            117 0magwj2bd6vf5   2024-03-19 10:11:04 enq: TX - row lock contention                      0        90535           38             1981      51726                1
20240319-101126               5           28227 UPDATE            117 0magwj2bd6vf5   2024-03-19 10:11:04 enq: TX - row lock contention                      0        90535           38             1981      51726                1
.
.
IME                 SESSION_ID SESSION_SERIAL# SQL_OPNAME    USER_ID SQL_ID          SQL_EXEC_START      EVENT                                    TIME_WAITED CURRENT_OBJ# CURRENT_ROW# BLOCKING_SESSION     SERIAL BLOCKING_INST_ID
-------------------- ---------- --------------- ---------- ---------- --------------- ------------------- ---------------------------------------- ----------- ------------ ------------ ---------------- ---------- ----------------
20240319-111955               5           28227 UPDATE            117 0magwj2bd6vf5   2024-03-19 10:11:04 enq: TX - row lock contention                      0        90535           38             1981      51726                1
20240319-111956               5           28227 UPDATE            117 0magwj2bd6vf5   2024-03-19 10:11:04 enq: TX - row lock contention                      0        90535           38             1981      51726                1
20240319-111957               5           28227 UPDATE            117 0magwj2bd6vf5   2024-03-19 10:11:04 enq: TX - row lock contention                      0        90535           38             1981      51726                1
20240319-111958               5           28227 UPDATE            117 0magwj2bd6vf5   2024-03-19 10:11:04 enq: TX - row lock contention                      0        90535           38             1981      51726                1
20240319-111959               5           28227 UPDATE            117 0magwj2bd6vf5   2024-03-19 10:11:04 enq: TX - row lock contention                      0        90535           38             1981      51726                1

从ash中查询1981会话的相关信息,发现该会话是由job调用,且正在执行递归sql,其中可以发现TOP_LEVEL_SQL_ID 均为0gsgppzm9tgr8。

TIME            SESSION_ID SESSION_SERIAL# PROGRAM                        EVENT                             USER_ID SQL_ID          TOP_LEVEL_SQL_I SQL_EXEC_START      TIME_WAITED CURRENT_OBJ# CURRENT_ROW# BLOCKING_SESSION
--------------- ---------- --------------- ------------------------------ ------------------------------ ---------- --------------- --------------- ------------------- ----------- ------------ ------------ ----------------
    SERIAL BLOCKING_INST_ID
---------- ----------------
20240319-0930         1981           51726 oracle@test (J000)     latch: cache buffers chains             0 gu9y4xpnyra1n   0gsgppzm9tgr8   2024-03-19 09:29:59        8703        91404            0             1608
     11652                1

20240319-0942         1981           51726 oracle@ (J000)     gc cr block 2-way                       0 dm9pz48khh8c5   0gsgppzm9tgr8   2024-03-19 09:42:55         354        96678            0

通过查看视图v$sql信息,可以发现查询0gsgppzm9tgr8其实是在调用对象JD_OEE_CL_MIS,并且2024-03-19 09:22:56发起。

select sql_id, sql_text from v$sql where sql_id = '0gsgppzm9tgr8';

SQL_ID
---------------
SQL_TEXT
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
0gsgppzm9tgr8
DECLARE job BINARY_INTEGER := :job;  next_date TIMESTAMP WITH TIME ZONE := :mydate;  broken BOOLEAN := FALSE;  job_name VARCHAR2(128) := :job_name;  job_subname VARCHAR2(128) := :job_subname;  job_owner VARCHAR2(128) := :job_owner
;  job_start TIMESTAMP WITH TIME ZONE := :job_start;  job_scheduled_start TIMESTAMP WITH TIME ZONE := :job_scheduled_start;  window_start TIMESTAMP WITH TIME ZONE := :window_start;  window_end TIMESTAMP WITH TIME ZONE := :window_e
nd;  chain_id VARCHAR2(14) :=  :chainid;  credential_owner VARCHAR2(128) := :credown;  credential_name  VARCHAR2(128) := :crednam;  destination_owner VARCHAR2(128) := :destown;  destination_name VARCHAR2(128) := :destnam;  job_des
t_id varchar2(14) := :jdestid;  log_id number := :log_id;  BEGIN  JD_OEE_CL_MIS;  :mydate := next_date; IF broken THEN :b := 1; ELSE :b := 0; END IF; END;

 SELECT to_char(sample_time,'yyyymmdd-hh24mi') time,
  2         SESSION_ID,
       SESSION_SERIAL#,
  3    4         SQL_OPNAME,
  5         event,
       USER_ID,
       SQL_ID,TOP_LEVEL_SQL_ID,
       SQL_EXEC_START,
  6    7    8    9         TIME_WAITED,
       CURRENT_OBJ#,
       CURRENT_ROW#,
       BLOCKING_SESSION,
 10   11   12   13         BLOCKING_SESSION_SERIAL# serial, BLOCKING_INST_ID
FROM gv$active_session_history
WHERE sql_id = '0gsgppzm9tgr8'
 14   15   16  ORDER BY 1,2,8;

TIME                 SESSION_ID SESSION_SERIAL# SQL_OPNAME EVENT                                       USER_ID SQL_ID          TOP_LEVEL_SQL_ID                        SQL_EXEC_START      TIME_WAITED CURRENT_OBJ# CURRENT_ROW#
-------------------- ---------- --------------- ---------- ---------------------------------------- ---------- --------------- --------------------------------------- ------------------- ----------- ------------ ------------
BLOCKING_SESSION     SERIAL BLOCKING_INST_ID
---------------- ---------- ----------------
20240319-0927              1981           51726 PL/SQL EXE                                                   0 0gsgppzm9tgr8   0gsgppzm9tgr8                           2024-03-19 09:22:56           0        91404            0
                                                CUTE

解决方案

手工提交,或者kill该会话,核查存储过程及程序代码逻辑,保证数据提交顺序,避免不必要的行锁等待。


ash 案例二:

经过ash及osw信息,在08.02.28-08.03.28这一分钟内,出现了大量log file sync等待事件,查看osw主机资源可以发现该时间段内,io出现了大量读写延迟,建议核查链路以及磁盘是否正常。

1、查看ash裸数据发现,在08:02-08:03内出现了大量log file sycn等待事件,enq: TX - row lock contention等待事件从ash采样时间段内没有出现明显积压;

> sql 
SELECT inst_id,
       to_char(sample_time,'yyyymmdd-hh24mi'),
       count(*),
       event
FROM ASH_20240322
WHERE sample_time >=to_date('20240322-0720','yyyymmdd-hh24mi')
  AND sample_time <=to_date('20240322-0830','yyyymmdd-hh24mi')
  and inst_id = 1
  and event in ('enq: TX - row lock contention','log file sync')
GROUP BY inst_id,
         to_char(sample_time,'yyyymmdd-hh24mi'),
         event HAVING count(*) > 2
ORDER BY 4, 2;

   INST_ID TO_CHAR(SAMPLE_TIME,'YYYYMMDD-HH24MI')    COUNT(*) EVENT
---------- --------------------------------------- ---------- ------------------------------
         1 20240322-0730                                  422 enq: TX - row lock contention
         1 20240322-0731                                  396 enq: TX - row lock contention
         1 20240322-0732                                  366 enq: TX - row lock contention
         1 20240322-0733                                  352 enq: TX - row lock contention
         1 20240322-0734                                  444 enq: TX - row lock contention
         1 20240322-0735                                  459 enq: TX - row lock contention
         1 20240322-0736                                  391 enq: TX - row lock contention
         1 20240322-0737                                  376 enq: TX - row lock contention
         1 20240322-0738                                   55 enq: TX - row lock contention
         1 20240322-0739                                  319 enq: TX - row lock contention
         1 20240322-0740                                  205 enq: TX - row lock contention
         1 20240322-0741                                   52 enq: TX - row lock contention
         1 20240322-0742                                  128 enq: TX - row lock contention
         1 20240322-0743                                   60 enq: TX - row lock contention
         1 20240322-0744                                  185 enq: TX - row lock contention
         1 20240322-0745                                  312 enq: TX - row lock contention
         1 20240322-0746                                  413 enq: TX - row lock contention
         1 20240322-0747                                  413 enq: TX - row lock contention
         1 20240322-0748                                  372 enq: TX - row lock contention
         1 20240322-0749                                  278 enq: TX - row lock contention
         1 20240322-0750                                  222 enq: TX - row lock contention
         1 20240322-0751                                  324 enq: TX - row lock contention
         1 20240322-0752                                   54 enq: TX - row lock contention
         1 20240322-0753                                  347 enq: TX - row lock contention
         1 20240322-0754                                   68 enq: TX - row lock contention
         1 20240322-0755                                  325 enq: TX - row lock contention
         1 20240322-0756                                   67 enq: TX - row lock contention
         1 20240322-0757                                  279 enq: TX - row lock contention
         1 20240322-0758                                  438 enq: TX - row lock contention
         1 20240322-0759                                  436 enq: TX - row lock contention
         1 20240322-0800                                  425 enq: TX - row lock contention
         1 20240322-0801                                  438 enq: TX - row lock contention
         1 20240322-0802                                  311 enq: TX - row lock contention
         1 20240322-0803                                  579 enq: TX - row lock contention
         1 20240322-0804                                  370 enq: TX - row lock contention
         1 20240322-0805                                  404 enq: TX - row lock contention
         1 20240322-0806                                  444 enq: TX - row lock contention
         1 20240322-0807                                  466 enq: TX - row lock contention
         1 20240322-0808                                  444 enq: TX - row lock contention
         1 20240322-0809                                  238 enq: TX - row lock contention
         1 20240322-0810                                  404 enq: TX - row lock contention
         1 20240322-0811                                  471 enq: TX - row lock contention
         1 20240322-0812                                  427 enq: TX - row lock contention
         1 20240322-0813                                  371 enq: TX - row lock contention
         1 20240322-0814                                  391 enq: TX - row lock contention
         1 20240322-0815                                  194 enq: TX - row lock contention
         1 20240322-0816                                   56 enq: TX - row lock contention

   INST_ID TO_CHAR(SAMPLE_TIME,'YYYYMMDD-HH24MI')    COUNT(*) EVENT
---------- --------------------------------------- ---------- ------------------------------
         1 20240322-0817                                  346 enq: TX - row lock contention
         1 20240322-0818                                  402 enq: TX - row lock contention
         1 20240322-0819                                  343 enq: TX - row lock contention
         1 20240322-0730                                   12 log file sync
         1 20240322-0731                                   70 log file sync
         1 20240322-0732                                   12 log file sync
         1 20240322-0733                                   15 log file sync
         1 20240322-0734                                   19 log file sync
         1 20240322-0735                                   18 log file sync
         1 20240322-0736                                   15 log file sync
         1 20240322-0737                                   15 log file sync
         1 20240322-0738                                    9 log file sync
         1 20240322-0739                                   15 log file sync
         1 20240322-0740                                   11 log file sync
         1 20240322-0741                                   13 log file sync
         1 20240322-0742                                   16 log file sync
         1 20240322-0743                                   17 log file sync
         1 20240322-0744                                   19 log file sync
         1 20240322-0745                                    8 log file sync
         1 20240322-0746                                   11 log file sync
         1 20240322-0747                                   19 log file sync
         1 20240322-0748                                   14 log file sync
         1 20240322-0749                                   14 log file sync
         1 20240322-0750                                   19 log file sync
         1 20240322-0751                                   14 log file sync
         1 20240322-0752                                   25 log file sync
         1 20240322-0753                                    9 log file sync
         1 20240322-0754                                   11 log file sync
         1 20240322-0755                                   13 log file sync
         1 20240322-0756                                   15 log file sync
         1 20240322-0757                                   15 log file sync
         1 20240322-0758                                   19 log file sync
         1 20240322-0759                                    7 log file sync
         1 20240322-0800                                   17 log file sync
         1 20240322-0801                                   23 log file sync
         1 20240322-0802                                 5155 log file sync
         1 20240322-0803                                 3028 log file sync
         1 20240322-0804                                   12 log file sync
         1 20240322-0805                                    7 log file sync
         1 20240322-0806                                   13 log file sync
         1 20240322-0807                                   14 log file sync
         1 20240322-0808                                    7 log file sync
         1 20240322-0809                                   10 log file sync
         1 20240322-0810                                   12 log file sync
         1 20240322-0811                                    7 log file sync
         1 20240322-0812                                    8 log file sync
         1 20240322-0813                                    6 log file sync
         1 20240322-0814                                   17 log file sync
         1 20240322-0815                                   14 log file sync
         1 20240322-0816                                    9 log file sync
         1 20240322-0817                                   12 log file sync
         1 20240322-0818                                   11 log file sync
         1 20240322-0819                                   18 log file sync

2、截取08:02-08:03故障时间内部分详细等待事件,我们可以发现log file sync均是由于1节点 1765后台lgw进程log file parallel write阻塞;(其中lv代表阻塞层级,isleaf=1表示阻塞源头

> sql
select
 level                     lv,
 connect_by_isleaf         isleaf,
 connect_by_iscycle        iscycle,
 t.sample_time,
 t.inst_id,
 t.session_id,
 t.sql_id,
 t.session_type,
 t.event,
 t.session_state,
 t.blocking_inst_id blind,
 t.blocking_session blsid,
 t.blocking_session_status bss
  from ASH_20240322 t
 where sample_time >
    to_timestamp('2024-03-22 08:01:00',
                 'yyyy-mm-dd hh24:mi:ss')
and sample_time <
    to_timestamp('2024-03-22 08:05:00',
                 'yyyy-mm-dd hh24:mi:ss')
 start with blocking_session is not null
connect by nocycle
  prior sample_time = sample_time
          /*and ((prior sample_time) - sample_time between interval '-1'
          second and interval '1' second)*/
       and prior blocking_inst_id = inst_id
       and prior blocking_session = session_id
       and prior blocking_session_serial# = session_serial#
 order siblings by sample_time;

        LV     ISLEAF    ISCYCLE SAMPLE_TIME                  INST_ID SESSION_ID SQL_ID          SESSION_TYPE    EVENT                          SESSION_STATE    BLIND  BLSID BSS
---------- ---------- ---------- ------------------------- ---------- ---------- --------------- --------------- ------------------------------ --------------- ------ ------ ---------------
         2          1          0 22-MAR-24 08.03.19.818 AM          1       1765                 BACKGROUND      log file parallel write        WAITING                       NO HOLDER
         1          0          0 22-MAR-24 08.03.19.818 AM          1       2113                 FOREGROUND      log file sync                  WAITING              1   1765 VALID
         2          1          0 22-MAR-24 08.03.19.818 AM          1       1765                 BACKGROUND      log file parallel write        WAITING                       NO HOLDER
         1          0          0 22-MAR-24 08.03.19.818 AM          1       2072                 FOREGROUND      log file sync                  WAITING              1   1765 VALID
         2          1          0 22-MAR-24 08.03.19.818 AM          1       1765                 BACKGROUND      log file parallel write        WAITING                       NO HOLDER
         1          0          0 22-MAR-24 08.03.19.818 AM          1       2066                 FOREGROUND      log file sync                  WAITING              1   1765 VALID
         2          1          0 22-MAR-24 08.03.19.818 AM          1       1765                 BACKGROUND      log file parallel write        WAITING                       NO HOLDER
         1          0          0 22-MAR-24 08.03.19.818 AM          1       2033                 FOREGROUND      log file sync                  WAITING              1   1765 VALID
         2          1          0 22-MAR-24 08.03.19.818 AM          1       1765                 BACKGROUND      log file parallel write        WAITING                       NO HOLDER
         1          0          0 22-MAR-24 08.03.19.818 AM          1       2032                 FOREGROUND      log file sync                  WAITING              1   1765 VALID
         2          1          0 22-MAR-24 08.03.19.818 AM          1       1765                 BACKGROUND      log file parallel write        WAITING                       NO HOLDER
         1          0          0 22-MAR-24 08.03.19.818 AM          1       2023                 FOREGROUND      log file sync                  WAITING              1   1765 VALID
         2          1          0 22-MAR-24 08.03.19.818 AM          1       1765                 BACKGROUND      log file parallel write        WAITING                       NO HOLDER
         1          0          0 22-MAR-24 08.03.19.818 AM          1       2022                 FOREGROUND      log file sync                  WAITING              1   1765 VALID
         2          1          0 22-MAR-24 08.03.19.818 AM          1       1765                 BACKGROUND      log file parallel write        WAITING                       NO HOLDER
         1          0          0 22-MAR-24 08.03.19.818 AM          1       1978                 FOREGROUND      log file sync                  WAITING              1   1765 VALID
         2          1          0 22-MAR-24 08.03.19.818 AM          1       1765                 BACKGROUND      log file parallel write        WAITING                       NO HOLDER
         1          0          0 22-MAR-24 08.03.19.818 AM          1       1976                 FOREGROUND      log file sync                  WAITING              1   1765 VALID
         2          1          0 22-MAR-24 08.03.19.818 AM          1       1765                 BACKGROUND      log file parallel write        WAITING                       NO HOLDER
         1          0          0 22-MAR-24 08.03.19.818 AM          1       1950                 FOREGROUND      log file sync                  WAITING              1   1765 VALID
         2          1          0 22-MAR-24 08.03.19.818 AM          1       1765                 BACKGROUND      log file parallel write        WAITING                       NO HOLDER
         1          0          0 22-MAR-24 08.03.19.818 AM          1       1940                 FOREGROUND      log file sync                  WAITING              1   1765 VALID
         2          1          0 22-MAR-24 08.03.19.818 AM          1       1765                 BACKGROUND      log file parallel write        WAITING                       NO HOLDER
         1          0          0 22-MAR-24 08.03.19.818 AM          1       1939                 FOREGROUND      log file sync                  WAITING              1   1765 VALID
         2          1          0 22-MAR-24 08.03.19.818 AM          1       1765                 BACKGROUND      log file parallel write        WAITING                       NO HOLDER
         1          0          0 22-MAR-24 08.03.19.818 AM          1       1936                 FOREGROUND      log file sync                  WAITING              1   1765 VALID
         2          1          0 22-MAR-24 08.03.19.818 AM          1       1765                 BACKGROUND      log file parallel write        WAITING                       NO HOLDER
         1          0          0 22-MAR-24 08.03.19.818 AM          1       1907                 FOREGROUND      log file sync                  WAITING              1   1765 VALID
         2          1          0 22-MAR-24 08.03.19.818 AM          1       1765                 BACKGROUND      log file parallel write        WAITING                       NO HOLDER
         1          0          0 22-MAR-24 08.03.19.818 AM          1       1898                 FOREGROUND      log file sync                  WAITING              1   1765 VALID
         2          1          0 22-MAR-24 08.03.19.818 AM          1       1765                 BACKGROUND      log file parallel write        WAITING                       NO HOLDER
         1          0          0 22-MAR-24 08.03.19.818 AM          1       1897                 FOREGROUND      log file sync                  WAITING              1   1765 VALID
         2          1          0 22-MAR-24 08.03.19.818 AM          1       1765                 BACKGROUND      log file parallel write        WAITING                       NO HOLDER
         1          0          0 22-MAR-24 08.03.19.818 AM          1       1869                 FOREGROUND      log file sync                  WAITING              1   1765 VALID
         2          1          0 22-MAR-24 08.03.19.818 AM          1       1765                 BACKGROUND      log file parallel write        WAITING                       NO HOLDER
         1          0          0 22-MAR-24 08.03.19.818 AM          1       1852                 FOREGROUND      log file sync                  WAITING              1   1765 VALID
         2          1          0 22-MAR-24 08.03.19.818 AM          1       1765                 BACKGROUND      log file parallel write        WAITING                       NO HOLDER
         1          0          0 22-MAR-24 08.03.19.818 AM          1       1808                 FOREGROUND      log file sync                  WAITING              1   1765 VALID
         2          1          0 22-MAR-24 08.03.19.818 AM          1       1765                 BACKGROUND      log file parallel write        WAITING                       NO HOLDER
         1          0          0 22-MAR-24 08.03.19.818 AM          1       1743                 FOREGROUND      log file sync                  WAITING              1   1765 VALID
         2          1          0 22-MAR-24 08.03.19.818 AM          1       1765                 BACKGROUND      log file parallel write        WAITING                       NO HOLDER
         1          0          0 22-MAR-24 08.03.19.818 AM          1       1740                 FOREGROUND      log file sync                  WAITING              1   1765 VALID
         2          1          0 22-MAR-24 08.03.19.818 AM          1       1765                 BACKGROUND      log file parallel write        WAITING                       NO HOLDER
         1          0          0 22-MAR-24 08.03.19.818 AM          1       1726                 FOREGROUND      log file sync                  WAITING              1   1765 VALID
         2          1          0 22-MAR-24 08.03.19.818 AM          1       1765                 BACKGROUND      log file parallel write        WAITING                       NO HOLDER
         1          0          0 22-MAR-24 08.03.19.818 AM          1       1724                 FOREGROUND      log file sync                  WAITING              1   1765 VALID
         2          1          0 22-MAR-24 08.03.19.818 AM          1       1765                 BACKGROUND      log file parallel write        WAITING                       NO HOLDER

3、进一步查看08:02-08:03这段时间阻塞情况,可以发现在阻塞最为严重的时刻,每秒阻塞了200+的会话,阻塞相当严重;

SQL>
select t.lv,
       t.iscycle,
       t.sample_time,
       t.inst_id,
       t.session_id,
       t.sql_id,
       t.session_type,
       t.event,
       t.seq#,
       t.session_state,
       t.blocking_inst_id blind,
       t.blocking_session blsid,
       t.blocking_session_status bss,
       t.c cnt
  from (select t.*,
               row_number() over(partition by inst_id, sample_time order by c desc) r
          from (select t.*,
                       count(*) over(partition by inst_id, sample_time, session_id) c,
                       row_number() over(partition by inst_id, sample_time, session_id order by 1) r1
                  from (select
                         level              lv,
                         connect_by_isleaf  isleaf,
                         connect_by_iscycle iscycle,
                         t.*
                          from ASH_20240322 t
                        where sample_time >
                            to_timestamp('2024-03-22 07:55:00',
                                         'yyyy-mm-dd hh24:mi:ss')
                        and sample_time <
                            to_timestamp('2024-03-22 08:10:00',
                                         'yyyy-mm-dd hh24:mi:ss')
                         start with blocking_session is not null
                        connect by nocycle
                          prior sample_time = sample_time
                                  /*and ((prior sample_time) - sample_time between interval '-1'
                                  second and interval '1' second)*/
                               and prior blocking_inst_id = inst_id
                               and prior blocking_session = session_id
                               and prior
                                    blocking_session_serial# = session_serial#) t
                 where t.isleaf = 1) t
         where r1 = 1) t
 where session_id = 1765 and r < 3
 order by sample_time, r;

        LV    ISCYCLE SAMPLE_TIME                  INST_ID SESSION_ID SQL_ID          SESSION_TYPE    EVENT                                SEQ# SESSION_STATE    BLIND  BLSID BSS                              CNT
---------- ---------- ------------------------- ---------- ---------- --------------- --------------- ------------------------------ ---------- --------------- ------ ------ ------------------------- ----------
         2          0 22-MAR-24 08.02.08.745 AM                  1765                 BACKGROUND      log file parallel write             37197 WAITING                       NO HOLDER                          3
         2          0 22-MAR-24 08.02.20.757 AM                  1765                 BACKGROUND                                           2138 ON CPU                        NOT IN WAIT                        1
         2          0 22-MAR-24 08.02.21.758 AM                  1765                 BACKGROUND      log file parallel write              4338 WAITING                       NO HOLDER                          1
         2          0 22-MAR-24 08.02.28.765 AM                  1765                 BACKGROUND      log file parallel write             19057 WAITING                       NO HOLDER                         10
         2          0 22-MAR-24 08.02.29.766 AM                  1765                 BACKGROUND      log file parallel write             19060 WAITING                       NO HOLDER                         52
         2          0 22-MAR-24 08.02.30.767 AM                  1765                 BACKGROUND      log file parallel write             19064 WAITING                       NO HOLDER                         57
         2          0 22-MAR-24 08.02.31.768 AM          1       1765                 BACKGROUND      log file parallel write             19066 WAITING                       NO HOLDER                         95
         2          0 22-MAR-24 08.02.32.769 AM                  1765                 BACKGROUND      log file parallel write             19069 WAITING                       NO HOLDER                         88
         2          0 22-MAR-24 08.02.33.770 AM                  1765                 BACKGROUND      log file parallel write             19073 WAITING                       NO HOLDER                        135
         2          0 22-MAR-24 08.02.34.771 AM                  1765                 BACKGROUND      log file parallel write             19074 WAITING                       NO HOLDER                        184
         2          0 22-MAR-24 08.02.35.772 AM                  1765                 BACKGROUND      log file parallel write             19075 WAITING                       NO HOLDER                        208
         2          0 22-MAR-24 08.02.36.774 AM                  1765                 BACKGROUND      log file parallel write             19076 WAITING                       NO HOLDER                        197
         2          0 22-MAR-24 08.02.37.775 AM                  1765                 BACKGROUND      log file parallel write             19077 WAITING                       NO HOLDER                        186
         2          0 22-MAR-24 08.02.38.776 AM                  1765                 BACKGROUND      log file parallel write             19080 WAITING                       NO HOLDER                        161
         2          0 22-MAR-24 08.02.39.777 AM                  1765                 BACKGROUND      log file parallel write             19901 WAITING                       NO HOLDER                        190
         2          0 22-MAR-24 08.02.40.778 AM                  1765                 BACKGROUND      log file parallel write             19901 WAITING                       NO HOLDER                        196
         2          0 22-MAR-24 08.02.41.779 AM                  1765                 BACKGROUND      log file parallel write             19902 WAITING                       NO HOLDER                        195
         2          0 22-MAR-24 08.02.42.780 AM                  1765                 BACKGROUND      log file parallel write             19902 WAITING                       NO HOLDER                        198
         2          0 22-MAR-24 08.02.43.781 AM                  1765                 BACKGROUND      log file parallel write             19903 WAITING                       NO HOLDER                        201
         2          0 22-MAR-24 08.02.44.782 AM                  1765                 BACKGROUND      log file parallel write             19903 WAITING                       NO HOLDER                        204
         2          0 22-MAR-24 08.02.45.783 AM                  1765                 BACKGROUND      log file parallel write             19904 WAITING                       NO HOLDER                        202
         2          0 22-MAR-24 08.02.46.784 AM                  1765                 BACKGROUND      log file parallel write             19905 WAITING                       NO HOLDER                        199
         2          0 22-MAR-24 08.02.47.785 AM                  1765                 BACKGROUND      log file parallel write             19907 WAITING                       NO HOLDER                        177
         2          0 22-MAR-24 08.02.48.787 AM                  1765                 BACKGROUND      log file parallel write             19909 WAITING                       NO HOLDER                        166
         2          0 22-MAR-24 08.02.49.788 AM                  1765                 BACKGROUND      log file parallel write             19911 WAITING                       NO HOLDER                        168
         2          0 22-MAR-24 08.02.50.789 AM                  1765                 BACKGROUND      log file parallel write             19913 WAITING                       NO HOLDER                        172
         2          0 22-MAR-24 08.02.51.790 AM                  1765                 BACKGROUND      log file parallel write             19916 WAITING                       NO HOLDER                        160
         2          0 22-MAR-24 08.02.52.791 AM                  1765                 BACKGROUND      log file parallel write             19918 WAITING                       NO HOLDER                         26
         2          0 22-MAR-24 08.02.53.792 AM                  1765                 BACKGROUND      log file parallel write             19920 WAITING                       NO HOLDER                        186
         2          0 22-MAR-24 08.02.54.793 AM                  1765                 BACKGROUND      log file parallel write             19923 WAITING                       NO HOLDER                        188
         2          0 22-MAR-24 08.02.55.794 AM                  1765                 BACKGROUND      log file parallel write             19924 WAITING                       NO HOLDER                        190
         2          0 22-MAR-24 08.02.56.795 AM                  1765                 BACKGROUND      log file parallel write             19927 WAITING                       NO HOLDER                        192
         2          0 22-MAR-24 08.02.57.796 AM                  1765                 BACKGROUND      log file parallel write             19933 WAITING                       NO HOLDER                        202
         2          0 22-MAR-24 08.02.58.797 AM                  1765                 BACKGROUND      log file parallel write             19956 WAITING                       NO HOLDER                        194
         2          0 22-MAR-24 08.02.59.798 AM                  1765                 BACKGROUND      log file parallel write             19960 WAITING                       NO HOLDER                        200
         2          0 22-MAR-24 08.03.00.799 AM                  1765                 BACKGROUND      log file parallel write             19962 WAITING                       NO HOLDER                        201
         2          0 22-MAR-24 08.03.01.800 AM                  1765                 BACKGROUND      log file parallel write             19967 WAITING                       NO HOLDER                        205
         2          0 22-MAR-24 08.03.02.801 AM                  1765                 BACKGROUND      log file parallel write             19968 WAITING                       NO HOLDER                        212
         2          0 22-MAR-24 08.03.03.802 AM                  1765                 BACKGROUND      control file parallel write         19982 WAITING                       UNKNOWN                          209
         2          0 22-MAR-24 08.03.04.803 AM                  1765                 BACKGROUND      log file parallel write             20263 WAITING                       NO HOLDER                        228
         2          0 22-MAR-24 08.03.05.804 AM                  1765                 BACKGROUND      log file parallel write             20263 WAITING                       NO HOLDER                        236
         2          0 22-MAR-24 08.03.06.805 AM                  1765                 BACKGROUND      log file parallel write             20265 WAITING                       NO HOLDER                        212
         2          0 22-MAR-24 08.03.07.806 AM                  1765                 BACKGROUND      log file parallel write             20266 WAITING                       NO HOLDER                        211
         2          0 22-MAR-24 08.03.08.807 AM                  1765                 BACKGROUND      log file parallel write             20267 WAITING                       NO HOLDER                        216
         2          0 22-MAR-24 08.03.09.808 AM                  1765                 BACKGROUND      log file parallel write             20270 WAITING                       NO HOLDER                        192
         2          0 22-MAR-24 08.03.10.809 AM                  1765                 BACKGROUND      log file parallel write             20274 WAITING                       NO HOLDER                        189
         2          0 22-MAR-24 08.03.11.810 AM                  1765                 BACKGROUND      log file parallel write             20278 WAITING                       NO HOLDER                        192
         2          0 22-MAR-24 08.03.12.811 AM                  1765                 BACKGROUND      log file parallel write             20379 WAITING                       NO HOLDER                        174
         2          0 22-MAR-24 08.03.13.812 AM                  1765                 BACKGROUND      log file parallel write             20384 WAITING                       NO HOLDER                        185
         2          0 22-MAR-24 08.03.14.813 AM                  1765                 BACKGROUND      log file parallel write             21577 WAITING                       NO HOLDER                         51
         2          0 22-MAR-24 08.03.15.814 AM                  1765                 BACKGROUND      log file parallel write             21935 WAITING                       NO HOLDER                        180
         2          0 22-MAR-24 08.03.16.815 AM                  1765                 BACKGROUND      log file parallel write             22109 WAITING                       NO HOLDER                        129
         2          0 22-MAR-24 08.03.17.816 AM                  1765                 BACKGROUND      log file parallel write             22178 WAITING                       NO HOLDER                         11
         2          0 22-MAR-24 08.03.18.817 AM          1       1765                 BACKGROUND      log file parallel write             22434 WAITING                       NO HOLDER                        153
         2          0 22-MAR-24 08.03.19.818 AM                  1765                 BACKGROUND      log file parallel write             22666 WAITING                       NO HOLDER                        135
         2          0 22-MAR-24 08.03.20.819 AM                  1765                 BACKGROUND      log file parallel write             23447 WAITING                       NO HOLDER                          7
         2          0 22-MAR-24 08.03.22.821 AM                  1765                 BACKGROUND      log file parallel write             29680 WAITING                       NO HOLDER                          6
         2          0 22-MAR-24 08.03.23.821 AM                  1765                 BACKGROUND      log file parallel write             30602 WAITING                       NO HOLDER                          6
         2          0 22-MAR-24 08.03.24.822 AM                  1765                 BACKGROUND      log file parallel write             31405 WAITING                       NO HOLDER                         16
         2          0 22-MAR-24 08.03.28.826 AM                  1765                 BACKGROUND      log file parallel write             36412 WAITING                       NO HOLDER                         34
         2          0 22-MAR-24 08.03.48.845 AM                  1765                 BACKGROUND      log file parallel write             12062 WAITING                       NO HOLDER                          1
         2          0 22-MAR-24 08.04.08.865 AM                  1765                 BACKGROUND      log file parallel write             51481 WAITING                       NO HOLDER                          1

4、接着我们分析osw信息,查看主机io资源,可以发现在故障时间内dm共享磁盘出现了大量读写延迟,部分磁盘甚至达到了磁盘性能上线,其他时间内均未发现;建议排查io链路及磁盘情况;

zzz ***Fri Mar 22 08:02:24 CST 2024
Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
dm-0              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
dm-1              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
dm-2              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
dm-3              0.00     0.00    1.98    1.98     0.99   127.23    64.75     0.00    0.25    0.00    0.50   0.25   0.10
dm-4              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
dm-5              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
dm-6              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
dm-7              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
dm-8              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
dm-9              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
dm-10             0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
dm-11             0.00   260.40    1.98  568.32    31.68  9703.96    34.14     0.13    0.23    0.00    0.23   0.23  13.27
dm-12             0.00     0.00    1.98    0.99     0.99     0.50     1.00     0.00    0.00    0.00    0.00   0.00   0.00
dm-13             0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
dm-14             0.00     0.00    1.98    1.98     0.99   127.23    64.75     0.00    0.25    0.50    0.00   0.25   0.10
dm-15             0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
dm-16             0.00   260.40    0.00  568.32     0.00  9703.96    34.15     0.14    0.25    0.00    0.25   0.25  14.16
dm-17             0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
dm-18             0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
dm-19             0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
dm-20             0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
dm-21             0.00     0.00    0.00    0.99     0.00     3.96     8.00     0.00    0.00    0.00    0.00   0.00   0.00
dm-22             0.00     0.00    0.00    3.96     0.00    15.84     8.00     0.00    0.00    0.00    0.00   0.00   0.00
dm-23             0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
dm-24             0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
dm-25             0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
dm-26             0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
dm-27             0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
dm-28             0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
dm-29             0.00     0.00    0.00    0.99     0.00    31.68    64.00     0.00    0.00    0.00    0.00   0.00   0.00
dm-30             0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
dm-31             0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
dm-32             0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
dm-33             0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
dm-34             0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
zzz ***Fri Mar 22 08:02:29 CST 2024
Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
dm-0              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
dm-1              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
dm-2              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
dm-3              0.00     0.00    2.00    2.00     1.00   148.50    74.75     0.12   63.50    0.00  127.00  30.00  12.00
dm-4              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
dm-5              0.00     0.00    1.00    2.00     8.00   136.00    96.00     0.28   93.00   24.00  127.50  93.00  27.90
dm-6              0.00     0.00    0.00    4.00     0.00   240.00   120.00     0.42  106.00    0.00  106.00 106.00  42.40
dm-7              0.00     0.00    0.00    5.00     0.00   280.00   112.00     0.28  103.40    0.00  103.40  28.80  14.40
dm-8              0.00     0.00    0.00    7.00     0.00   656.00   187.43     0.38    0.29    0.00    0.29  53.86  37.70
dm-9              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
dm-10             0.00     0.00    2.00    5.00     8.50   280.00    82.43     0.39   66.43   11.00   88.60  45.00  31.50
dm-11             0.00    13.00    3.00    4.00    80.00  2909.00   854.00     1.03  165.57   13.00  280.00 142.86 100.00
dm-12             0.00     0.00    2.00    2.00     1.00     4.50     2.75     0.21   64.75    0.50  129.00  53.50  21.40
dm-13             0.00     0.00    0.00   14.00     0.00   632.00    90.29     0.24   17.21    0.00   17.21  17.14  24.00
dm-14             0.00     0.00    2.00    1.00     1.00     0.50     1.00     0.15   51.00    0.00  153.00  51.00  15.30
dm-15             0.00     0.00    3.00    5.00    24.00   404.00   107.00     0.43   63.00   89.00   47.40  54.12  43.30
dm-16             0.00    13.00    0.00    5.00     0.00  3208.50  1283.40     0.26   52.80    0.00   52.80  51.60  25.80
dm-17             0.00     0.00    0.00    7.00     0.00   352.00   100.57     0.77   97.14    0.00   97.14  99.86  69.90
dm-18             0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
dm-19             0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
dm-20             0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
dm-21             0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
dm-22             0.00     0.00    0.00    4.00     0.00    16.00     8.00     0.00    0.00    0.00    0.00   0.00   0.00
dm-23             0.00     0.00    1.00    5.00     8.00   288.00    98.67     0.23   37.83   30.00   39.40  37.67  22.60
dm-24             0.00     0.00    0.00    1.00     0.00   104.00   208.00     0.18  183.00    0.00  183.00 183.00  18.30
dm-25             0.00     0.00    0.00    7.00     0.00   592.00   169.14     0.52   73.57    0.00   73.57  56.29  39.40
dm-26             0.00     0.00    0.00    3.00     0.00    64.00    42.67     0.24    0.00    0.00    0.00  79.67  23.90
dm-27             0.00     0.00    1.00    4.00     8.00    48.00    22.40     0.37   74.40  148.00   56.00  74.20  37.10
dm-28             0.00     0.00    0.00    7.00     0.00   472.00   134.86     0.68  117.14    0.00  117.14  52.71  36.90
dm-29             0.00     0.00    2.00    7.00    16.00   656.00   149.33     0.56   43.33   17.50   50.71  47.89  43.10
dm-30             0.00     0.00    0.00   11.00     0.00   904.00   164.36     1.03   61.91    0.00   61.91  55.27  60.80
dm-31             0.00     0.00    0.00    4.00     0.00   280.00   140.00     0.33    0.00    0.00    0.00  83.00  33.20
dm-32             0.00     0.00    0.00    7.00     0.00   424.00   121.14     0.52   74.14    0.00   74.14  74.00  51.80
dm-33             0.00     0.00    1.00    4.00     8.00   272.00   112.00     0.13   53.20   13.00   63.25  25.60  12.80
dm-34             0.00     0.00    1.00    6.00     8.00   296.00    86.86     0.47   74.29  516.00    0.67  66.14  46.30
…………
zzz ***Fri Mar 22 08:03:09 CST 2024
Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
dm-0              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
dm-1              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
dm-2              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
dm-3              0.00     0.00    4.00    4.00     2.00   168.50    42.62     0.17    4.50    9.00    0.00  16.75  13.40
dm-4              0.00     0.00   19.00   39.00    76.00 38916.00  1344.55     1.27   34.53    0.11   51.31   3.07  17.80
dm-5              0.00     0.00    0.00    8.00     0.00   600.00   150.00     0.53   65.75    0.00   65.75  63.75  51.00
dm-6              0.00     0.00    0.00    4.00     0.00   248.00   124.00     0.27   66.50    0.00   66.50  66.25  26.50
dm-7              0.00     0.00    0.00    6.00     0.00   208.00    69.33     0.64   89.83    0.00   89.83  76.33  45.80
dm-8              0.00     0.00    3.00   14.00    24.00  1048.00   126.12     0.47   27.53    5.00   32.36  15.18  25.80
dm-9              0.00     0.00    0.00   30.00     0.00 28680.00  1912.00     7.77  249.10    0.00  249.10  33.37 100.10
dm-10             0.00     0.00    0.00   11.00     0.00   496.00    90.18     0.35   40.64    0.00   40.64  31.82  35.00
dm-11             0.00     9.00   76.00    4.00 54424.00  1667.50  1402.29     0.70    4.19    1.68   51.75   6.21  49.70
dm-12             0.00     0.00   22.00    3.00    74.00    33.00     8.56     0.00    0.08    0.09    0.00   0.08   0.20
dm-13             0.00     0.00    0.00    8.00     0.00   268.00    67.00     0.68   96.88    0.00   96.88  85.00  68.00
dm-14             0.00     0.00    2.00    4.00     1.00   140.50    47.17     0.70  117.17    0.00  175.75  67.83  40.70
dm-15             0.00     0.00    0.00   13.00     0.00   756.00   116.31     0.03    9.54    0.00    9.54   1.92   2.50
dm-16             0.00     9.00   18.00    5.00    72.00  4063.00   359.57     1.00   44.00    0.11  202.00  43.43  99.90
dm-17             0.00     0.00    0.00    8.00     0.00   424.00   106.00     0.09    0.62    0.00    0.62  11.25   9.00
dm-18             0.00     0.00    0.00   16.00     0.00   401.50    50.19     0.00    0.00    0.00    0.00   0.00   0.00
dm-19             0.00     0.00    0.00    5.00     0.00    52.50    21.00     0.00    0.00    0.00    0.00   0.00   0.00
dm-20             0.00     0.00    0.00    5.00     0.00   292.50   117.00     0.00    0.00    0.00    0.00   0.00   0.00
dm-21             0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
dm-22             0.00     0.00    0.00   42.00     0.00   514.00    24.48     0.00    0.00    0.00    0.00   0.00   0.00
dm-23             0.00     0.00    0.00    9.00     0.00   424.00    94.22     0.23   25.89    0.00   25.89  25.89  23.30
dm-24             0.00     0.00    0.00    5.00     0.00   528.00   211.20     0.66  131.00    0.00  131.00  99.00  49.50
dm-25             0.00     0.00   18.00    7.00    72.00   512.00    46.72     0.00    0.16    0.11    0.29   0.16   0.40
dm-26             0.00     0.00    0.00    7.00     0.00   184.00    52.57     0.42   69.29    0.00   69.29  40.43  28.30
dm-27             0.00     0.00   20.00    9.00    80.50   552.00    43.62     0.02    0.59    0.65    0.44   0.59   1.70
dm-28             0.00     0.00    0.00    9.00     0.00   688.00   152.89     0.86   87.00    0.00   87.00  85.89  77.30
dm-29             0.00     0.00    0.00   13.00     0.00  1088.00   167.38     0.70   49.15    0.00   49.15  32.46  42.20
dm-30             0.00     0.00    0.00    8.00     0.00   896.00   224.00     0.39   48.38    0.00   48.38  48.38  38.70
dm-31             0.00     0.00    0.00    8.00     0.00   592.00   148.00     0.13   25.62    0.00   25.62  16.75  13.40
dm-32             0.00     0.00    0.00   12.00     0.00   528.00    88.00     0.56   24.50    0.00   24.50  31.83  38.20
dm-33             0.00     0.00    0.00    9.00     0.00   200.00    44.44     0.68   75.33    0.00   75.33  75.33  67.80
dm-34             0.00     0.00    0.00    4.00     0.00    56.00    28.00     0.09    0.25    0.00    0.25  23.25   9.30