环境:Oracle 11.2.0.3 RAC

问题:统计信息自动收集任务失效原因排查

1.查看自动任务的状态

查看自动任务的状态,确认是enabled状态:

SQL> select client_name,status from dba_autotask_client;

CLIENT_NAME STATUS
---------------------------------------------------------------- --------
auto optimizer stats collection ENABLED
auto space advisor ENABLED
sql tuning advisor ENABLED

Elapsed: 00:01:03.88


一般来说,这样的结果,就意味着统计信息自动收集是打开的,如今却是没有自动收集。说明确实是有故障,需要进一步深入排查。

若对这些自动任务的状态不理解可以参考之前文章:

2.进一步查看其它信息

根据 Why Auto Optimizer Statistics Collection May Appear to be "Stuck" and Not Running? (文档 ID 1320246.1)

排查以下项:

The 'auto optimizer stats collection' task is enabled in auto task
STATISTICS_LEVEL has already been set to TYPICAL or FULL
dba_autotask_client_history is empty
Statistics on tables are stale.
Some scheduler window is active and LAST_START_DATE is several days ago.

具体操作记录如下:

--STATISTICS_LEVEL:
SQL> show parameter STATISTICS_LEVEL

NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
statistics_level string TYPICAL

--dba_autotask_client_history is empty:
SQL> select * from dba_autotask_client_history;

no rows selected

Elapsed: 00:01:04.79

--Some scheduler window is active and LAST_START_DATE 是20-SEP-12 10.00.00.010777 PM
SELECT window_name, last_start_date, enabled, active FROM dba_scheduler_windows;
SQL> SELECT window_name ,
2 last_start_date,
3 enabled ,
4 active
5 FROM dba_scheduler_windows;

WINDOW_NAME LAST_START_DATE ENABL ACTIV
------------------------------ --------------------------------------------------------------------------- ----- -----
MONDAY_WINDOW 22-JAN-18 10.00.00.014299 PM PRC TRUE FALSE
TUESDAY_WINDOW 23-JAN-18 10.00.01.076798 PM PRC TRUE FALSE
WEDNESDAY_WINDOW 24-JAN-18 10.00.00.055066 PM PRC TRUE FALSE
THURSDAY_WINDOW 20-SEP-12 10.00.00.010777 PM PRC TRUE TRUE
FRIDAY_WINDOW 26-JAN-18 10.00.00.725176 PM PRC TRUE FALSE
SATURDAY_WINDOW 27-JAN-18 06.00.00.010751 AM PRC TRUE FALSE
SUNDAY_WINDOW 28-JAN-18 06.00.00.593207 AM PRC TRUE FALSE
WEEKNIGHT_WINDOW FALSE FALSE
WEEKEND_WINDOW FALSE FALSE

9 rows selected.

Elapsed: 00:00:00.01


可以看到,如今的环境确实完全匹配文档列出的这些特征项。

另外,看下统计信息自动收集任务的WINDOW_GROUP以及对应的MEMBER_NAME:

SQL> select client_name,window_group
2 from DBA_AUTOTASK_CLIENT
3 where client_name = 'auto optimizer stats collection';

CLIENT_NAME WINDOW_GROUP
---------------------------------------------------------------- ----------------------------------------------------------------
auto optimizer stats collection ORA$AT_WGRP_OS

Elapsed: 00:01:05.39
SQL> SELECT * FROM dba_scheduler_group_members
2 where group_name='ORA$AT_WGRP_OS';

OWNER GROUP_NAME MEMBER_NAME
------------------------------ ------------------------------ -----------------------------------------------------------------------------------------------------------------------------------
SYS ORA$AT_WGRP_OS "SYS"."MONDAY_WINDOW"
SYS ORA$AT_WGRP_OS "SYS"."TUESDAY_WINDOW"
SYS ORA$AT_WGRP_OS "SYS"."WEDNESDAY_WINDOW"
SYS ORA$AT_WGRP_OS "SYS"."THURSDAY_WINDOW"
SYS ORA$AT_WGRP_OS "SYS"."FRIDAY_WINDOW"
SYS ORA$AT_WGRP_OS "SYS"."SATURDAY_WINDOW"
SYS ORA$AT_WGRP_OS "SYS"."SUNDAY_WINDOW"

7 rows selected.

Elapsed: 00:00:00.04


可以看到,正常应该是一周7天都有的,但是由于THURSDAY_WINDOW 的窗口ACTIVE是TRUE,导致2012年9月27号(周四)统计信息收集失败。根据MOS描述,会影响其后的窗口都打开失败,进而导致之后所有自动任务的执行都失败。

我实际在我的测试环境验证也是如此,现象如下:

SQL> SELECT window_name, last_start_date, enabled, active FROM dba_scheduler_windows;

WINDOW_NAME LAST_START_DATE ENABLED ACTIVE
------------------------------------------------------------ --------------------------------------------------------------------------- ---------- ----------
MONDAY_WINDOW 22-JAN-18 10.00.00.011111 PM EST5EDT TRUE FALSE
TUESDAY_WINDOW 23-JAN-18 10.22.54.068776 PM EST5EDT TRUE FALSE
WEDNESDAY_WINDOW 24-JAN-18 10.00.00.005654 PM EST5EDT TRUE FALSE
THURSDAY_WINDOW 25-JAN-18 10.00.00.009885 PM EST5EDT TRUE FALSE
FRIDAY_WINDOW 26-JAN-18 10.00.00.010077 PM EST5EDT TRUE FALSE
SATURDAY_WINDOW 27-JAN-18 05.00.00.004156 PM EST5EDT TRUE FALSE
SUNDAY_WINDOW 28-JAN-18 01.00.00.010044 PM EST5EDT TRUE TRUE
WEEKNIGHT_WINDOW FALSE FALSE
WEEKEND_WINDOW FALSE FALSE

9 rows selected.

SQL> EXECUTE DBMS_SCHEDULER.OPEN_WINDOW ('MONDAY_WINDOW','');
BEGIN DBMS_SCHEDULER.OPEN_WINDOW ('MONDAY_WINDOW',''); END;

*
ERROR at line 1:
ORA-27480: window "SUNDAY_WINDOW" is currently open
ORA-06512: at "SYS.DBMS_ISCHED", line 493
ORA-06512: at "SYS.DBMS_SCHEDULER", line 1220
ORA-06512: at line 1


SQL>


可以看到我测试环境,因为周日的窗口已经是ACTIVE,再尝试打开周一的窗口就会报错,看起来同时就只能有一个窗口ACTIVE,这与MOS的描述也完全一致。

3.解决问题

根据MOS解决建议手工关闭活动的窗口,我这里就是关闭THURSDAY_WINDOW窗口:

SQL> EXECUTE DBMS_SCHEDULER.CLOSE_WINDOW ('THURSDAY_WINDOW');

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.14


再次查询状态,确认窗口状态已关闭:

SQL> SELECT window_name ,
2 last_start_date,
3 enabled ,
4 active
5 FROM dba_scheduler_windows;

WINDOW_NAME LAST_START_DATE ENABL ACTIV
------------------------------ --------------------------------------------------------------------------- ----- -----
MONDAY_WINDOW 22-JAN-18 10.00.00.014299 PM PRC TRUE FALSE
TUESDAY_WINDOW 23-JAN-18 10.00.01.076798 PM PRC TRUE FALSE
WEDNESDAY_WINDOW 24-JAN-18 10.00.00.055066 PM PRC TRUE FALSE
THURSDAY_WINDOW 20-SEP-12 10.00.00.010777 PM PRC TRUE FALSE
FRIDAY_WINDOW 26-JAN-18 10.00.00.725176 PM PRC TRUE FALSE
SATURDAY_WINDOW 27-JAN-18 06.00.00.010751 AM PRC TRUE FALSE
SUNDAY_WINDOW 28-JAN-18 06.00.00.593207 AM PRC TRUE FALSE
WEEKNIGHT_WINDOW FALSE FALSE
WEEKEND_WINDOW FALSE FALSE

9 rows selected.


这样就找到了问题,另外因为离统计信息信息自动收集关闭已有多年,目前生产环境运行平稳,所以最终决定保守处理,即:先将生产环境的自动任务都关闭,等在备库测试验证后再考虑生产环境开启自动任务。

Reference

- Why Auto Optimizer Statistics Collection May Appear to be "Stuck" and Not Running? (文档 ID 1320246.1)