以下测试环境为Oracle 10g 10.2.0.4版本,测试对Oracle的统计信息的收集与管理。
首先依据dba_objects创建一张测试表:
SQL> create table eygle as select * from dba_objects;
Table created
对该表进行统计信息收集,这里未指定method_opt,则Oracle将采用 FOR ALL COLUMNS SIZE AUTO 选项,这一选项在Oracle 9i中不收集列的柱状图信息,在Oracle 10g中则会根据数据库的选项选择是否收集柱状图,缺省的会为列收集基本信息。
在Oracle 10g中,缺省的初始化参数 _column_tracking_level 设置为1,Oracle会动态收集列的使用情况,如果某些倾斜列被频繁使用,则Oracle会在Auto模式下,自动为该列收集列级柱状图。
看以下测试,执行缺省的表统计信息收集:
SQL> exec dbms_stats.gather_table_stats(user,'EYGLE');
PL/SQL procedure successfully completed
此时数据库为表上的所有字段收集了缺省的统计信息,每个列两个Bucket:
SQL> col column_name for a30
SQL> col owner for a10
SQL> col table_name for a10
SQL> col ENDPOINT_ACTUAL_VALUE for a10
SQL> select * from dba_tab_histograms where table_name='EYGLE' and owner='EYGLE' order by 4;
OWNER TABLE_NAME COLUMN_NAME ENDPOINT_NUMBER ENDPOINT_VALUE ENDPOINT_A
---------- ---------- ------------------------------ --------------- -------------- ----------
EYGLE EYGLE OWNER 0 3.492486153566
EYGLE EYGLE OBJECT_NAME 0 3.388431933833
EYGLE EYGLE SECONDARY 0 4.049991549657
EYGLE EYGLE GENERATED 0 4.049991549657
EYGLE EYGLE TEMPORARY 0 4.049991549657
EYGLE EYGLE STATUS 0 4.478619304731
EYGLE EYGLE TIMESTAMP 0 2.555831764971
EYGLE EYGLE LAST_DDL_TIME 0 2455466.759085
EYGLE EYGLE CREATED 0 2455466.759085
EYGLE EYGLE OBJECT_TYPE 0 3.494321128346
EYGLE EYGLE DATA_OBJECT_ID 0 2
EYGLE EYGLE OBJECT_ID 0 2
EYGLE EYGLE SUBOBJECT_NAME 0 4.163573043437
EYGLE EYGLE OWNER 1 4.532981758140
EYGLE EYGLE GENERATED 1 4.621144204096
EYGLE EYGLE TEMPORARY 1 4.621144204096
EYGLE EYGLE STATUS 1 4.478619304731
EYGLE EYGLE TIMESTAMP 1 2.605922956775
EYGLE EYGLE LAST_DDL_TIME 1 2455492.879583
EYGLE EYGLE SECONDARY 1 4.049991549657
EYGLE EYGLE OBJECT_TYPE 1 4.532166435311
EYGLE EYGLE DATA_OBJECT_ID 1 12918
EYGLE EYGLE OBJECT_ID 1 12918
EYGLE EYGLE SUBOBJECT_NAME 1 4.533986999644
EYGLE EYGLE OBJECT_NAME 1 4.956504674196
EYGLE EYGLE CREATED 1 2455492.879583
26 rows selected
SQL> select * from dba_tab_histograms where table_name='EYGLE' and owner='EYGLE' order by 3;
OWNER TABLE_NAME COLUMN_NAME ENDPOINT_NUMBER ENDPOINT_VALUE ENDPOINT_A
---------- ---------- ------------------------------ --------------- -------------- ----------
EYGLE EYGLE CREATED 0 2455466.759085
EYGLE EYGLE CREATED 1 2455492.879583
EYGLE EYGLE DATA_OBJECT_ID 0 2
EYGLE EYGLE DATA_OBJECT_ID 1 12918
EYGLE EYGLE GENERATED 0 4.049991549657
EYGLE EYGLE GENERATED 1 4.621144204096
EYGLE EYGLE LAST_DDL_TIME 0 2455466.759085
EYGLE EYGLE LAST_DDL_TIME 1 2455492.879583
EYGLE EYGLE OBJECT_ID 1 12918
EYGLE EYGLE OBJECT_ID 0 2
EYGLE EYGLE OBJECT_NAME 0 3.388431933833
EYGLE EYGLE OBJECT_NAME 1 4.956504674196
EYGLE EYGLE OBJECT_TYPE 1 4.532166435311
EYGLE EYGLE OBJECT_TYPE 0 3.494321128346
EYGLE EYGLE OWNER 0 3.492486153566
EYGLE EYGLE OWNER 1 4.532981758140
EYGLE EYGLE SECONDARY 1 4.049991549657
EYGLE EYGLE SECONDARY 0 4.049991549657
EYGLE EYGLE STATUS 0 4.478619304731
EYGLE EYGLE STATUS 1 4.478619304731
OWNER TABLE_NAME COLUMN_NAME ENDPOINT_NUMBER ENDPOINT_VALUE ENDPOINT_A
---------- ---------- ------------------------------ --------------- -------------- ----------
EYGLE EYGLE SUBOBJECT_NAME 1 4.533986999644
EYGLE EYGLE SUBOBJECT_NAME 0 4.163573043437
EYGLE EYGLE TEMPORARY 0 4.049991549657
EYGLE EYGLE TEMPORARY 1 4.621144204096
EYGLE EYGLE TIMESTAMP 0 2.555831764971
EYGLE EYGLE TIMESTAMP 1 2.605922956775
26 rows selected
同时,列的低值、高值等信息会被收集记录在dba_tab_col_statistics中:
SQL> select table_name,column_name,num_distinct,low_value,high_value,DENSITY from dba_tab_col_statistics
2 where table_name='EYGLE';
TABLE_NAME COLUMN_NAME NUM_DISTINCT LOW_VALUE HIGH_VALUE DENSITY
---------- -------------------- ------------ ------------------ -------------------------- ----------
EYGLE OWNER 11 43434943 574D535953 0.09090909
EYGLE OBJECT_NAME 9096 4142425F464A31 5F75746C245F6C6E635F696E64 0.00010993
EYGLE SUBOBJECT_NAME 79 5030 575248245F5741495453545F35 0.01265822
EYGLE OBJECT_ID 11676 C103 C3021E13 8.56457690
EYGLE DATA_OBJECT_ID 2869 C103 C3021E13 0.00034855
EYGLE OBJECT_TYPE 35 434C5553544552 57494E444F572047524F5550 0.02857142
EYGLE CREATED 494 786E091A130E06 786E0A16160725 0.00202429
EYGLE LAST_DDL_TIME 502 786E091A130E06 786E0A16160725 0.00199203
EYGLE TIMESTAMP 515 313939372D30342D31 323031302D31302D32323A3231 0.00194174
EYGLE STATUS 1 56414C4944 56414C4944 1
EYGLE TEMPORARY 2 4E 59 0.5
EYGLE GENERATED 2 4E 59 0.5
EYGLE SECONDARY 1 4E 4E 1
这些基本信息在执行计划生成时会被参考,不能通过for all columns size 1的收集方式删除:
SQL> exec dbms_stats.gather_table_stats(user,'EYGLE',method_opt => 'for all columns size 1');
PL/SQL procedure successfully completed
SQL> select * from dba_tab_histograms where table_name='EYGLE' and owner='EYGLE' order by 3;
OWNER TABLE_NAME COLUMN_NAME ENDPOINT_NUMBER ENDPOINT_VALUE ENDPOINT_A
---------- ---------- -------------------- --------------- -------------- ----------
EYGLE EYGLE CREATED 0 2455466.759085
EYGLE EYGLE CREATED 1 2455492.879583
EYGLE EYGLE DATA_OBJECT_ID 0 2
EYGLE EYGLE DATA_OBJECT_ID 1 12918
EYGLE EYGLE GENERATED 0 4.049991549657
EYGLE EYGLE GENERATED 1 4.621144204096
EYGLE EYGLE LAST_DDL_TIME 0 2455466.759085
EYGLE EYGLE LAST_DDL_TIME 1 2455492.879583
EYGLE EYGLE OBJECT_ID 1 12918
EYGLE EYGLE OBJECT_ID 0 2
EYGLE EYGLE OBJECT_NAME 0 3.388431933833
EYGLE EYGLE OBJECT_NAME 1 4.956504674196
EYGLE EYGLE OBJECT_TYPE 1 4.532166435311
EYGLE EYGLE OBJECT_TYPE 0 3.494321128346
EYGLE EYGLE OWNER 0 3.492486153566
EYGLE EYGLE OWNER 1 4.532981758140
EYGLE EYGLE SECONDARY 1 4.049991549657
EYGLE EYGLE SECONDARY 0 4.049991549657
EYGLE EYGLE STATUS 0 4.478619304731
EYGLE EYGLE STATUS 1 4.478619304731
OWNER TABLE_NAME COLUMN_NAME ENDPOINT_NUMBER ENDPOINT_VALUE ENDPOINT_A
---------- ---------- -------------------- --------------- -------------- ----------
EYGLE EYGLE SUBOBJECT_NAME 1 4.533986999644
EYGLE EYGLE SUBOBJECT_NAME 0 4.163573043437
EYGLE EYGLE TEMPORARY 0 4.049991549657
EYGLE EYGLE TEMPORARY 1 4.621144204096
EYGLE EYGLE TIMESTAMP 0 2.555831764971
EYGLE EYGLE TIMESTAMP 1 2.605922956775
类似如下删除单列统计信息的语句也对基本统计信息无效:
SQL> exec dbms_stats.gather_table_stats(user,'EYGLE',method_opt=>'for columns owner size 1');
PL/SQL procedure successfully completed
但是使用delete_column_stats可以彻底删除列的柱状图信息:
SQL> exec dbms_stats.delete_column_stats(user,'EYGLE','OWNER');
PL/SQL procedure successfully completed
SQL> select * from dba_tab_histograms where table_name='EYGLE' and owner='EYGLE' order by 3;
OWNER TABLE_NAME COLUMN_NAME ENDPOINT_NUMBER ENDPOINT_VALUE ENDPOINT_A
---------- ---------- -------------------- --------------- -------------- ----------
EYGLE EYGLE CREATED 1 2455492.879583
EYGLE EYGLE CREATED 0 2455466.759085
EYGLE EYGLE DATA_OBJECT_ID 0 2
EYGLE EYGLE DATA_OBJECT_ID 1 12918
EYGLE EYGLE GENERATED 0 4.049991549657
EYGLE EYGLE GENERATED 1 4.621144204096
EYGLE EYGLE LAST_DDL_TIME 0 2455466.759085
EYGLE EYGLE LAST_DDL_TIME 1 2455492.879583
EYGLE EYGLE OBJECT_ID 0 2
EYGLE EYGLE OBJECT_ID 1 12918
EYGLE EYGLE OBJECT_NAME 0 3.388431933833
EYGLE EYGLE OBJECT_NAME 1 4.956504674196
EYGLE EYGLE OBJECT_TYPE 1 4.532166435311
EYGLE EYGLE OBJECT_TYPE 0 3.494321128346
EYGLE EYGLE SECONDARY 1 4.049991549657
EYGLE EYGLE SECONDARY 0 4.049991549657
EYGLE EYGLE STATUS 0 4.478619304731
EYGLE EYGLE STATUS 1 4.478619304731
EYGLE EYGLE SUBOBJECT_NAME 0 4.163573043437
EYGLE EYGLE SUBOBJECT_NAME 1 4.533986999644
OWNER TABLE_NAME COLUMN_NAME ENDPOINT_NUMBER ENDPOINT_VALUE ENDPOINT_A
---------- ---------- -------------------- --------------- -------------- ----------
EYGLE EYGLE TEMPORARY 0 4.049991549657
EYGLE EYGLE TEMPORARY 1 4.621144204096
EYGLE EYGLE TIMESTAMP 0 2.555831764971
EYGLE EYGLE TIMESTAMP 1 2.605922956775
24 rows selected
但是通常Oracle不建议删除列上的基本统计信息,因为这些信息在进行执行计划选择时可能极为有用,比如判断某些谓词的取值是否越界等。
看以下操作,先完整收集13列26行统计信息:
SQL> exec dbms_stats.gather_table_stats(user,'EYGLE');
PL/SQL procedure successfully completed
然后在表上执行基于OWNER的查询:
SQL> select count(*) from eygle where owner='SYS';
COUNT(*)
----------
6729
SQL> select count(*) from eygle where owner='EYGLE';
COUNT(*)
----------
240
SQL> select count(*) from eygle where owner='SYSTEM';
COUNT(*)
----------
449
再来进行缺省条件的统计信息收集:
SQL> exec dbms_stats.gather_table_stats(user,'EYGLE');
PL/SQL procedure successfully completed
此时你可能会注意到,OWNER列的柱状图已经被收集:
SQL> select * from dba_tab_histograms where table_name='EYGLE' and owner='EYGLE' order by 3;
OWNER TABLE_NAME COLUMN_NAME ENDPOINT_NUMBER ENDPOINT_VALUE ENDPOINT_A
---------- ---------- -------------------- --------------- -------------- ----------
EYGLE EYGLE CREATED 1 2455492.879583
EYGLE EYGLE CREATED 0 2455466.759085
EYGLE EYGLE DATA_OBJECT_ID 1 12918
EYGLE EYGLE DATA_OBJECT_ID 0 2
EYGLE EYGLE GENERATED 1 4.621144204096
EYGLE EYGLE GENERATED 0 4.049991549657
EYGLE EYGLE LAST_DDL_TIME 0 2455466.759085
EYGLE EYGLE LAST_DDL_TIME 1 2455492.879583
EYGLE EYGLE OBJECT_ID 1 12918
EYGLE EYGLE OBJECT_ID 0 2
EYGLE EYGLE OBJECT_NAME 1 4.956504674196
EYGLE EYGLE OBJECT_NAME 0 3.388431933833
EYGLE EYGLE OBJECT_TYPE 1 4.532166435311
EYGLE EYGLE OBJECT_TYPE 0 3.494321128346
EYGLE EYGLE OWNER 312 4.119221354213
EYGLE EYGLE OWNER 2842 4.171130061672
EYGLE EYGLE OWNER 11679 4.532981758140
EYGLE EYGLE OWNER 9571 4.327723496506
EYGLE EYGLE OWNER 11361 4.327723757311
EYGLE EYGLE OWNER 11364 4.378425024777
EYGLE EYGLE OWNER 10912 4.327723735598
EYGLE EYGLE OWNER 9 3.492486153566
EYGLE EYGLE OWNER 55 3.544214255849
EYGLE EYGLE OWNER 295 3.600792664974
EYGLE EYGLE OWNER 303 4.118597800700
EYGLE EYGLE SECONDARY 0 4.049991549657
EYGLE EYGLE SECONDARY 1 4.049991549657
EYGLE EYGLE STATUS 0 4.478619304731
EYGLE EYGLE STATUS 1 4.478619304731
EYGLE EYGLE SUBOBJECT_NAME 0 4.163573043437
EYGLE EYGLE SUBOBJECT_NAME 1 4.533986999644
EYGLE EYGLE TEMPORARY 1 4.621144204096
EYGLE EYGLE TEMPORARY 0 4.049991549657
EYGLE EYGLE TIMESTAMP 1 2.605922956775
EYGLE EYGLE TIMESTAMP 0 2.555831764971
35 rows selected
这种现象就来源于 _column_tracking_level 的特性监控,如果不希望发生这样的信息收集,则可以调整这个隐含的参数。
此时再使用delete_column_stats就可以删除这些字段的统计信息:
SQL> exec dbms_stats.delete_column_stats(user,'EYGLE','OWNER');
PL/SQL procedure successfully completed
也可以针对这个列使用如下命令清除这个字段的柱状图但是保留基本统计信息:
SQL> exec dbms_stats.gather_table_stats(user,'EYGLE',method_opt=>'for columns owner size 1');
PL/SQL procedure successfully completed
通常推荐使用'for columns owner size 1' 而不是delete_column_stats去清除列的统计信息,完全删除列的基本统计信息在某些Bug的作用下,可能会导致优化器计算的异常。
在统计信息收集时,必须注意到这些选项和后台动作,否则就可能出现和面对很多莫名其妙的问题。