- 对表进行分析,通常情况下可以对表,索引,列进行单独分析,或者进行组合分析,但这三者哪些是相对重要的,哪些分析显得不那么重要?通过本篇文章的实验相信大家也会对直方图有更一步的了解.
- 1.首先创建测试表,并插入100000条数据
- SQL> create table test(id number,nick varchar2(30));
- Table created.
- SQL> begin
- 2 for i in 1..100000 loop
- 3 insert into test(id) values(i);
- 4 end loop;
- 5 commit;
- 6 end;
- 7 /
- PL/SQL procedure successfully completed.
- 更新nick字段,使数据发生严重倾斜
- SQL> update test set nick='abc' where rownum<99999;
- 99998 rows updated.
- SQL> commit;
- Commit complete.
- SQL> create index idx_test_nick on test(nick);
- Index created.
- SQL> update test set nick='def' where nick is null;
- 2 rows updated.
- SQL> commit;
- Commit complete.
- --只对索引进行分析
- SQL> analyze index idx_test_nick compute statistics;
- Index analyzed.
- SQL> select index_name,LEAF_BLOCKS,DISTINCT_KEYS,NUM_ROWS from user_indexes where index_name='IDX_TEST_NICK';
- INDEX_NAME LEAF_BLOCKS DISTINCT_KEYS NUM_ROWS
- ------------------------------ ----------- ------------- ----------
- IDX_TEST_NICK 210 2 100000
- SQL> select COLUMN_NAME,NUM_BUCKETS,num_distinct from USER_tab_columns where table_name='TEST';
- COLUMN_NAME NUM_BUCKETS NUM_DISTINCT
- ------------------------------ ----------- ------------
- ID
- NICK
- 查看只分析索引的情况下的执行计划,从执行计划可以看出,优化器选择RBO,都走索引
- SQL> set autotrace trace exp
- SQL> select * from test where nick ='abc';
- Execution Plan
- ----------------------------------------------------------
- 0 SELECT STATEMENT Optimizer=CHOOSE
- 1 0 TABLE ACCESS (BY INDEX ROWID) OF 'TEST'
- 2 1 INDEX (RANGE SCAN) OF 'IDX_TEST_NICK' (NON-UNIQUE)
- SQL> select * from test where nick ='def';
- Execution Plan
- ----------------------------------------------------------
- 0 SELECT STATEMENT Optimizer=CHOOSE
- 1 0 TABLE ACCESS (BY INDEX ROWID) OF 'TEST'
- 2 1 INDEX (RANGE SCAN) OF 'IDX_TEST_NICK' (NON-UNIQUE)
- 分析有数据严重倾斜的nick列后,也没有产生正确的执行计划,此时使用的优化器仍然是RBO
- 通过此实验,说明只分析索引和列,ORACLE不会使用CBO的优化器
- SQL> analyze table test compute statistics for columns size 2 nick;
- Table analyzed.
- SQL> select * from test where nick ='abc';
- Execution Plan
- ----------------------------------------------------------
- 0 SELECT STATEMENT Optimizer=CHOOSE
- 1 0 TABLE ACCESS (BY INDEX ROWID) OF 'TEST'
- 2 1 INDEX (RANGE SCAN) OF 'IDX_TEST_NICK' (NON-UNIQUE)
- SQL> select * from test where nick ='def';
- Execution Plan
- ----------------------------------------------------------
- 0 SELECT STATEMENT Optimizer=CHOOSE
- 1 0 TABLE ACCESS (BY INDEX ROWID) OF 'TEST'
- 2 1 INDEX (RANGE SCAN) OF 'IDX_TEST_NICK' (NON-UNIQUE)
- 现在分析表后,产生了正确的执行计划
- SQL> analyze table test compute statistics for table;
- Table analyzed.
- SQL> select * from test where nick ='abc';
- Execution Plan
- ----------------------------------------------------------
- 0 SELECT STATEMENT Optimizer=CHOOSE (Cost=50 Card=99998 Bytes=
- 1499970)
- 1 0 TABLE ACCESS (FULL) OF 'TEST' (Cost=50 Card=99998 Bytes=14
- 99970)
- SQL> select * from test where nick ='def';
- Execution Plan
- ----------------------------------------------------------
- 0 SELECT STATEMENT Optimizer=CHOOSE (Cost=1 Card=2 Bytes=30)
- 1 0 TABLE ACCESS (BY INDEX ROWID) OF 'TEST' (Cost=1 Card=2 Byt
- es=30)
- 2 1 INDEX (RANGE SCAN) OF 'IDX_TEST_NICK' (NON-UNIQUE) (Cost
- =1 Card=2)
- 删除所有的统计数据,并只对表与列进行分析,不分析索引,ORACLE使用CBO的优化器,并产生了正确的执行计划
- SQL> analyze table test delete statistics;
- Table analyzed.
- SQL> analyze table test compute statistics for table for columns size 2 nick;
- Table analyzed.
- SQL> select * from test where nick ='abc';
- Execution Plan
- ----------------------------------------------------------
- 0 SELECT STATEMENT Optimizer=CHOOSE (Cost=50 Card=99998 Bytes=
- 1499970)
- 1 0 TABLE ACCESS (FULL) OF 'TEST' (Cost=50 Card=99998 Bytes=14
- 99970)
- SQL> select * from test where nick ='def';
- Execution Plan
- ----------------------------------------------------------
- 0 SELECT STATEMENT Optimizer=CHOOSE (Cost=1 Card=2 Bytes=30)
- 1 0 TABLE ACCESS (BY INDEX ROWID) OF 'TEST' (Cost=1 Card=2 Byt
- es=30)
- 2 1 INDEX (RANGE SCAN) OF 'IDX_TEST_NICK' (NON-UNIQUE) (Cost
- =1 Card=2)
- 创建TEST表ID列上的索引,但不对索引进行分析
- SQL> create index idx_test_id on test(id);
- Index created.
- SQL> select * from test where id=1;
- Execution Plan
- ----------------------------------------------------------
- 0 SELECT STATEMENT Optimizer=CHOOSE (Cost=1 Card=1000 Bytes=15
- 000)
- 1 0 TABLE ACCESS (BY INDEX ROWID) OF 'TEST' (Cost=1 Card=1000
- Bytes=15000)
- 2 1 INDEX (RANGE SCAN) OF 'IDX_TEST_ID' (NON-UNIQUE) (Cost=1
- Card=400)
- 当条件中即有id,又有nick时,因为nick上有直方图,ORACLE知道nick='abc'的值特别的多,所以不走IDX_TEST_NICK索引,走IDX_TEST_ID上的索引
- SQL> select * from test where id=5 and nick='abc';
- Execution Plan
- ----------------------------------------------------------
- 0 SELECT STATEMENT Optimizer=CHOOSE (Cost=1 Card=1000 Bytes=15
- 000)
- 1 0 TABLE ACCESS (BY INDEX ROWID) OF 'TEST' (Cost=1 Card=1000
- Bytes=15000)
- 2 1 INDEX (RANGE SCAN) OF 'IDX_TEST_ID' (NON-UNIQUE) (Cost=1
- Card=400)
- 当条件中即有id,又有nick时,因为nick上有直方图,ORACLE知道nick='def'的值特别的少,所以走IDX_TEST_NICK上的索引,不走IDX_TEST_ID索引
- SQL> select * from test where id=5 and nick='def';
- Execution Plan
- ----------------------------------------------------------
- 0 SELECT STATEMENT Optimizer=CHOOSE (Cost=1 Card=1 Bytes=15)
- 1 0 TABLE ACCESS (BY INDEX ROWID) OF 'TEST' (Cost=1 Card=1 Byt
- es=15)
- 2 1 INDEX (RANGE SCAN) OF 'IDX_TEST_NICK' (NON-UNIQUE) (Cost
- =1 Card=2)
- 在分析ID列后,ORACLE发现ID列的选择度更高,所以不再选择IDX_TEST_NICK索引,而是选择IDX_TEST_ID
- SQL> analyze table test compute statistics for columns size 1 id;
- Table analyzed.
- SQL> select * from test where id=5 and nick='def';
- Execution Plan
- ----------------------------------------------------------
- 0 SELECT STATEMENT Optimizer=CHOOSE (Cost=1 Card=1 Bytes=7)
- 1 0 TABLE ACCESS (BY INDEX ROWID) OF 'TEST' (Cost=1 Card=1 Byt
- es=7)
- 2 1 INDEX (RANGE SCAN) OF 'IDX_TEST_ID' (NON-UNIQUE) (Cost=1
- Card=1)
- 下面来看另外一种情况,我们删除所有的统计数据,然后在ID列上创建唯一索引,在此条件下,只分析表与分析列nick,我们看到ORACLE走了正确的执行计划,走了UK_TEST_ID,其实从这里也给我们带来很多的启示:在主键与唯一键约束的列上是否需要直方图的问题?如果在这些列上有像这样的查询where id > 100 and id < 1000,我们还是需要有直方图的,但除此之外,好像真的没有直方图的必要了!
- SQL> analyze table test delete statistics;
- Table analyzed.
- SQL> drop index idx_test_id;
- Index dropped.
- SQL> create unique index uk_test_id on test(id);
- Index created.
- SQL> analyze table test compute statistics for table for columns size 2 nick;
- Table analyzed.
- SQL> select * from test where id=5 and nick='def';
- Execution Plan
- ----------------------------------------------------------
- 0 SELECT STATEMENT Optimizer=CHOOSE (Cost=1 Card=1 Bytes=15)
- 1 0 TABLE ACCESS (BY INDEX ROWID) OF 'TEST' (Cost=1 Card=1 Byt
- es=15)
- 2 1 INDEX (UNIQUE SCAN) OF 'UK_TEST_ID' (UNIQUE) (Cost=1 Car
- d=1)
- 从以上一系列的实验可以看出,对ORACLE的优化器CBO来说,表的分析与列的分析才是最重要的,索引的分析次之。还有我们可以考虑我们的哪些列上需要直方图,对于bucket的个数问题,oracle的默认值是75个,所以根据你的应用规则,选择合适的桶数对性能也是有帮助的。因为不必要的桶的个数的大量增加,必然会带来SQL语句硬解析时产生执行计划的复杂度问题。