因为优化器还不够强大,还有很多限制,或者因为一些逻辑原因,分析认为SQL要走索引比较好,但是事实却无法正确利用索引。这时候,除了给ORACLE需要的统计信息之外,写的SQL必须要能够给优化器足够多的额外有效信息,让优化器能够选择更好的执行计划。
要让给优化器正确使用上需要的索引,要考虑两点:
1).如何避免优化器的限制 2).根据业务数据特点改写SQL语句
说明:这里说的走不了索引,是指走不了正常的RANGE SCAN,非(FAST) FULL INDEX SCAN。
1.谓词使用不等于(<>,!=),目前的优化器一般认为不等于的选择性不好,会找到很多行,所以用不上索引,就算强制索引,也是扫描全部的索引,走不了RANGE SCAN
如果事实走索引定位较好,那么解决方法:
1)如果不等条件之外的值不多,而且是确认的,可以改为等值或IN查询,比如status状态字段一般值类别很少
2)如果不等条件之外的值很多,可以改为> OR <的形式,当然第2种方法包含了方法1.
例如:
DROP TABLE t;
CREATE TABLE t AS SELECT LEVEL ID,rpad('x',10,'x') padding,CAST('JACK' AS VARCHAR2(100)) NAME
FROM dual
CONNECT BY LEVEL<100000;
--构造3行较少的值
INSERT INTO t VALUES(100000,'y','DINGJUN1');
INSERT INTO t VALUES(100001,'y','DINGJUN2');
INSERT INTO t VALUES(100002,'y','DINGJUN3');
COMMIT;
ALTER TABLE t MODIFY NAME NOT NULL;
--创建索引
CREATE INDEX idx_t ON t(NAME);
BEGIN
dbms_stats.gather_table_stats(ownname => USER,tabname => 'T',estimate_percent => 100,method_opt => 'for columns name size 10',cascade => TRUE);
END;
/
|
<>无法利用索引:
dingjun123@ORADB> SELECT * FROM t
2 WHERE t.name<>'JACK';
3 rows selected.
Elapsed: 00:00:00.01
Execution Plan
----------------------------------------------------------
Plan hash value: 1601196873
--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 4 | 88 | 107 (2)| 00:00:02 |
|* 1 | TABLE ACCESS FULL| T | 4 | 88 | 107 (2)| 00:00:02 |
--------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("T"."NAME"<>'JACK')
|
将<>改写为OR连接后,能够正确使用索引,走OR扩展:
dingjun123@ORADB> SELECT * FROM t
2 WHERE t.NAME>'JACK' OR t.NAME <'JACK';
3 rows selected.
Elapsed: 00:00:00.06
Execution Plan
----------------------------------------------------------
Plan hash value: 2945726203
--------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 4 | 88 | 6 (0)| 00:00:01 |
| 1 | CONCATENATION | | | | | |
| 2 | TABLE ACCESS BY INDEX ROWID| T | 1 | 22 | 3 (0)| 00:00:01 |
|* 3 | INDEX RANGE SCAN | IDX_T | 1 | | 2 (0)| 00:00:01 |
| 4 | TABLE ACCESS BY INDEX ROWID| T | 3 | 66 | 3 (0)| 00:00:01 |
|* 5 | INDEX RANGE SCAN | IDX_T | 4 | | 2 (0)| 00:00:01 |
--------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
3 - access("T"."NAME">'JACK')
5 - access("T"."NAME"<'JACK')
filter(LNNVL("T"."NAME">'JACK'))
|
如果业务允许,改为写下列语句也是走索引的,不再演示。
SELECT * FROM t
WHERE t.NAME IN ('DINGJUN1','DINGJUN2','DINGJUN3');
NOT IN,NOT EXISTS与<>类似,也是走不了索引的(注意,这里说走不了索引,只针对于不等连接谓词,不包括其他谓词)
演示NOT IN:
DROP TABLE t1;
CREATE TABLE t1 AS SELECT 1 ID,'JACK' NAME FROM dual;
ALTER TABLE t1 MODIFY NAME NOT NULL;
BEGIN
dbms_stats.gather_table_stats(ownname => USER,tabname => 'T1',estimate_percent => 100);
END;
/
|
NOT IN也不能使用索引,强制使用只能扫描全部索引:
dingjun123@ORADB> SELECT * FROM t
2 WHERE t.NAME NOT IN(SELECT t1.NAME FROM t1);
Elapsed: 00:00:00.00
Execution Plan
----------------------------------------------------------
Plan hash value: 4253491563
-----------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 100K| 2636K| 110 (2)| 00:00:02 |
|* 1 | HASH JOIN RIGHT ANTI| | 100K| 2636K| 110 (2)| 00:00:02 |
| 2 | TABLE ACCESS FULL | T1 | 1 | 5 | 3 (0)| 00:00:01 |
| 3 | TABLE ACCESS FULL | T | 100K| 2148K| 106 (1)| 00:00:02 |
-----------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - access("T"."NAME"="T1"."NAME")
dingjun123@ORADB> SELECT/*+index(t idx_t)*/ * FROM t
2 WHERE t.NAME NOT IN(SELECT t1.NAME FROM t1);
Elapsed: 00:00:00.03
Execution Plan
----------------------------------------------------------
Plan hash value: 1707532828
--------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 100K| 2636K| 591 (1)| 00:00:08 |
| 1 | MERGE JOIN ANTI | | 100K| 2636K| 591 (1)| 00:00:08 |
| 2 | TABLE ACCESS BY INDEX ROWID| T | 100K| 2148K| 587 (1)| 00:00:08 |
| 3 | INDEX FULL SCAN | IDX_T | 100K| | 225 (1)| 00:00:03 |
|* 4 | SORT UNIQUE | | 1 | 5 | 4 (25)| 00:00:01 |
| 5 | TABLE ACCESS FULL | T1 | 1 | 5 | 3 (0)| 00:00:01 |
--------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
4 - access("T"."NAME"="T1"."NAME")
filter("T"."NAME"="T1"."NAME")
|
NOT EXISTS与NOT IN类似(注意,NOT IN已经处理过NULL的影响)
SELECT * FROM t
WHERE NOT EXISTS(SELECT 1 FROM t1 WHERE t.NAME=t1.NAME);
2.由于B*Tree索引不存储全为NULL的行,造成无法走索引的情况
解决方法:
1)视业务要求是否可以不统计NULL的
加条件where xxx IS NOT NULL, COUNT(列).....等等过滤NULL的语句
2)复合列索引
3)函数索引
详细可以参考【Oracle index】B*Tree索引之NULL统计技巧 这个帖子里的内容。
3.LIKE前通配查询询,LIKE全通配查询走不了索引
解决方法:
1)是否可以根据业务需求把前通配去掉
dingjun123@ORADB> SELECT * FROM t WHERE t.NAME LIKE '%DINGJUN%';
Elapsed: 00:00:00.04
Execution Plan
----------------------------------------------------------
Plan hash value: 1601196873
--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 5000 | 107K| 107 (2)| 00:00:02 |
|* 1 | TABLE ACCESS FULL| T | 5000 | 107K| 107 (2)| 00:00:02 |
--------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("T"."NAME" LIKE '%DINGJUN%')
--改为后通配,走索引
dingjun123@ORADB> SELECT * FROM t WHERE t.NAME LIKE 'DINGJUN%';
Elapsed: 00:00:00.00
Execution Plan
----------------------------------------------------------
Plan hash value: 1594971208
-------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 3 | 66 | 3 (0)| 00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID| T | 3 | 66 | 3 (0)| 00:00:01 |
|* 2 | INDEX RANGE SCAN | IDX_T | 3 | | 2 (0)| 00:00:01 |
-------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("T"."NAME" LIKE 'DINGJUN%')
filter("T"."NAME" LIKE 'DINGJUN%')
|
2) 是否和此LIKE一样的前通配或全通配的SQL有很多,此谓词的LIKE变化不大,如果是,考虑建立函数索引,否则对于全通配问题最好办法就是全文索引
dingjun123@ORADB> CREATE INDEX idx1_t ON t (instr(NAME,'DINGJUN'));
Index created.
Elapsed: 00:00:00.16dingjun123@ORADB> SELECT * FROM t WHERE instr(t.NAME,'DINGJUN')>0;
Elapsed: 00:00:00.00
Execution Plan
----------------------------------------------------------
Plan hash value: 2071967826
--------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 5000 | 102K| 7 (0)| 00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID| T | 5000 | 102K| 7 (0)| 00:00:01 |
|* 2 | INDEX RANGE SCAN | IDX1_T | 900 | | 3 (0)| 00:00:01 |
--------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access(INSTR("NAME",'DINGJUN')>0)
|
3)如果只是前通配,可以使用reverse函数索引(不是翻转键索引)
SELECT * FROM t WHERE t.NAME LIKE '%DINGJUN1';
CREATE INDEX idx2_t ON t(REVERSE(NAME));
语句要改写,使用reverse函数谓词,并且查找值倒查:
dingjun123@ORADB> SELECT * FROM t WHERE REVERSE(t.NAME) LIKE '1NUJGNID%';
Elapsed: 00:00:00.00
Execution Plan
----------------------------------------------------------
Plan hash value: 3787301248
--------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 5000 | 102K| 8 (0)| 00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID| T | 5000 | 102K| 8 (0)| 00:00:01 |
|* 2 | INDEX RANGE SCAN | IDX2_T | 900 | | 4 (0)| 00:00:01 |
--------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access(REVERSE("NAME") LIKE '1NUJGNID%')
filter(REVERSE("NAME") LIKE '1NUJGNID%')
|
注意:如果查找的是中文,得注意,不可直接写 REVERSE(t.NAME) LIKE '1NUJGNID%',因为REVERSE内部会按字节翻转,中文的写法可以用REVERSE转换,比如
SELECT * FROM t WHERE REVERSE(t.name) LIKE REVERSE('数据')||'%';否则查询出的数据不对。
4.对索引列使用了函数,数学运算,其他表达式等
解决方法:去掉对索引列的相关运算,保持索引列纯净。
优化器目前对一些数学运算,还无法做很好的消除动作,所以对于索引列应该尽量保持纯净,否则可能无法用上正确的索引
dingjun123@ORADB> CREATE INDEX idx3_t ON t(ID);
Index created.
Elapsed: 00:00:00.13
dingjun123@ORADB> ALTER TABLE T MODIFY ID NOT NULL;
Table altered.
Elapsed: 00:00:00.10
dingjun123@ORADB> SELECT * FROM t WHERE ID+0=1;
Elapsed: 00:00:00.00
Execution Plan
----------------------------------------------------------
Plan hash value: 1601196873
--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1000 | 21000 | 107 (2)| 00:00:02 |
|* 1 | TABLE ACCESS FULL| T | 1000 | 21000 | 107 (2)| 00:00:02 |
--------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("ID"+0=1)
dingjun123@ORADB> SELECT * FROM t WHERE ID=1;
Elapsed: 00:00:00.01
Execution Plan
----------------------------------------------------------
Plan hash value: 2351669764
--------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1000 | 21000 | 3 (0)| 00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID| T | 1000 | 21000 | 3 (0)| 00:00:01 |
|* 2 | INDEX RANGE SCAN | IDX3_T | 400 | | 1 (0)| 00:00:01 |
--------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("ID"=1)
|
举例只是举简单的数学运算,可能运算还和其他列运算,比如where ID+ext_col...
5.ORACLE使用了隐式类型转换,导致索引无法使用
解决方法:
必须避免隐式类型转换,全部要求显式类型转换(非索引列),且避免对索引列进行类型转换(有函数索引除外)
如果类型不一致,不管是否发生自动类型转换,谓词的右值应该显式转换为与索引列保持一致(对于非索引列的运算也应该如此)。
dingjun123@ORADB> DROP TABLE t1;
Table dropped.
Elapsed: 00:00:00.03
dingjun123@ORADB> CREATE TABLE t1(x VARCHAR2(100));
Table created.
Elapsed: 00:00:00.04
dingjun123@ORADB> CREATE INDEX idx_t1 ON t1(x);
Index created.
Elapsed: 00:00:00.04
dingjun123@ORADB> SELECT * FROM t1 WHERE x = 1;
Elapsed: 00:00:00.00
Execution Plan
----------------------------------------------------------
Plan hash value: 3617692013
--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 52 | 2 (0)| 00:00:01 |
|* 1 | TABLE ACCESS FULL| T1 | 1 | 52 | 2 (0)| 00:00:01 |
--------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter(TO_NUMBER("X")=1)
dingjun123@ORADB> SELECT * FROM t1 WHERE x = '1';
Elapsed: 00:00:00.00
Execution Plan
----------------------------------------------------------
Plan hash value: 1369807930
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 52 | 1 (0)| 00:00:01 |
|* 1 | INDEX RANGE SCAN| IDX_T1 | 1 | 52 | 1 (0)| 00:00:01 |
---------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - access("X"='1')
|
发生类型转换,可以通过执行计划中的谓词信息获知。通过分析发现,X因为是VARCHAR2,优先级比数值类型低,遇到数值类型,会TO_NUMBER,所以索引失效,通过传入与索引列类型一致的字符串后,解决。
6.对于其他不走索引的,尽量通过业务的逻辑特点改写语句,可能正确利用索引
例如一个典型的例子:
DROP TABLE t_objects;
CREATE TABLE t_objects
AS
SELECT * FROM dba_objects;
CREATE INDEX idx_t_objects ON t_objects(last_ddl_time,created);
BEGIN
dbms_stats.gather_table_stats(ownname => USER,tabname => 't_objects',estimate_percent => 100,cascade => TRUE);
END;
/
|
需求:查找创建时间是2013年的,并且最后ddl时间比创建时间大1天以上。
dingjun123@ORADB> SELECT * FROM
2 t_objects t
3 WHERE t.last_ddl_time-t.created>1
4 AND t.created>=DATE'2013-1-1';
Elapsed: 00:00:00.00
Execution Plan
----------------------------------------------------------
Plan hash value: 3629755566
-------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1171 | 110K| 300 (1)| 00:00:04 |
|* 1 | TABLE ACCESS FULL| T_OBJECTS | 1171 | 110K| 300 (1)| 00:00:04 |
-------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("T"."CREATED">=TO_DATE(' 2013-01-01 00:00:00',
'syyyy-mm-dd hh24:mi:ss') AND "T"."LAST_DDL_TIME"-"T"."CREATED">1)
|
这个索引是组合索引,上面的语句对前导列进行了运行,也不符合走index skip scan的条件,所以,走FULL TABLE SCAN。那么是否可以通过逻辑改写走索引呢,通过第4点得知,要保持索引列清净,将create_date移到右边,语句如下:
dingjun123@ORADB> SELECT * FROM
2 t_objects t
3 WHERE t.last_ddl_time>=(t.created+1)
4 AND t.created>=DATE'2013-1-1';
Elapsed: 00:00:00.00
Execution Plan
----------------------------------------------------------
Plan hash value: 3629755566
-------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 403 | 39091 | 301 (1)| 00:00:04 |
|* 1 | TABLE ACCESS FULL| T_OBJECTS | 403 | 39091 | 301 (1)| 00:00:04 |
-------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("T"."CREATED">=TO_DATE(' 2013-01-01 00:00:00',
'syyyy-mm-dd hh24:mi:ss') AND "T"."LAST_DDL_TIME">=INTERNAL_FUNCTION("T"
."CREATED")+1)
|
通过改写后发现,还是没有走索引,因为ORACLE认为前导列右边的created不固定,无法从指定索引处查找。通过这个分析得知,ORACLE谓词传递有一定限制,create_date+1无法做谓词传递给last_ddl_time,改写:
dingjun123@ORADB> SELECT * FROM
2 t_objects t
3 WHERE t.last_ddl_time>=(t.created)
4 AND t.last_ddl_time>=(t.created+1)
5 AND t.created>=DATE'2013-1-1';
Elapsed: 00:00:00.00
Execution Plan
----------------------------------------------------------
Plan hash value: 641904483
---------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 2 | 194 | 11 (0)| 00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID| T_OBJECTS | 2 | 194 | 11 (0)| 00:00:01 |
|* 2 | INDEX RANGE SCAN | IDX_T_OBJECTS | 6 | | 10 (0)| 00:00:01 |
---------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("T"."LAST_DDL_TIME">=TO_DATE(' 2013-01-01 00:00:00', 'syyyy-mm-dd
hh24:mi:ss') AND "T"."CREATED">=TO_DATE(' 2013-01-01 00:00:00', 'syyyy-mm-dd
hh24:mi:ss') AND "T"."LAST_DDL_TIME" IS NOT NULL)
filter("T"."CREATED">=TO_DATE(' 2013-01-01 00:00:00', 'syyyy-mm-dd
hh24:mi:ss') AND "T"."LAST_DDL_TIME">="T"."CREATED" AND
"T"."LAST_DDL_TIME">=INTERNAL_FUNCTION("T"."CREATED")+1)
|
上面的oracle知道谓词传递给last_ddl_time了,T"."LAST_DDL_TIME">=TO_DATE(' 2013-01-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss'),也可以手动谓词传递,last_ddl_time肯定大于等于DATE'2013-1-2':
dingjun123@ORADB> SELECT * FROM
2 t_objects t
3 WHERE t.last_ddl_time>=DATE'2013-1-2'
4 AND t.last_ddl_time>=(t.created+1)
5 AND t.created>=DATE'2013-1-1';
Elapsed: 00:00:00.00
Execution Plan
----------------------------------------------------------
Plan hash value: 641904483
---------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 13 | 1261 | 11 (0)| 00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID| T_OBJECTS | 13 | 1261 | 11 (0)| 00:00:01 |
|* 2 | INDEX RANGE SCAN | IDX_T_OBJECTS | 13 | | 10 (0)| 00:00:01 |
---------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("T"."LAST_DDL_TIME">=TO_DATE(' 2013-01-02 00:00:00', 'syyyy-mm-dd
hh24:mi:ss') AND "T"."CREATED">=TO_DATE(' 2013-01-01 00:00:00', 'syyyy-mm-dd
hh24:mi:ss') AND "T"."LAST_DDL_TIME" IS NOT NULL)
filter("T"."CREATED">=TO_DATE(' 2013-01-01 00:00:00', 'syyyy-mm-dd
hh24:mi:ss') AND "T"."LAST_DDL_TIME">=INTERNAL_FUNCTION("T"."CREATED")+1)
|
当然,对于两个都是范围的查询,这里只能通过一个列来轮询索引,先做access,再做filter。SQL语句的逻辑改写很重要,往往通过逻辑改写就能改变SQL的执行计划,从不好的计划到好的计划,比如semi join,anti join与or,往往走FILTER导致执行计划较差,这时候就需要通过逻辑等价改写,逻辑等价改写往往需要掌握一些集合的知识,比如NOT (A AND B)===NOT A OR NOT B,NOT (A OR B)===NOT A AND NOT B等。
小小菜鸟一枚