在SQL tuning中,不良写法导致SQL执行效率比比皆是。最近的SQL tuning中一个外部连接写法不当导致过SQL执行时间超过15分钟左右此即

为一例。通过修正该SQL的写法之后,过高的逻辑读呈数量级下降以及SQL语句执行时间也大幅下降。下面给出一个列子来演示该情形。

一、创建演示环境

-->当前数据库版本
SQL> select * from v$version where rownum<2;

BANNER
----------------------------------------------------------------
Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - Prod

-->创建演示表并插入记录
SQL> create table t as select empno,ename,job,sal,deptno
2 from emp where 1=2;

SQL> insert into t select empno,ename,job,sal,deptno
2 from emp e where empno=(select max(empno) from emp where deptno=e.deptno);

SQL> insert into t(empno,ename,job,sal) values(8888,'ROBINSON','DBA',2000);

SQL> insert into t(empno,ename,job,sal) values(9999,'JACKSON','CLERK',2500);

SQL> commit;

SQL> analyze table t compute statistics;

SQL> select * from t;

EMPNO ENAME JOB SAL DEPTNO
---------- ---------- --------- ---------- ----------
7934 MILLER CLERK 1300 10
7902 FORD ANALYST 3000 20
7900 JAMES CLERK 950 30
8888 ROBINSON DBA 2000
9999 JACKSON CLERK 2500

-->使用left join连接查看数据,此时表t中所有记录被返回
SQL> select empno,ename,sal,dname from t left join dept d on t.deptno=d.deptno;

EMPNO ENAME SAL DNAME
---------- ---------- ---------- --------------
7934 MILLER 1300 ACCOUNTING
7902 FORD 3000 RESEARCH
7900 JAMES 950 SALES
8888 ROBINSON 2000
9999 JACKSON 2500

-->下面同样是使用left join连接,但在on子句中增加了过滤条件t.sal>=2000
-->从下面的返回结果可知,t.sal>=2000子句并没有过滤掉sal小于2000的记录
SQL> select empno,ename,sal,dname from t left join dept d -->简称语句A
2 on(t.deptno=d.deptno and t.sal>=2000);

EMPNO ENAME SAL DNAME
---------- ---------- ---------- --------------
7934 MILLER 1300
7902 FORD 3000 RESEARCH
7900 JAMES 950
8888 ROBINSON 2000
9999 JACKSON 2500

-->使用left join连接,将过滤条件放到where 子句中
-->此时仅仅t.sal>=2000且符合t.deptno=d.deptno的记录被返回(结果与所期望一致)
SQL> select empno,ename,sal,dname from t left join dept d -->简称语句B
2 on t.deptno=d.deptno where t.sal>=2000;

EMPNO ENAME SAL DNAME
---------- ---------- ---------- --------------
7902 FORD 3000 RESEARCH
8888 ROBINSON 2000
9999 JACKSON 2500

-->查看执行计划
SQL> set autotrace traceonly exp;

-->语句A(过滤条件位于on 子句中的情形)的执行计划
SQL> select empno,ename,sal,dname from t left join dept d
2 (on t.deptno=d.deptno and t.sal>=2000);

Execution Plan
----------------------------------------------------------
Plan hash value: 2195752858

-----------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 5 | 120 | 6 (0)| 00:00:01 |
| 1 | NESTED LOOPS OUTER | | 5 | 120 | 6 (0)| 00:00:01 |
| 2 | TABLE ACCESS FULL | T | 5 | 70 | 3 (0)| 00:00:01 |
| 3 | VIEW | | 1 | 10 | 1 (0)| 00:00:01 |
| 4 | TABLE ACCESS BY INDEX ROWID| DEPT | 1 | 13 | 1 (0)| 00:00:01 |
|* 5 | INDEX UNIQUE SCAN | PK_DEPT | 1 | | 0 (0)| 00:00:01 |
-----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

5 - access("T"."DEPTNO"="D"."DEPTNO") -->重点关注这里的谓词信息,两个过滤条件合在一起
filter("T"."DEPTNO" IS NOT NULL AND "T"."SAL">=2000) -->从执行计划来看位于第5步为INDEX UNIQUE SCAN

Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
11 consistent gets -->此时的逻辑读为11
0 physical reads
0 redo size
696 bytes sent via SQL*Net to client
385 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
5 rows processed

-->语句B(将谓词信息置于到where子句中的情形)的执行计划
SQL> select empno,ename,sal,dname from t left join dept d
2 on t.deptno=d.deptno where t.sal>=2000;

Execution Plan
----------------------------------------------------------
Plan hash value: 832694258

----------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 3 | 81 | 4 (0)| 00:00:01 |
| 1 | NESTED LOOPS OUTER | | 3 | 81 | 4 (0)| 00:00:01 |
|* 2 | TABLE ACCESS FULL | T | 3 | 42 | 3 (0)| 00:00:01 |
| 3 | TABLE ACCESS BY INDEX ROWID| DEPT | 1 | 13 | 1 (0)| 00:00:01 |
|* 4 | INDEX UNIQUE SCAN | PK_DEPT | 1 | | 0 (0)| 00:00:01 |
----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - filter("T"."SAL">=2000) -->此时的谓词信息分为两部分,"T"."SAL">=2000位于第二步
4 - access("T"."DEPTNO"="D"."DEPTNO"(+)) -->此条谓词信息用于实现表连接

Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
10 consistent gets -->此时的逻辑读为10,由于2 - filter("T"."SAL">=2000)过滤后,内部循环少执行了一次
0 physical reads
0 redo size
658 bytes sent via SQL*Net to client
385 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
3 rows processed

/**************************************************/
/* Author: Robinson Cheng */
/* Blog: http://blog..net/robinson_0612 */
/* MSN: robinson_0612@hotmail.com */
/* QQ: 5746311 */
/**************************************************/

-->从上面的观察中发现上述两条SQL语句执行计划并非最佳,存在改良的余地
-->由于是nested loops outer,因此考虑在表t的谓词列增加索引以快速过滤记录

SQL> create index i_t_sal on t(sal);

SQL> exec dbms_stats.gather_table_stats('SCOTT','T',cascade=>true);

-->增加索引后两个语句的执行情况

-->语句A的执行计划以及统计信息没有发生任何变化
SQL> select empno,ename,sal,dname from t left join dept d
2 on (t.deptno=d.deptno and t.sal>=2000);

Execution Plan
----------------------------------------------------------
Plan hash value: 2195752858

-----------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 5 | 140 | 6 (0)| 00:00:01 |
| 1 | NESTED LOOPS OUTER | | 5 | 140 | 6 (0)| 00:00:01 |
| 2 | TABLE ACCESS FULL | T | 5 | 90 | 3 (0)| 00:00:01 |
| 3 | VIEW | | 1 | 10 | 1 (0)| 00:00:01 |
| 4 | TABLE ACCESS BY INDEX ROWID| DEPT | 1 | 13 | 1 (0)| 00:00:01 |
|* 5 | INDEX UNIQUE SCAN | PK_DEPT | 1 | | 0 (0)| 00:00:01 |
-----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

5 - access("T"."DEPTNO"="D"."DEPTNO")
filter("T"."DEPTNO" IS NOT NULL AND "T"."SAL">=2000)

Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
11 consistent gets
0 physical reads
0 redo size
696 bytes sent via SQL*Net to client
385 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
5 rows processed

-->语句B的执行计划发生变化,原来的全表扫描变为索引扫描
SQL> select empno,ename,sal,dname from t left join dept d
2 on t.deptno=d.deptno where t.sal>=2000;

Execution Plan
----------------------------------------------------------
Plan hash value: 2452308905

----------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 3 | 93 | 3 (0)| 00:00:01 |
| 1 | NESTED LOOPS OUTER | | 3 | 93 | 3 (0)| 00:00:01 |
| 2 | TABLE ACCESS BY INDEX ROWID| T | 3 | 54 | 2 (0)| 00:00:01 |
|* 3 | INDEX RANGE SCAN | I_T_SAL | 3 | | 1 (0)| 00:00:01 |
| 4 | TABLE ACCESS BY INDEX ROWID| DEPT | 1 | 13 | 1 (0)| 00:00:01 |
|* 5 | INDEX UNIQUE SCAN | PK_DEPT | 1 | | 0 (0)| 00:00:01 |
----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

3 - access("T"."SAL">=2000)
5 - access("T"."DEPTNO"="D"."DEPTNO"(+))

Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
6 consistent gets -->逻辑读也由10下降到6
0 physical reads
0 redo size
658 bytes sent via SQL*Net to client
385 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
3 rows processed

二、总结
  1、尽可能避免SQL不良写法导致的不良后果

  2、此例中由于将谓词信息放到ON子句中,在数据量庞大的表(百万行)连接中,则该写法导致过多的物理和逻辑I/O,使得中间结果集庞大

  3、谓词信息放到ON子句中同时也导致索引失效

  4、尽可能的在满足需求的情况下减小中间结果集