分页格式1:
SELECT * FROM (SELECT A.*, ROWNUM RN FROM (SELECT * FROM TEST_DISTINCT) A WHERE ROWNUM <= 40) WHERE RN >= 21
该语句最内层SELECT * FROM TEST_DISTINCT表示不翻页遍历整张表,ROWNUM <= 40和RN >= 21控制每页查询的范围。分页的目的就是为了控制查询语句的结果集,上面语句的重点就在ROWNUM <= 40。
格式1展示的是在第二层控制结果集的最大值,在第三次控制结果集的最小值,控制分页的另外一种方式可以在最外层直接控制结果集的范围。
分页格式2:
SELECT * FROM (SELECT A.*, ROWNUM RN FROM (SELECT * FROM TEST_DISTINCT) A) WHERE RN BETWEEN 21 AND 40
对比两个语句的执行计划,发现通常情况下,格式1比格式2的效率更高。
格式1 SQL> SELECT * FROM (SELECT A.*, ROWNUM RN FROM (SELECT * FROM TEST_DISTINCT) A WHERE ROWNUM <= 40) WHERE RN >= 21 2 ; 20 rows selected. Execution Plan ---------------------------------------------------------- Plan hash value: 3629022454 ------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 40 | 1040 | 2 (0)| 00:00:01 | |* 1 | VIEW | | 40 | 1040 | 2 (0)| 00:00:01 | |* 2 | COUNT STOPKEY | | | | | | | 3 | TABLE ACCESS FULL| TEST_DISTINCT | 101K| 1287K| 2 (0)| 00:00:01 | ------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 1 - filter("RN">=21) 2 - filter(ROWNUM<=40) Note ----- - dynamic sampling used for this statement (level=2) Statistics ---------------------------------------------------------- 7 recursive calls 0 db block gets 144 consistent gets 128 physical reads 0 redo size 1012 bytes sent via SQL*Net to client 534 bytes received via SQL*Net from client 3 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 20 rows processed 格式2 SQL> SELECT * 2 FROM (SELECT A.*, ROWNUM RN FROM (SELECT * FROM TEST_DISTINCT) A) 3 WHERE RN BETWEEN 21 AND 40; 20 rows selected. Execution Plan ---------------------------------------------------------- Plan hash value: 2746576408 ------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 101K| 2574K| 44 (3)| 00:00:01 | |* 1 | VIEW | | 101K| 2574K| 44 (3)| 00:00:01 | | 2 | COUNT | | | | | | | 3 | TABLE ACCESS FULL| TEST_DISTINCT | 101K| 1287K| 44 (3)| 00:00:01 | ------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 1 - filter("RN"<=40 AND "RN">=21) Note ----- - dynamic sampling used for this statement (level=2) Statistics ---------------------------------------------------------- 0 recursive calls 0 db block gets 158 consistent gets 153 physical reads 0 redo size 1012 bytes sent via SQL*Net to client 534 bytes received via SQL*Net from client 3 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 20 rows processed
从上面的执行计划比对可以看出,因为格式1的过滤发生在第二层,则内部的结果满足条件后就停止了查询,所以返回的效率相比格式2要高出很多。
另外,简单讨论一下多表等值连接的情况。对于常见的等值表连接查询,CBO一般采用两种相对高效的连接方式NESTED LOOP和HASH JOIN(MERGE JOIN效率比HASH JOIN低,CBO一般不优先考虑)。在这里,由于使用了分页,通过指定的条件限定了结果集的最大值,NESTED LOOP的情况下,当记录数超过条件时可以马上返回结果集,而HASH JOIN必须处理完所有结果集。因此对于大多数情况的分页查询,使用NESTED LOOP作为查询的连接方式效率更高。
强制Oracle使用CBO可以通过Hint,但查看了下这样的结果执行效率反而下降
SELECT /*+ first_rows */ * FROM (SELECT A.*, ROWNUM RN FROM (SELECT * FROM TEST_DISTINCT) A WHERE ROWNUM <= 40) WHERE RN >= 21 Execution Plan ---------------------------------------------------------- Plan hash value: 3629022454 ------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 40 | 1040 | 44 (3)| 00:00:01 | |* 1 | VIEW | | 40 | 1040 | 44 (3)| 00:00:01 | |* 2 | COUNT STOPKEY | | | | | | | 3 | TABLE ACCESS FULL| TEST_DISTINCT | 101K| 1287K| 44 (3)| 00:00:01 | -------------------------------------------------------------------------------------
创建一个大表用于模拟生产环境测试
SQL> create table t as select * from dba_objects,dba_sequences; SQL> select count(*) from t; COUNT(*) ---------- 16755730 SQL> SET AUTOT ON SQL> COL OBJECT_NAME FORMAT A30 SQL> exec dbms_stats.gather_table_stats('SYS','T');
对于格式2,可以看到执行计划消耗非常高。
Execution Plan ---------------------------------------------------------- Plan hash value: 1961128543 ---------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ---------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 16M| 1470M| 99913 (1)| 00:19:59 | |* 1 | VIEW | | 16M| 1470M| 99913 (1)| 00:19:59 | | 2 | COUNT | | | | | | | 3 | TABLE ACCESS FULL| T | 16M| 479M| 99913 (1)| 00:19:59 | ---------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 1 - filter("RN"<=20 AND "RN">=11) Statistics ---------------------------------------------------------- 1 recursive calls 0 db block gets 368199 consistent gets 368191 physical reads 0 redo size 806 bytes sent via SQL*Net to client 523 bytes received via SQL*Net from client 2 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 10 rows processed
而在格式1,效率有明显的提升,一共只发生了96次物理读操作。
select object_id, object_name from (select rownum rn, object_id, object_name from (select object_id, object_name from t) where rownum <= 20) where rn >= 11; Execution Plan ---------------------------------------------------------- Plan hash value: 3593519476 ---------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ---------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 20 | 1840 | 2 (0)| 00:00:01 | |* 1 | VIEW | | 20 | 1840 | 2 (0)| 00:00:01 | |* 2 | COUNT STOPKEY | | | | | | | 3 | TABLE ACCESS FULL| T | 20 | 600 | 2 (0)| 00:00:01 | ---------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 1 - filter("RN">=11) 2 - filter(ROWNUM<=20) Statistics ---------------------------------------------------------- 1 recursive calls 0 db block gets 5 consistent gets 96 physical reads 0 redo size 806 bytes sent via SQL*Net to client 523 bytes received via SQL*Net from client 2 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 10 rows processed
但是格式1的分页如果过于靠后,几乎接近于表的所有行时,其效率与格式2也没有多大的差距了
select object_id, object_name from (select rownum rn, object_id, object_name from (select object_id, object_name from t) where rownum <= 16000011 ) where rn >= 16000000; Execution Plan ---------------------------------------------------------- Plan hash value: 3593519476 ---------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ---------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 16M| 1403M| 95407 (1)| 00:19:05 | |* 1 | VIEW | | 16M| 1403M| 95407 (1)| 00:19:05 | |* 2 | COUNT STOPKEY | | | | | | | 3 | TABLE ACCESS FULL| T | 16M| 457M| 95407 (1)| 00:19:05 | ---------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 1 - filter("RN">=16000000) 2 - filter(ROWNUM<=16000011) Statistics ---------------------------------------------------------- 1 recursive calls 0 db block gets 351909 consistent gets 352128 physical reads 0 redo size 953 bytes sent via SQL*Net to client 523 bytes received via SQL*Net from client 2 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 12 rows processed