##4. Indexes

###4.1 Definition

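An index is an optional structure associated with a table. Creating indexes can improve the performance of data retrieval and of the row lookups that updates depend on, at the price of extra maintenance work whenever indexed columns change. An Oracle index provides a direct access path to the rows of a table.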

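An index can be created on one or more columns of a table. Once created, it is maintained and used automatically by the Oracle server. Changes to table data (adding rows, updating rows, or deleting rows) are propagated automatically to all relevant indexes, and this is completely transparent to users.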
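
To make this transparency concrete, here is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for an Oracle database. That substitution is an assumption made only so the example runs anywhere; SQLite's optimizer and storage differ from Oracle's. The table and index names mirror the emp3 example used later in this section.

```python
import sqlite3

# SQLite in-memory database used as a stand-in for Oracle (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table emp3 (emp3_id integer primary key, emp3_name text, dep_id integer)"
)
conn.execute("create index emp3_name_ix on emp3 (emp3_name)")

# Ordinary DML: no statement mentions the index, yet it is kept in sync.
conn.executemany(
    "insert into emp3 values (?, ?, ?)",
    [(i, f"qa{i}", 2) for i in range(1, 1001)],
)
conn.execute("update emp3 set emp3_name = 'dev7' where emp3_id = 7")
conn.execute("delete from emp3 where emp3_id = 8")


def plan(sql):
    """Return the access path SQLite reports for the given statement."""
    rows = conn.execute("explain query plan " + sql).fetchall()
    return " / ".join(row[3] for row in rows)


def ids_named(value):
    """Return the ids of all rows whose emp3_name equals value."""
    return conn.execute(
        "select emp3_id from emp3 where emp3_name = ?", (value,)
    ).fetchall()


lookup_plan = plan("select * from emp3 where emp3_name = 'dev7'")
print(lookup_plan)                           # the plan names emp3_name_ix
print(ids_named("dev7"))                     # updated row, found under its new key
print(ids_named("qa7"), ids_named("qa8"))    # old key and deleted row are gone
```

The queries never reference the index by name; the database decides on its own to use it and keeps it consistent with every insert, update, and delete.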

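Indexes also improve performance when primary key and unique key constraints are enforced. Without an index, every DML operation that has to check such a constraint would need to scan the whole table (a full table scan).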

###4.2 Types

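Several types of index structure are available and can be chosen according to need. The two most commonly used are B-tree and bitmap indexes; this section covers B-tree indexes: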

####4.2.1 B-tree indexes

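The default index type. A B-tree index stores its key values in a balanced tree (B-tree), so a key can be located with a fast, binary-search-style descent instead of an examination of every row.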
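
The benefit of searching sorted keys can be illustrated with a short sketch. It models the index as a plain sorted Python list, which is a simplification: a real B-tree block holds many keys, so the tree is much shallower than a strictly binary search would suggest. The function names are hypothetical.

```python
from bisect import bisect_left


def index_lookup(sorted_keys, key):
    """Return the position of key in sorted_keys, or None if it is absent."""
    pos = bisect_left(sorted_keys, key)
    if pos < len(sorted_keys) and sorted_keys[pos] == key:
        return pos
    return None


def count_probes(sorted_keys, key):
    """Count how many keys a hand-written binary search inspects."""
    lo, hi, probes = 0, len(sorted_keys), 0
    while lo < hi:
        mid = (lo + hi) // 2
        probes += 1
        if sorted_keys[mid] < key:
            lo = mid + 1
        else:
            hi = mid
    return probes


keys = list(range(0, 200_000, 2))        # 100,000 sorted key values
print(index_lookup(keys, 66_666))        # position of an existing key
print(index_lookup(keys, 7))             # None: the key is not indexed
print(count_probes(keys, 66_666))        # at most 17 probes for 100,000 keys
```

A linear scan of the same list could need up to 100,000 comparisons, which is the list-level analogue of a full table scan.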


Structure of a B-tree index

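The top level of the index is the root, which contains entries that point to the next level of the index. The next level consists of branch blocks, which in turn point to blocks at the level below. The lowest level consists of leaf blocks, which contain the index entries that point to table rows. Leaf blocks are doubly linked, which makes it easy to scan the index in ascending or descending key order.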
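
The structure described above can be sketched as a toy two-level tree: one root that routes to leaf blocks, and leaves chained in both directions. This is an illustrative model only (no branch level, non-empty input assumed, all class and function names hypothetical), not Oracle's implementation.

```python
from bisect import bisect_left, bisect_right
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Leaf:
    entries: List[Tuple[int, str]]                 # sorted (key, rowid) pairs
    prev: Optional["Leaf"] = field(default=None, repr=False)
    next: Optional["Leaf"] = field(default=None, repr=False)


@dataclass
class Root:
    separators: List[int]                          # first key of each leaf after the first
    leaves: List[Leaf]


def build(pairs, per_leaf=4):
    """Pack sorted (key, rowid) pairs into doubly linked leaves under one root."""
    pairs = sorted(pairs)
    leaves = [Leaf(pairs[i:i + per_leaf]) for i in range(0, len(pairs), per_leaf)]
    for left, right in zip(leaves, leaves[1:]):
        left.next, right.prev = right, left
    return Root([leaf.entries[0][0] for leaf in leaves[1:]], leaves)


def range_scan(root, low, high):
    """Yield rowids for low <= key <= high in ascending key order."""
    leaf = root.leaves[bisect_left(root.separators, low)]
    while leaf is not None:
        for key, rowid in leaf.entries:
            if key > high:
                return
            if key >= low:
                yield rowid
        leaf = leaf.next                           # follow the forward link


def range_scan_desc(root, low, high):
    """Yield rowids for low <= key <= high in descending key order."""
    leaf = root.leaves[bisect_right(root.separators, high)]
    while leaf is not None:
        for key, rowid in reversed(leaf.entries):
            if key < low:
                return
            if key <= high:
                yield rowid
        leaf = leaf.prev                           # follow the backward link


root = build([(k, f"row{k}") for k in range(1, 21)])
print(list(range_scan(root, 7, 10)))
print(list(range_scan_desc(root, 7, 10)))
```

Both scans descend from the root once and then walk sideways along the leaf chain, which is why the double linking matters for ordered scans in either direction.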


Format of an index leaf entry

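  • Entry header: stores the number of columns and locking information
  • Key column length/value pairs: each pair gives the size of one key column followed by that column's value (the number of such pairs is at most the number of columns in the index)
  • ROWID: the row ID of the row that contains the key values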
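
The three parts listed above can be pictured with a small encoder and decoder. The byte layout here (one byte per count, lock, and length; a 6-byte ROWID) is an assumption chosen for illustration and is not Oracle's actual on-disk format; LeafEntry and ROWID_LEN are hypothetical names.

```python
from dataclasses import dataclass
from typing import List

ROWID_LEN = 6  # assumed size of a restricted ROWID in this sketch


@dataclass
class LeafEntry:
    lock: int                  # locking information kept in the entry header
    key_columns: List[bytes]   # one value per indexed column
    rowid: bytes               # points at the table row

    def encode(self) -> bytes:
        out = bytearray([len(self.key_columns), self.lock])   # entry header
        for value in self.key_columns:                        # length/value pairs
            out.append(len(value))
            out += value
        out += self.rowid                                     # trailing ROWID
        return bytes(out)

    @classmethod
    def decode(cls, raw: bytes) -> "LeafEntry":
        ncols, lock = raw[0], raw[1]
        pos, columns = 2, []
        for _ in range(ncols):
            length = raw[pos]
            columns.append(raw[pos + 1:pos + 1 + length])
            pos += 1 + length
        return cls(lock, columns, raw[pos:pos + ROWID_LEN])


entry = LeafEntry(lock=0, key_columns=[b"qa1"], rowid=bytes.fromhex("0000010a0002"))
raw = entry.encode()
print(len(raw), LeafEntry.decode(raw) == entry)
```

Because each key column carries its own length, entries for variable-length keys such as emp3_name can be packed back to back without padding.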


Characteristics of index leaf entries

In a B-tree index on a non-partitioned table:

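  • When several rows share the same key value, the key value is repeated in the index unless the index is compressed
  • A row whose key columns are all NULL has no index entry. A WHERE clause that tests the indexed column(s) with IS NULL therefore cannot be answered from this index alone and typically results in a full table scan
  • Because all rows belong to the same segment, a restricted ROWID is used to point to the table rows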
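
The first two characteristics can be sketched with a toy index builder. The rows, rowids, and function names below are made up for illustration; None stands in for NULL.

```python
rows = {
    "AAA1": {"emp3_name": "qa1", "dep_id": 2},
    "AAA2": {"emp3_name": None, "dep_id": 2},
    "AAA3": {"emp3_name": "qa1", "dep_id": None},
    "AAA4": {"emp3_name": None, "dep_id": None},
}


def sort_key(entry):
    """Order entries by key, placing NULLs after every real value (sketch choice)."""
    key, rowid = entry
    return tuple((v is None, "" if v is None else v) for v in key), rowid


def build_index(table, columns):
    """Return sorted (key, rowid) entries, skipping rows whose key is entirely NULL."""
    entries = []
    for rowid, row in table.items():
        key = tuple(row[c] for c in columns)
        if all(v is None for v in key):
            continue                      # no index entry for an all-NULL key
        entries.append((key, rowid))
    return sorted(entries, key=sort_key)


def rows_where_null(table, column):
    """An IS NULL lookup has to read the table itself in this model."""
    return [rowid for rowid, row in table.items() if row[column] is None]


name_ix = build_index(rows, ["emp3_name"])
name_dep_ix = build_index(rows, ["emp3_name", "dep_id"])
print(name_ix)                            # the key ('qa1',) appears twice
print([rowid for _, rowid in name_dep_ix])
print(rows_where_null(rows, "emp3_name"))
```

In the single-column index, rows AAA2 and AAA4 are invisible, so only a table scan can find them. In the two-column index, AAA2 and AAA3 do get entries because at least one of their key columns is not NULL.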


Effects of DML operations on an index

When DML is executed against a table, the Oracle server maintains all of its indexes. The effects of DML commands on an index are:

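  • An insert adds an index entry to the appropriate block.
  • Deleting a row only deletes the index entry logically. The space occupied by the deleted entry is not freed immediately, and the block as a whole becomes available for reuse elsewhere in the index only after all of its entries have been deleted.
  • Updating a key column results in a logical delete followed by an insert in the index.
  • PCTFREE affects an index only when the index is created. Afterwards, new entries can be added to an index block even if its free space is below the value specified by PCTFREE.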
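
The DML effects above can be modelled with a single toy leaf block in which a delete only sets a flag. This is a simplified sketch with hypothetical names (LeafBlock, update_key), not Oracle's block management.

```python
from bisect import insort


class LeafBlock:
    """Toy leaf block: entries are sorted [key, rowid, deleted] lists."""

    def __init__(self):
        self.entries = []

    def insert(self, key, rowid):
        insort(self.entries, [key, rowid, False])

    def delete(self, key, rowid):
        for entry in self.entries:
            if entry[0] == key and entry[1] == rowid and not entry[2]:
                entry[2] = True          # logical delete: the slot stays in the block
                return
        raise KeyError((key, rowid))

    def update_key(self, old_key, new_key, rowid):
        # An update of a key column is a logical delete plus an insert.
        self.delete(old_key, rowid)
        self.insert(new_key, rowid)

    def live(self):
        return [(k, r) for k, r, deleted in self.entries if not deleted]


block = LeafBlock()
block.insert("qa1", "R1")
block.insert("qa2", "R2")
block.update_key("qa1", "qa9", "R1")     # leaves a dead 'qa1' entry behind
block.delete("qa2", "R2")

print(block.live())                      # only the 'qa9' entry is live
print(len(block.entries))                # three slots are still occupied
```

Two inserts, one key update, and one delete leave a single live entry but three occupied slots, which is why heavily updated indexes can hold more space than their live entries need.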

###4.3 Creation

#Create the index
create index emp3_name_ix on
emp3(emp3_name);

#View index information
select index_name, index_type, table_name, table_type, uniqueness, status
from user_indexes
where table_name = 'EMP3';

INDEX_NAME      INDEX_TYPE      TABLE_NAME      TABLE_TYPE  UNIQUENES STATUS
--------------- --------------- --------------- ----------- --------- --------
EMP3_ID_PK      NORMAL          EMP3            TABLE       UNIQUE    VALID
EMP3_NAME_IX    NORMAL          EMP3            TABLE       NONUNIQUE VALID

#View the columns covered by each index
SQL> select * from user_ind_columns where table_name = 'EMP3';

INDEX_NAME      TABLE_NAME      COLUMN_NAME     COLUMN_POSITION COLUMN_LENGTH CHAR_LENGTH DESC
--------------- --------------- --------------- --------------- ------------- ----------- ----
EMP3_ID_PK      EMP3            EMP3_ID                       1            22           0 ASC
EMP3_NAME_IX    EMP3            EMP3_NAME                     1            10          10 ASC


SQL> select * from emp3 where emp3_name = 'qa1';

   EMP3_ID EMP3_NAME      DEP_ID
---------- ---------- ----------
         2 qa1                 2


Execution Plan
----------------------------------------------------------
Plan hash value: 215206995

--------------------------------------------------------------------------------------------
| Id  | Operation                   | Name         | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |              |     1 |    33 |     1   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| EMP3         |     1 |    33 |     1   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | EMP3_NAME_IX |     1 |       |     1   (0)| 00:00:01 |
--------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("EMP3_NAME"='qa1')


Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
          3  consistent gets
          0  physical reads
          0  redo size
        675  bytes sent via SQL*Net to client
        524  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

###4.4 Deciding between a full table scan and an index

In most cases a full table scan causes more physical disk I/O than index access, but it can sometimes run faster because it can be highly parallelized.

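As a rough rule of thumb for index range scans (the optimizer makes the actual decision from its cost estimates):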

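  • For a table whose rows are stored in sorted order, a query that reads less than 40% of the rows should use an index range scan. Conversely, a query that reads more than 40% of the rows should use a full table scan.
  • For an unsorted table, a query that reads less than 7% of the rows should use an index range scan. Conversely, a query that reads more than 7% of the rows should use a full table scan.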
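
The two thresholds can be written down as a tiny helper. It encodes only the rule of thumb stated above, not the cost calculation Oracle's optimizer performs; the function name and return strings are hypothetical, and the row counts are taken from the emp3 example in this section.

```python
def rule_of_thumb(fraction_of_rows, table_is_sorted):
    """Suggest an access path from the 40% / 7% rule of thumb."""
    threshold = 0.40 if table_is_sorted else 0.07
    if fraction_of_rows < threshold:
        return "INDEX RANGE SCAN"
    return "TABLE ACCESS FULL"


total_rows = 100_000

# One row out of 100,000, as in the lookup of a single emp3_name.
print(rule_of_thumb(1 / total_rows, table_is_sorted=False))

# 85,726 rows out of 100,000, as in the dep_id = 2 count.
print(rule_of_thumb(85_726 / total_rows, table_is_sorted=False))

# The same 20% selectivity lands on different sides of the two thresholds.
print(rule_of_thumb(0.20, table_is_sorted=True))
print(rule_of_thumb(0.20, table_is_sorted=False))
```

The gap between the two thresholds reflects how scattered the matching rows are: in a sorted table neighbouring index entries tend to point into the same table blocks, so more rows can be fetched through the index before a full scan becomes cheaper.
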
		 
SQL> select index_name, index_type, table_name, uniqueness, status from user_indexes where table_name = 'EMP3';

INDEX_NAME      INDEX_TYPE TABLE_NAME      UNIQUENESS STATUS
--------------- ---------- --------------- ---------- --------
EMP3_ID_PK      NORMAL     EMP3            UNIQUE     VALID
EMP3_NAME_IX    NORMAL     EMP3            NONUNIQUE  VALID

SQL> select count(*) from emp3;

  COUNT(*)
----------
       19

#An index exists, but with only 19 rows in the table the optimizer chooses a full table scan
SQL> select * from emp3 where emp3_name = 'qa8';

   EMP3_ID EMP3_NAME      DEP_ID
---------- ---------- ----------
        16 qa8                 2


Execution Plan
----------------------------------------------------------
Plan hash value: 2425169977

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |     1 |    11 |     2   (0)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| EMP3 |     1 |    11 |     2   (0)| 00:00:01 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("EMP3_NAME"='qa8')


Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
          3  consistent gets
          0  physical reads
          0  redo size
        671  bytes sent via SQL*Net to client
        524  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

#Add rows to the emp3 table
[root@hzvscmdb sql]# more insert_data.sql
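#!/bin/bash
# Usage: ./insert_data.sql <first_id> <last_id> <emp3_name> <dep_id>
# Inserts one row per id in [first_id, last_id]; each row uses its own sqlplus session.
i=$1

while [ "$i" -le "$2" ]
do
sqlplus hr/CCM%lab123@tony1521 <<EOF
insert into emp3 values($i,'$3',$4);
commit;
quit;
EOF
i=$((i + 1))
done

echo "inserted rows into emp3 table"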

[root@hzvscmdb sql]# ./insert_data.sql 90 100 dev 1

SQL> select max(emp3_id) from emp3;

MAX(EMP3_ID)
------------
      100000

SQL> analyze table emp3 estimate statistics;

Table analyzed.

SQL> select blocks, empty_blocks, num_rows from user_tables where table_name = 'EMP3';

    BLOCKS EMPTY_BLOCKS   NUM_ROWS
---------- ------------ ----------
       374           10     101081

#Look up a single row; with the larger table the index is used
SQL> select * from emp3 where emp3_name = 'qa33333';

   EMP3_ID EMP3_NAME      DEP_ID
---------- ---------- ----------
     33333 qa33333             2


Execution Plan
----------------------------------------------------------
Plan hash value: 215206995

--------------------------------------------------------------------------------------------
| Id  | Operation                   | Name         | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |              |     1 |    14 |     2   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| EMP3         |     1 |    14 |     2   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | EMP3_NAME_IX |     1 |       |     1   (0)| 00:00:01 |
--------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("EMP3_NAME"='qa33333')


Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
          4  consistent gets
          0  physical reads
          0  redo size
        681  bytes sent via SQL*Net to client
        524  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed


#A query with no WHERE clause is answered by scanning the primary key index (INDEX FAST FULL SCAN)
SQL> select count(*) from emp3;

  COUNT(*)
----------
    100000


Execution Plan
----------------------------------------------------------
Plan hash value: 2418373429

----------------------------------------------------------------------------
| Id  | Operation             | Name       | Rows  | Cost (%CPU)| Time     |
----------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |            |     1 |    70   (2)| 00:00:01 |
|   1 |  SORT AGGREGATE       |            |     1 |            |          |
|   2 |   INDEX FAST FULL SCAN| EMP3_ID_PK |   101K|    70   (2)| 00:00:01 |
----------------------------------------------------------------------------


Statistics
----------------------------------------------------------
          2  recursive calls
          2  db block gets
        262  consistent gets
         37  physical reads
        176  redo size
        526  bytes sent via SQL*Net to client
        524  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

#dep_id has no index, so this query uses a full table scan
SQL> select count(*) from emp3 where dep_id = 2;

  COUNT(*)
----------
     85726


Execution Plan
----------------------------------------------------------
Plan hash value: 1396384608

---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |     1 |     2 |   104   (1)| 00:00:02 |
|   1 |  SORT AGGREGATE    |      |     1 |     2 |            |          |
|*  2 |   TABLE ACCESS FULL| EMP3 | 50541 |    98K|   104   (1)| 00:00:02 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("DEP_ID"=2)


Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
        373  consistent gets
          0  physical reads
          0  redo size
        528  bytes sent via SQL*Net to client
        524  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed


SQL> select count(*) from emp3 where emp3_name like 'qa%';

  COUNT(*)
----------
     85726


Execution Plan
----------------------------------------------------------
Plan hash value: 3884997069

----------------------------------------------------------------------------------
| Id  | Operation         | Name         | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |              |     1 |     8 |     2   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE   |              |     1 |     8 |            |          |
|*  2 |   INDEX RANGE SCAN| EMP3_NAME_IX |     7 |    56 |     2   (0)| 00:00:01 |
----------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("EMP3_NAME" LIKE 'qa%')
       filter("EMP3_NAME" LIKE 'qa%')


Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
        395  consistent gets
          0  physical reads
          0  redo size
        528  bytes sent via SQL*Net to client
        524  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed