文档课题:ORACLE数据库高水位线(high water mark).
数据库:oracle 11.2.0.4
1、实验测试
1.1、建测试表
SQL> create user leo identified by leo;

User created.

SQL> grant dba to leo;

Grant succeeded.

SQL> conn leo/leo;
Connected.

SQL> create table hsw as select * from dba_objects;

Table created.

SQL> insert into hsw select * from hsw;

86309 rows created.

SQL> insert into hsw select * from hsw;

172618 rows created.

SQL> commit;

Commit complete.

SQL> select count(*) from hsw;

  COUNT(*)
----------
    345236

SQL> conn / as sysdba        
Connected.

SQL> col owner for a15
SQL> select owner,segment_name,segment_type,bytes/1024/1024,blocks,tablespace_name from dba_segments where segment_name='HSW';

OWNER           SEGMENT_NAME         SEGMENT_TYPE       BYTES/1024/1024     BLOCKS TABLESPACE_NAME
--------------- -------------------- ------------------ --------------- ---------- ------------------------------
LEO             HSW                  TABLE                           40       5120 USERS

1.2、查看表信息
说明:要查看准确的高水位信息,须先收集统计信息.
SQL> execute dbms_stats.gather_table_stats('LEO','HSW',cascade => true,no_invalidate => false);

PL/SQL procedure successfully completed.

SQL> alter session set nls_date_format='yyyy-mm-dd hh24:mi:ss';

Session altered.

SQL> select t.table_name,t.num_rows,t.blocks,t.empty_blocks,t.last_analyzed from dba_tables t where table_name in ('HSW');

TABLE_NAME                       NUM_ROWS     BLOCKS EMPTY_BLOCKS LAST_ANALYZED
------------------------------ ---------- ---------- ------------ -------------------
HSW                                345236       5060           60 2023-12-10 12:35:02

SQL> select count(distinct dbms_rowid.rowid_block_number(rowid)) used_block from leo.hsw;

USED_BLOCK
----------
      4916
	  
说明:结果显示当前表行数为345236,有5060个数据块被使用(HWM下的数据块),有60个未使用的数据块(HWM上的数据块).
      实际数据占用的数据块数量为:4916
综上,高水位线线有5060-4916=144个数据块可以释放.

2、创建高水位表
2.1、delete数据
说明:删除20w行数据后确认高水位线变化情况.
SQL> conn leo/leo;
Connected.
SQL> select count(*) from hsw;

  COUNT(*)
----------
    345236

SQL> delete from hsw where rownum<200001;

200000 rows deleted.

SQL> commit;

Commit complete.

SQL> select count(*) from hsw;

  COUNT(*)
----------
    145236

2.2、收集表统计信息收集
SQL> conn / as sysdba
Connected.
SQL> execute dbms_stats.gather_table_stats('LEO','HSW',cascade => true,no_invalidate => false);

PL/SQL procedure successfully completed.

2.3、查看表信息
SQL> col OWNER for a15
SQL> col SEGMENT_NAME for a15
SQL> select owner,segment_name,segment_type,bytes/1024/1024,blocks,tablespace_name from dba_segments where segment_name='HSW';

OWNER           SEGMENT_NAME    SEGMENT_TYPE       BYTES/1024/1024     BLOCKS TABLESPACE_NAME
--------------- --------------- ------------------ --------------- ---------- ------------------------------
LEO             HSW             TABLE                           40       5120 USERS

SQL> alter session set nls_date_format='yyyy-mm-dd hh24:mi:ss';

Session altered.

SQL> select t.table_name,t.num_rows,t.blocks,t.empty_blocks,t.last_analyzed from dba_tables t where table_name in ('HSW');

TABLE_NAME                       NUM_ROWS     BLOCKS EMPTY_BLOCKS LAST_ANALYZED
------------------------------ ---------- ---------- ------------ -------------------
HSW                                145236       5060           60 2023-12-10 12:49:20

SQL> select count(distinct dbms_rowid.rowid_block_number(rowid)) used_block from leo.hsw;

USED_BLOCK
----------
      2078
	  
说明:结果显示当前表行数为145236,有5060个数据块被使用(HWM下的数据块),有60个未使用的数据块(HWM上的数据块).
      实际数据占用的数据块数量为:2078
综上,高水位线线有5060-2078=2982个数据块可以释放,该表段大小为40M,记录145236行数据.

delete并不能回收表数据,在增删改数据的过程中使得高水位线持续增长,而被删除的数据记录位置也无法100%复用,因此难免会存在碎片.如一张100万数据的表,将全表数据delete之后,高水位线位置依旧在第100万行处,尽管此时表数据为0行,但全表扫描时仍会扫描所有已使用过的数据块,使数据库效率低下.

3、高水位对象统计
A、比较表的行数和表的大小关系,如果行数为0,而表的当前占用大小减去初始化时的大小(INITIAL_EXTENT)却很大,那么该表存在高水位;
B、行数和块数的比率,即查看一个块可以存储多少行数据,如果一个块存储的行数少于5行甚至更少,则说明有高水位。注意,这两种方法都不是十分准确,需要再对查询结果进行筛选。需要注意的是,在查询表的高水位时,首先需要分析表,以得到最准确的统计信息。

ELECT D.OWNER,
      ROUND(D.NUM_ROWS / D.BLOCKS, 2),
      D.NUM_ROWS,
      D.BLOCKS,
      D.TABLE_NAME,
      ROUND((d.BLOCKS * 8 - D.INITIAL_EXTENT / 1024) / 1024) t_size
 FROM DBA_TABLES D
WHERE D.BLOCKS > 1
  and d.OWNER='LEO';

OWNER           ROUND(D.NUM_ROWS/D.BLOCKS,2)   NUM_ROWS     BLOCKS TABLE_NAME                         T_SIZE
--------------- ---------------------------- ---------- ---------- ------------------------------ ----------
LEO                                     28.7     145236       5060 HSW                                    39

SELECT OWNER,
        SEGMENT_NAME TABLE_NAME,
        SEGMENT_TYPE,
        GREATEST(ROUND(100 * (NVL(HWM - AVG_USED_BLOCKS, 0) /
                       GREATEST(NVL(HWM, 1), 1)),
                       2),
                 0) WASTE_PER
   FROM (SELECT A.OWNER OWNER,
                A.SEGMENT_NAME,
                A.SEGMENT_TYPE,
                B.LAST_ANALYZED,
                A.BYTES,
                B.NUM_ROWS,
                A.BLOCKS BLOCKS,
                B.EMPTY_BLOCKS EMPTY_BLOCKS,
                A.BLOCKS - B.EMPTY_BLOCKS - 1 HWM,
                DECODE(ROUND((B.AVG_ROW_LEN * NUM_ROWS *
                             (1 + (PCT_FREE / 100))) / C.BLOCKSIZE,
                             0),
                       0,
                       1,
                       ROUND((B.AVG_ROW_LEN * NUM_ROWS *
                             (1 + (PCT_FREE / 100))) / C.BLOCKSIZE,
                             0)) + 2 AVG_USED_BLOCKS,
                ROUND(100 *
                      (NVL(B.CHAIN_CNT, 0) / GREATEST(NVL(B.NUM_ROWS, 1), 1)),
                      2) CHAIN_PER,
                B.TABLESPACE_NAME O_TABLESPACE_NAME
           FROM SYS.DBA_SEGMENTS A, SYS.DBA_TABLES B, SYS.TS$ C
          WHERE A.OWNER = B.OWNER
            AND SEGMENT_NAME = TABLE_NAME
            AND SEGMENT_TYPE = 'TABLE'
            AND B.TABLESPACE_NAME = C.NAME)
  WHERE GREATEST(ROUND(100 * (NVL(HWM - AVG_USED_BLOCKS, 0) /
                       GREATEST(NVL(HWM, 1), 1)),
                       2),
                 0) > 50
    AND OWNER = 'LEO'
    AND BLOCKS > 100
  ORDER BY WASTE_PER DESC

OWNER           TABLE_NAME      SEGMENT_TYPE        WASTE_PER
--------------- --------------- ------------------ ----------
LEO             HSW             TABLE                   62.63

4、回收高水位线
oracle提供如下回收高水位线的方法.
a、表重建, 如CATS(create table as select ...)
b、导出导入(exp/imp,expdp/impdp)
c、truncate (注意:此方法慎用)
d、shrink space,语句alter table table_name shrink space;该方法执行前需开启行移动,alter table table_name enable row movement;
e、move table,语句alter table table_name move;该方法可以释放高水位,但需要重建索引.
f、DBMS_REDEFINITION表在线重定义
5、降低高水位后相关影响
--执行计划查看
SQL> set timing on 
SQL> select count(*) from leo.hsw;

  COUNT(*)
----------
    145236
	
SQL> select count(*) from leo.hsw;

Elapsed: 00:00:00.11

Execution Plan
----------------------------------------------------------
Plan hash value: 131053992

-------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Cost (%CPU)| Time     |
-------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |     1 |  1373   (1)| 00:00:17 |
|   1 |  SORT AGGREGATE    |      |     1 |            |          |
|   2 |   TABLE ACCESS FULL| HSW  |   145K|  1373   (1)| 00:00:17 |
-------------------------------------------------------------------


Statistics
----------------------------------------------------------
         51  recursive calls
          0  db block gets
       5002  consistent gets
       4982  physical reads
          0  redo size
        528  bytes sent via SQL*Net to client
        520  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          5  sorts (memory)
          0  sorts (disk)
          1  rows processed
		  
--表移动
SQL> alter table leo.hsw move;

Table altered.

Elapsed: 00:00:23.16
SQL> alter system flush buffer_cache;

System altered.

Elapsed: 00:00:00.16
SQL> alter system flush shared_pool;

System altered.

Elapsed: 00:00:00.02
SQL> select count(*) from leo.hsw;

Elapsed: 00:00:00.13

Execution Plan
----------------------------------------------------------
Plan hash value: 131053992

-------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Cost (%CPU)| Time     |
-------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |     1 |  1373   (1)| 00:00:17 |
|   1 |  SORT AGGREGATE    |      |     1 |            |          |
|   2 |   TABLE ACCESS FULL| HSW  |   145K|  1373   (1)| 00:00:17 |
-------------------------------------------------------------------


Statistics
----------------------------------------------------------
         48  recursive calls
          1  db block gets
       2118  consistent gets
       2092  physical reads
         96  redo size
        528  bytes sent via SQL*Net to client
        520  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          5  sorts (memory)
          0  sorts (disk)
          1  rows processed
		  
说明:可以看到回收完高水位后逻辑读和物理读已经降下来.

SELECT OWNER,
      SEGMENT_NAME TABLE_NAME,
      SEGMENT_TYPE,
      GREATEST(ROUND(100 * (NVL(HWM - AVG_USED_BLOCKS, 0) /
                     GREATEST(NVL(HWM, 1), 1)),
                     2),
               0) WASTE_PER
 FROM (SELECT A.OWNER OWNER,
              A.SEGMENT_NAME,
              A.SEGMENT_TYPE,
              B.LAST_ANALYZED,
              A.BYTES,
              B.NUM_ROWS,
              A.BLOCKS BLOCKS,
              B.EMPTY_BLOCKS EMPTY_BLOCKS,
              A.BLOCKS - B.EMPTY_BLOCKS - 1 HWM,
              DECODE(ROUND((B.AVG_ROW_LEN * NUM_ROWS *
                           (1 + (PCT_FREE / 100))) / C.BLOCKSIZE,
                           0),
                     0,
                     1,
                     ROUND((B.AVG_ROW_LEN * NUM_ROWS *
                           (1 + (PCT_FREE / 100))) / C.BLOCKSIZE,
                           0)) + 2 AVG_USED_BLOCKS,
              ROUND(100 *
                    (NVL(B.CHAIN_CNT, 0) / GREATEST(NVL(B.NUM_ROWS, 1), 1)),
                    2) CHAIN_PER,
              B.TABLESPACE_NAME O_TABLESPACE_NAME
         FROM SYS.DBA_SEGMENTS A, SYS.DBA_TABLES B, SYS.TS$ C
        WHERE A.OWNER = B.OWNER
          AND SEGMENT_NAME = TABLE_NAME
          AND SEGMENT_TYPE = 'TABLE'
          AND B.TABLESPACE_NAME = C.NAME)
WHERE GREATEST(ROUND(100 * (NVL(HWM - AVG_USED_BLOCKS, 0) /
                     GREATEST(NVL(HWM, 1), 1)),
                     2),
               0) > 50
  AND OWNER = 'LEO'
  AND BLOCKS > 100
ORDER BY WASTE_PER DESC;

no rows selected

Elapsed: 00:00:00.10

说明:可以看到回收完高水位后该查询无显示.

参考网址:
https://cloud.tencent.com/developer/article/2074960