1.什么是表膨胀
表膨胀是指表的数据和索引所占文件系统的空间,在有效数据量并未发生大的变化的情况下,不断增大。久而久之,关系文件被大量空洞填满,浪费了大量的磁盘空间。甚至某些特殊场景下,一个表中只有一条简单的数据,但是表对应的物理文件可能已经达到M级甚至G级。那么危害是什么?
1).空间持续上涨,到达某一个点后,需要执行一个高额代价的vacuum full(或者cluster等可以重组表的命令),但vacuum full又是AccessExclusiveLock,8级锁,会阻塞一切访问,意味着在完成清理重组之前,都无法访问该表。
2).扫描的效率变低,即使所有记录都是dead状态,PostgreSQL的顺序扫描也会扫描对象所有的老版本,直到执行vacuum将dead的记录删除
2.为什么会表膨胀
这是因为PostgreSQL的独特MVCC机制。业界主流的MVCC一般有三种实现方式:
1).以Oracle和Innodb为代表的,写新数据时,把旧数据转移到一个单独的地方,如回滚段中,其他人读数据时,从回滚段中把旧的数据读出来,所以可以有效避免膨胀。
2).以SQL Server为代表的,把旧版本的数据写入专门的临时表空间,新数据写入日志,然后去更改数据。这种方式,旧版本的数据放入了专门的临时表空间,所以也可以有效地避免膨胀。
3).写新数据时,旧数据不删除,而是把新数据插入,将旧数据标记为无效,PostgreSQL就是使用的这种实现方法,新老数据存放在一起,在被清理之前,会一直占据着空间,所以会导致膨胀。
3.PostgreSQL的MVCC实现方式优缺点如下
优点
1)无论事务进行了多少操作,事务回滚可以立即完成,Oracle中使用了回滚段,如Oracle数据库宕机时如果有很多事务正在运行,这时数据库再启动后,需要把之前的事务做回滚,当没有回滚完成时,数据行上仍然有锁的,这时业务仍然不能正常操作,如果恰好碰到要回滚一些很大的事务,情况会更坏。
2)数据可以进行很多更新,不必像Oracle和MySQL的Innodb引擎那样需要经常保证回滚段不会被用完,也不会像oracle数据库那样经常遇到“ORA-1555”错误的困扰
缺点
1)旧版本的数据需要清理。好在在v8.3中,PostgreSQL引入了自动化的autovacuum,采用多进程架构,支持多表同时操作
2)旧版本的数据可能会导致查询需要扫描的数据块增多,从而导致查询变慢
3)空间持续上涨,存储没有被有效利用
4.MVCC实现机制
前面也介绍了数据膨胀的原因,需要先了解一下数据基于MVCC的存储机制:
1)插入很简单,就是将元组插入到页面的空闲空间中;
2)删除则是将元组标记为旧版本,但是即使这个旧版本对所有事务都不可见了,这个元组占用的空间也不会归还给文件系统
3)UPDATE相当于DELETE + INSERT,等于是占用了两条元组的位置,类似DELETE,旧版本的元组依然占用着物理空间。
4)很明显,在一通增删改操作之后,页面上的旧版本元组势必是占有一定比重的。这就导致了物理文件大小明显高于实际的数据量。
为此,PostgreSQL引入了vacuum的机制,去清理那些不再需要的死元组。前文也介绍过,vacuum分为普通的vacuum和vacuum full。
普通的vacuum会做大概如下几件事情:
1)清除UPDATE或DELETE操作后留下的"死元组"
2)跟踪表块中可用空间,更新free space map
3)更新visibility map,index only scan以及后续vacuum都会利用到
4)"冻结"表中的行,防止事务ID回卷
5)配合ANALYZE,定期更新统计信息
可以看到,普通的vacuum只是清理死元组,"腾出"空间,在今后有新数据插入时,将优先使用该空间,
直到所有被重用的空间用完时,再考虑新增页面。但是每个页面的空闲空间又不是固定大小的,
所以如果要利用这些空间空间,就需要遍历一遍数据页面来找到它们,这样会造成比较大的开销。
因此就设计了用来记录每个页面剩余空间的空闲空间映射表FSM(Free Space Mapping),
以便高效的将空闲空间管理起来,方便查找和重新使用。FSM在第一次vacuum之后会出现,
可以使用pg_freespacemap扩展查看。FSM的结构类似于一个Btree,
感兴趣的可以参阅:PgSQL · 原理介绍 · PostgreSQL中的空闲空间管理
极端情况下,就会发生每个页面的"填充率"特别低,如下每个页面总计三个元组,有效利用率特别低。
因此,需要引入vacuum full,vacuum full会对表进行重组,也就意味着表的oid会变,
所以不能我们在日常操作中,要定位表的oid的时候,不能通过pg_class的oid来找,
得通过pg_class的relfilenode来找,这样才精确。而且,vacuum full最大会占据原来磁盘空间的两倍,所以请预留好磁盘空间。
说到空间,这里说个小技巧,可以提前touch或者dd一个大文件,比如5GB,放在那里,
关键时刻删除它以释放空间,说不定能救你一命。
5.表膨胀常见原因
1.IO能力差
2.未开启autovacuum
3.autovacuum触发较迟
4.存在长事务
5.存在失效的事务槽
6.设置事务隔离级别:repeatable read 或 serializable 的隔离级别.
6.获取表膨胀的SQL
获取膨胀率
--获取膨胀率
----SQL I
SELECT
CURRENT_CATALOG AS datname,
nspname,
relname,
bs * tblpages AS size,
CASE WHEN tblpages - est_tblpages_ff > 0 THEN
(tblpages - est_tblpages_ff) / tblpages::float
ELSE
0
END AS ratio
FROM (
SELECT
ceil(reltuples / ((bs - page_hdr) * fillfactor / (tpl_size * 100))) + ceil(toasttuples / 4) AS est_tblpages_ff,
tblpages,
fillfactor,
bs,
tblid,
nspname,
relname,
is_na
FROM (
SELECT
(4 + tpl_hdr_size + tpl_data_size + (2 * ma) - CASE WHEN tpl_hdr_size % ma = 0 THEN
ma
ELSE
tpl_hdr_size % ma
END - CASE WHEN ceil(tpl_data_size)::int % ma = 0 THEN
ma
ELSE
ceil(tpl_data_size)::int % ma
END) AS tpl_size,
(heappages + toastpages) AS tblpages,
heappages,
toastpages,
reltuples,
toasttuples,
bs,
page_hdr,
tblid,
nspname,
relname,
fillfactor,
is_na
FROM (
SELECT
tbl.oid AS tblid,
ns.nspname,
tbl.relname,
tbl.reltuples,
tbl.relpages AS heappages,
coalesce(toast.relpages, 0) AS toastpages,
coalesce(toast.reltuples, 0) AS toasttuples,
coalesce(substring(array_to_string(tbl.reloptions, ' ')
FROM 'fillfactor=([0-9]+)')::smallint, 100) AS fillfactor,
current_setting('block_size')::numeric AS bs,
CASE WHEN version() ~ 'mingw32'
OR version() ~ '64-bit|x86_64|ppc64|ia64|amd64' THEN
8
ELSE
4
END AS ma,
24 AS page_hdr,
23 + CASE WHEN MAX(coalesce(s.null_frac, 0)) > 0 THEN
(7 + count(s.attname)) / 8
ELSE
0::int
END + CASE WHEN bool_or(att.attname = 'oid'
AND att.attnum < 0) THEN
4
ELSE
0
END AS tpl_hdr_size,
sum((1 - coalesce(s.null_frac, 0)) * coalesce(s.avg_width, 0)) AS tpl_data_size,
bool_or(att.atttypid = 'pg_catalog.name'::regtype)
OR sum(
CASE WHEN att.attnum > 0 THEN
1
ELSE
0
END) <> count(s.attname) AS is_na
FROM
pg_attribute AS att
JOIN pg_class AS tbl ON att.attrelid = tbl.oid
JOIN pg_namespace AS ns ON ns.oid = tbl.relnamespace
LEFT JOIN pg_stats AS s ON s.schemaname = ns.nspname
AND s.tablename = tbl.relname
AND s.inherited = FALSE
AND s.attname = att.attname
LEFT JOIN pg_class AS toast ON tbl.reltoastrelid = toast.oid
WHERE
NOT att.attisdropped
AND tbl.relkind = 'r'
AND nspname NOT IN ('pg_catalog', 'information_schema')
GROUP BY
1,
2,
3,
4,
5,
6,
7,
8,
9,
10) AS s) AS s2) AS s3
WHERE
NOT is_na;
-------------------------------------SQL II
-- new table bloat query
-- still needs work; is often off by +/- 20%
WITH constants AS (
-- define some constants for sizes of things
-- for reference down the query and easy maintenance
SELECT
current_setting('block_size')::numeric AS bs,
23 AS hdr,
8 AS ma
),
no_stats AS (
-- screen out table who have attributes
-- which dont have stats, such as JSON
SELECT
table_schema,
table_name,
n_live_tup::numeric AS est_rows,
pg_table_size(relid)::numeric AS table_size
FROM
information_schema.columns
JOIN pg_stat_user_tables AS psut ON table_schema = psut.schemaname
AND table_name = psut.relname
LEFT OUTER JOIN pg_stats ON table_schema = pg_stats.schemaname
AND table_name = pg_stats.tablename
AND column_name = attname
WHERE
attname IS NULL
AND table_schema NOT IN ('pg_catalog', 'information_schema')
GROUP BY
table_schema,
table_name,
relid,
n_live_tup
),
null_headers AS (
-- calculate null header sizes
-- omitting tables which dont have complete stats
-- and attributes which aren't visible
SELECT
hdr + 1 + (sum(
CASE WHEN null_frac <> 0 THEN
1
ELSE
0
END) / 8) AS nullhdr,
SUM((1 - null_frac) * avg_width) AS datawidth,
MAX(null_frac) AS maxfracsum,
schemaname,
tablename,
hdr,
ma,
bs
FROM
pg_stats
CROSS JOIN constants
LEFT OUTER JOIN no_stats ON schemaname = no_stats.table_schema
AND tablename = no_stats.table_name
WHERE
schemaname NOT IN ('pg_catalog', 'information_schema')
AND no_stats.table_name IS NULL
AND EXISTS (
SELECT
1
FROM
information_schema.columns
WHERE
schemaname = columns.table_schema
AND tablename = columns.table_name)
GROUP BY
schemaname,
tablename,
hdr,
ma,
bs
),
data_headers AS (
-- estimate header and row size
SELECT
ma,
bs,
hdr,
schemaname,
tablename,
(datawidth + (hdr + ma - (
CASE WHEN hdr % ma = 0 THEN
ma
ELSE
hdr % ma
END)))::numeric AS datahdr,
(maxfracsum * (nullhdr + ma - (
CASE WHEN nullhdr % ma = 0 THEN
ma
ELSE
nullhdr % ma
END))) AS nullhdr2
FROM
null_headers
),
table_estimates AS (
-- make estimates of how large the table should be
-- based on row and page size
SELECT
schemaname,
tablename,
bs,
reltuples::numeric AS est_rows,
relpages * bs AS table_bytes,
CEIL((reltuples * (datahdr + nullhdr2 + 4 + ma - (
CASE WHEN datahdr % ma = 0 THEN
ma
ELSE
datahdr % ma
END)) / (bs - 20))) * bs AS expected_bytes,
reltoastrelid
FROM
data_headers
JOIN pg_class ON tablename = relname
JOIN pg_namespace ON relnamespace = pg_namespace.oid
AND schemaname = nspname
WHERE
pg_class.relkind = 'r'
),
estimates_with_toast AS (
-- add in estimated TOAST table sizes
-- estimate based on 4 toast tuples per page because we dont have
-- anything better. also append the no_data tables
SELECT
schemaname,
tablename,
TRUE AS can_estimate,
est_rows,
table_bytes + (coalesce(toast.relpages, 0) * bs) AS table_bytes,
expected_bytes + (ceil(coalesce(toast.reltuples, 0) / 4) * bs) AS expected_bytes
FROM
table_estimates
LEFT OUTER JOIN pg_class AS toast ON table_estimates.reltoastrelid = toast.oid
AND toast.relkind = 't'
),
table_estimates_plus AS (
-- add some extra metadata to the table data
-- and calculations to be reused
-- including whether we cant estimate it
-- or whether we think it might be compressed
SELECT
current_database() AS databasename,
schemaname,
tablename,
can_estimate,
est_rows,
CASE WHEN table_bytes > 0 THEN
table_bytes::numeric
ELSE
NULL::numeric
END AS table_bytes,
CASE WHEN expected_bytes > 0 THEN
expected_bytes::numeric
ELSE
NULL::numeric
END AS expected_bytes,
CASE WHEN expected_bytes > 0
AND table_bytes > 0
AND expected_bytes <= table_bytes THEN
(table_bytes - expected_bytes)::numeric
ELSE
0::numeric
END AS bloat_bytes
FROM
estimates_with_toast
UNION ALL
SELECT
current_database() AS databasename,
table_schema,
table_name,
FALSE,
est_rows,
table_size,
NULL::numeric,
NULL::numeric
FROM
no_stats
),
bloat_data AS (
-- do final math calculations and formatting
SELECT
current_database() AS databasename,
schemaname,
tablename,
can_estimate,
table_bytes,
round(table_bytes / (1024 ^ 2)::numeric, 3) AS table_mb,
expected_bytes,
round(expected_bytes / (1024 ^ 2)::numeric, 3) AS expected_mb,
round(bloat_bytes * 100 / table_bytes) AS pct_bloat,
round(bloat_bytes / (1024::numeric ^ 2), 2) AS mb_bloat,
table_bytes,
expected_bytes,
est_rows
FROM
table_estimates_plus)
-- filter output for bloated tables
SELECT
databasename,
schemaname,
tablename,
can_estimate,
est_rows,
pct_bloat,
mb_bloat,
table_mb
FROM
bloat_data
-- this where clause defines which tables actually appear
-- in the bloat chart
-- example below filters for tables which are either 50%
-- bloated and more than 20mb in size, or more than 25%
-- bloated and more than 4GB in size
WHERE (pct_bloat >= 50
AND mb_bloat >= 10)
OR (pct_bloat >= 25
AND mb_bloat >= 1000)
ORDER BY
pct_bloat DESC;
(2)获取膨胀情况
--获取膨胀情况
--SQL I
SELECT
current_database(),
schemaname,
tablename,
/*reltuples::bigint, relpages::bigint, otta,*/
ROUND((
CASE WHEN otta = 0 THEN
0.0
ELSE
sml.relpages::float / otta
END)::numeric, 1) AS tbloat,
CASE WHEN relpages < otta THEN
0
ELSE
bs * (sml.relpages - otta)::bigint
END AS wastedbytes,
iname,
/*ituples::bigint, ipages::bigint, iotta,*/
ROUND((
CASE WHEN iotta = 0
OR ipages = 0 THEN
0.0
ELSE
ipages::float / iotta
END)::numeric, 1) AS ibloat,
CASE WHEN ipages < iotta THEN
0
ELSE
bs * (ipages - iotta)
END AS wastedibytes
FROM (
SELECT
schemaname,
tablename,
cc.reltuples,
cc.relpages,
bs,
CEIL((cc.reltuples * ((datahdr + ma - (
CASE WHEN datahdr % ma = 0 THEN
ma
ELSE
datahdr % ma
END)) + nullhdr2 + 4)) / (bs - 20::float)) AS otta,
COALESCE(c2.relname, '?') AS iname,
COALESCE(c2.reltuples, 0) AS ituples,
COALESCE(c2.relpages, 0) AS ipages,
COALESCE(CEIL((c2.reltuples * (datahdr - 12)) / (bs - 20::float)), 0) AS iotta -- very rough approximation, assumes all cols
FROM (
SELECT
ma,
bs,
schemaname,
tablename,
(datawidth + (hdr + ma - (
CASE WHEN hdr % ma = 0 THEN
ma
ELSE
hdr % ma
END)))::numeric AS datahdr,
(maxfracsum * (nullhdr + ma - (
CASE WHEN nullhdr % ma = 0 THEN
ma
ELSE
nullhdr % ma
END))) AS nullhdr2
FROM (
SELECT
schemaname,
tablename,
hdr,
ma,
bs,
SUM((1 - null_frac) * avg_width) AS datawidth,
MAX(null_frac) AS maxfracsum,
hdr + (
SELECT
1 + count(*) / 8
FROM
pg_stats s2
WHERE
null_frac <> 0
AND s2.schemaname = s.schemaname
AND s2.tablename = s.tablename) AS nullhdr
FROM
pg_stats s,
(
SELECT
(
SELECT
current_setting('block_size')::numeric) AS bs,
CASE WHEN substring(v, 12, 3) IN ('8.0', '8.1', '8.2') THEN
27
ELSE
23
END AS hdr,
CASE WHEN v ~ 'mingw32' THEN
8
ELSE
4
END AS ma
FROM (
SELECT
version() AS v) AS foo) AS constants
GROUP BY
1,
2,
3,
4,
5) AS foo) AS rs
JOIN pg_class cc ON cc.relname = rs.tablename
JOIN pg_namespace nn ON cc.relnamespace = nn.oid
AND nn.nspname = rs.schemaname
AND nn.nspname <> 'information_schema'
LEFT JOIN pg_index i ON indrelid = cc.oid
LEFT JOIN pg_class c2 ON c2.oid = i.indexrelid) AS sml
ORDER BY
wastedbytes DESC;
(3)获取索引膨胀:管理员用户
--获取索引膨胀
--SQL I
-- btree index stats query
-- estimates bloat for btree indexes
WITH btree_index_atts AS (
SELECT
nspname,
indexclass.relname AS index_name,
indexclass.reltuples,
indexclass.relpages,
indrelid,
indexrelid,
indexclass.relam,
tableclass.relname AS tablename,
regexp_split_to_table(indkey::text, ' ')::smallint AS attnum,
indexrelid AS index_oid
FROM
pg_index
JOIN pg_class AS indexclass ON pg_index.indexrelid = indexclass.oid
JOIN pg_class AS tableclass ON pg_index.indrelid = tableclass.oid
JOIN pg_namespace ON pg_namespace.oid = indexclass.relnamespace
JOIN pg_am ON indexclass.relam = pg_am.oid
WHERE
pg_am.amname = 'btree'
AND indexclass.relpages > 0
AND nspname NOT IN ('pg_catalog', 'information_schema')
),
index_item_sizes AS (
SELECT
ind_atts.nspname,
ind_atts.index_name,
ind_atts.reltuples,
ind_atts.relpages,
ind_atts.relam,
indrelid AS table_oid,
index_oid,
current_setting('block_size')::numeric AS bs,
8 AS maxalign,
24 AS pagehdr,
CASE WHEN max(coalesce(pg_stats.null_frac, 0)) = 0 THEN
2
ELSE
6
END AS index_tuple_hdr,
sum((1 - coalesce(pg_stats.null_frac, 0)) * coalesce(pg_stats.avg_width, 1024)) AS nulldatawidth
FROM
pg_attribute
JOIN btree_index_atts AS ind_atts ON pg_attribute.attrelid = ind_atts.indexrelid
AND pg_attribute.attnum = ind_atts.attnum
JOIN pg_stats ON pg_stats.schemaname = ind_atts.nspname
-- stats for regular index columns
AND ((pg_stats.tablename = ind_atts.tablename
AND pg_stats.attname = pg_catalog.pg_get_indexdef(pg_attribute.attrelid, pg_attribute.attnum, TRUE))
-- stats for functional indexes
OR (pg_stats.tablename = ind_atts.index_name
AND pg_stats.attname = pg_attribute.attname))
WHERE
pg_attribute.attnum > 0
GROUP BY
1,
2,
3,
4,
5,
6,
7,
8,
9
),
index_aligned_est AS (
SELECT
maxalign,
bs,
nspname,
index_name,
reltuples,
relpages,
relam,
table_oid,
index_oid,
coalesce(ceil(reltuples * (6 + maxalign - CASE WHEN index_tuple_hdr % maxalign = 0 THEN
maxalign
ELSE
index_tuple_hdr % maxalign
END + nulldatawidth + maxalign - CASE /* Add padding to the data to align on MAXALIGN */
WHEN nulldatawidth::integer % maxalign = 0 THEN
maxalign
ELSE
nulldatawidth::integer % maxalign
END)::numeric / (bs - pagehdr::numeric) + 1), 0) AS expected
FROM
index_item_sizes
),
raw_bloat AS (
SELECT
current_database() AS dbname,
nspname,
pg_class.relname AS table_name,
index_name,
bs * (index_aligned_est.relpages)::bigint AS totalbytes,
expected,
CASE WHEN index_aligned_est.relpages <= expected THEN
0
ELSE
bs * (index_aligned_est.relpages - expected)::bigint
END AS wastedbytes,
CASE WHEN index_aligned_est.relpages <= expected THEN
0
ELSE
bs * (index_aligned_est.relpages - expected)::bigint * 100 / (bs * (index_aligned_est.relpages)::bigint)
END AS realbloat,
pg_relation_size(index_aligned_est.table_oid) AS table_bytes,
stat.idx_scan AS index_scans
FROM
index_aligned_est
JOIN pg_class ON pg_class.oid = index_aligned_est.table_oid
JOIN pg_stat_user_indexes AS stat ON index_aligned_est.index_oid = stat.indexrelid
),
format_bloat AS (
SELECT
dbname AS database_name,
nspname AS schema_name,
table_name,
index_name,
round(realbloat) AS bloat_pct,
round(wastedbytes / (1024 ^ 2)::numeric) AS bloat_mb,
round(totalbytes / (1024 ^ 2)::numeric, 3) AS index_mb,
round(table_bytes / (1024 ^ 2)::numeric, 3) AS table_mb,
index_scans
FROM
raw_bloat)
-- final query outputting the bloated indexes
-- change the where and order by to change
-- what shows up as bloated
SELECT
*
FROM
format_bloat
WHERE (bloat_pct > 50
AND bloat_mb > 10)
ORDER BY
bloat_mb DESC;
(4)普通用户获取索引膨胀率
--SQL II
--普通用户索引膨胀率查询
-- WARNING: executed with a non-superuser role, the query inspect only index on tables you are granted to read.
-- WARNING: rows with is_na = 't' are known to have bad statistics ("name" type is not supported).
-- This query is compatible with PostgreSQL 8.2 and after
SELECT
current_database(),
nspname AS schemaname,
tblname,
idxname,
bs * (relpages)::bigint AS real_size,
bs * (relpages - est_pages)::bigint AS extra_size,
100 * (relpages - est_pages)::float / relpages AS extra_pct,
fillfactor,
CASE WHEN relpages > est_pages_ff THEN
bs * (relpages - est_pages_ff)
ELSE
0
END AS bloat_size,
100 * (relpages - est_pages_ff)::float / relpages AS bloat_pct,
is_na
-- , 100-(pst).avg_leaf_density AS pst_avg_bloat, est_pages, index_tuple_hdr_bm, maxalign, pagehdr, nulldatawidth, nulldatahdrwidth, reltuples, relpages -- (DEBUG INFO)
FROM (
SELECT
coalesce(1 + ceil(reltuples / floor((bs - pageopqdata - pagehdr) / (4 + nulldatahdrwidth)::float)), 0 -- ItemIdData size + computed avg size of a tuple (nulldatahdrwidth)
) AS est_pages,
coalesce(1 + ceil(reltuples / floor((bs - pageopqdata - pagehdr) * fillfactor / (100 * (4 + nulldatahdrwidth)::float))), 0) AS est_pages_ff,
bs,
nspname,
tblname,
idxname,
relpages,
fillfactor,
is_na
-- , pgstatindex(idxoid) AS pst, index_tuple_hdr_bm, maxalign, pagehdr, nulldatawidth, nulldatahdrwidth, reltuples -- (DEBUG INFO)
FROM (
SELECT
maxalign,
bs,
nspname,
tblname,
idxname,
reltuples,
relpages,
idxoid,
fillfactor,
(index_tuple_hdr_bm + maxalign - CASE -- Add padding to the index tuple header to align on MAXALIGN
WHEN index_tuple_hdr_bm % maxalign = 0 THEN
maxalign
ELSE
index_tuple_hdr_bm % maxalign
END + nulldatawidth + maxalign - CASE -- Add padding to the data to align on MAXALIGN
WHEN nulldatawidth = 0 THEN
0
WHEN nulldatawidth::integer % maxalign = 0 THEN
maxalign
ELSE
nulldatawidth::integer % maxalign
END)::numeric AS nulldatahdrwidth,
pagehdr,
pageopqdata,
is_na
-- , index_tuple_hdr_bm, nulldatawidth -- (DEBUG INFO)
FROM (
SELECT
n.nspname,
i.tblname,
i.idxname,
i.reltuples,
i.relpages,
i.idxoid,
i.fillfactor,
current_setting('block_size')::numeric AS bs,
CASE -- MAXALIGN: 4 on 32bits, 8 on 64bits (and mingw32 ?)
WHEN version() ~ 'mingw32'
OR version() ~ '64-bit|x86_64|ppc64|ia64|amd64' THEN
8
ELSE
4
END AS maxalign,
/* per page header, fixed size: 20 for 7.X, 24 for others */
24 AS pagehdr,
/* per page btree opaque data */
16 AS pageopqdata,
/* per tuple header: add IndexAttributeBitMapData if some cols are null-able */
CASE WHEN max(coalesce(s.null_frac, 0)) = 0 THEN
2 -- IndexTupleData size
ELSE
2 + ((32 + 8 - 1) / 8) -- IndexTupleData size + IndexAttributeBitMapData size ( max num filed per index + 8 - 1 /8)
END AS index_tuple_hdr_bm,
/* data len: we remove null values save space using it fractionnal part from stats */
sum((1 - coalesce(s.null_frac, 0)) * coalesce(s.avg_width, 1024)) AS nulldatawidth,
max(
CASE WHEN i.atttypid = 'pg_catalog.name'::regtype THEN
1
ELSE
0
END) > 0 AS is_na
FROM (
SELECT
ct.relname AS tblname,
ct.relnamespace,
ic.idxname,
ic.attpos,
ic.indkey,
ic.indkey[ic.attpos],
ic.reltuples,
ic.relpages,
ic.tbloid,
ic.idxoid,
ic.fillfactor,
coalesce(a1.attnum, a2.attnum) AS attnum,
coalesce(a1.attname, a2.attname) AS attname,
coalesce(a1.atttypid, a2.atttypid) AS atttypid,
CASE WHEN a1.attnum IS NULL THEN
ic.idxname
ELSE
ct.relname
END AS attrelname
FROM (
SELECT
idxname,
reltuples,
relpages,
tbloid,
idxoid,
fillfactor,
indkey,
pg_catalog.generate_series(1, indnatts) AS attpos
FROM (
SELECT
ci.relname AS idxname,
ci.reltuples,
ci.relpages,
i.indrelid AS tbloid,
i.indexrelid AS idxoid,
coalesce(substring(array_to_string(ci.reloptions, ' ')
FROM 'fillfactor=([0-9]+)')::smallint, 90) AS fillfactor,
i.indnatts,
pg_catalog.string_to_array(pg_catalog.textin(pg_catalog.int2vectorout(i.indkey)), ' ')::int[] AS indkey
FROM
pg_catalog.pg_index i
JOIN pg_catalog.pg_class ci ON ci.oid = i.indexrelid
WHERE
ci.relam = (
SELECT
oid
FROM
pg_am
WHERE
amname = 'btree')
AND ci.relpages > 0) AS idx_data) AS ic
JOIN pg_catalog.pg_class ct ON ct.oid = ic.tbloid
LEFT JOIN pg_catalog.pg_attribute a1 ON ic.indkey[ic.attpos] <> 0
AND a1.attrelid = ic.tbloid
AND a1.attnum = ic.indkey[ic.attpos]
LEFT JOIN pg_catalog.pg_attribute a2 ON ic.indkey[ic.attpos] = 0
AND a2.attrelid = ic.idxoid
AND a2.attnum = ic.attpos) i
JOIN pg_catalog.pg_namespace n ON n.oid = i.relnamespace
JOIN pg_catalog.pg_stats s ON s.schemaname = n.nspname
AND s.tablename = i.attrelname
AND s.attname = i.attname
GROUP BY
1,
2,
3,
4,
5,
6,
7,
8,
9,
10,
11) AS rows_data_stats) AS rows_hdr_pdg_stats) AS relation_stats
ORDER BY
nspname,
tblname,
idxname;
7.如何消除表膨胀
vacuum full database_db1;
消除表的情况比较复杂。最主要是找到导致表膨胀的原因,这样才能更快的找到消除表膨胀的方法。