Postgresql垃圾回收原理分析

2022-05-12 09:51:32 浏览数 (1)

1 引言:案例分析

1.1 慢查询案例

代码语言:javascript复制
tdb0529=> create table tbl01(id int, info text);
CREATE TABLE
tdb0529=> create index idx_tbl01_id on tbl01(id);
CREATE INDEX
tdb0529=> alter table tbl01 set (autovacuum_enabled =off);
ALTER TABLE
tdb0529=> insert into tbl01 select generate_series(1,1000000), md5(clock_timestamp()::text);
INSERT 0 1000000
tdb0529=> explain (analyze,verbose,timing,costs,buffers) select * from tbl01 where id=1;
                                                         QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------
 Index Scan using idx_tbl01_id on public.tbl01  (cost=0.42..8.44 rows=1 width=37) (actual time=0.024..0.024 rows=1 loops=1)
   Output: id, info
   Index Cond: (tbl01.id = 1)
   Buffers: shared hit=4
 Planning time: 0.363 ms
 Execution time: 0.054 ms
(6 rows)

扫描了4个数据块。另起事务插入大量数据块(不提交),查询发现扫描了大量数据块

代码语言:javascript复制
tdb0529=> begin;
BEGIN
tdb0529=> insert into tbl01 select 1, md5(clock_timestamp()::text) from generate_series(1,1000000);
INSERT 0 1000000
tdb0529=> abort;
ROLLBACK

tdb0529=> explain (analyze,verbose,timing,costs,buffers) select * from tbl01 where id=1;
                                                         QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on public.tbl01  (cost=4.38..8.40 rows=1 width=37) (actual time=48.241..133.774 rows=1 loops=1)
   Output: id, info
   Recheck Cond: (tbl01.id = 1)
   Heap Blocks: exact=8335
   Buffers: shared hit=6652 read=5161 dirtied=3636 written=557
   ->  Bitmap Index Scan on idx_tbl01_id  (cost=0.00..4.38 rows=1 width=0) (actual time=47.006..47.006 rows=1000001 loops=1)
         Index Cond: (tbl01.id = 1)
         Buffers: shared hit=1952 read=1526
 Planning time: 0.101 ms
 Execution time: 133.913 ms

事务提交、提交这些扫描都无法避免,原理请参考《3 并发控制》

回收必须使用vacuum,回收后,SQL执行时间恢复正常

代码语言:javascript复制
tdb0529=> vacuum verbose tbl01;
INFO:  vacuuming "public.tbl01"
INFO:  scanned index "idx_tbl01_id" to remove 1000000 row versions
DETAIL:  CPU: user: 0.24 s, system: 0.00 s, elapsed: 0.25 s
INFO:  "tbl01": removed 1000000 row versions in 8334 pages
DETAIL:  CPU: user: 0.02 s, system: 0.00 s, elapsed: 0.02 s
INFO:  index "idx_tbl01_id" now contains 1000000 row versions in 6237 pages
DETAIL:  1000000 index row versions were removed.
3474 index pages have been deleted, 0 are currently reusable.
CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
INFO:  "tbl01": found 1000000 removable, 1000000 nonremovable row versions in 16667 out of 16667 pages
DETAIL:  0 dead row versions cannot be removed yet, oldest xmin: 574
There were 0 unused item pointers.
Skipped 0 pages due to buffer pins, 0 frozen pages.
0 pages are entirely empty.
CPU: user: 0.45 s, system: 0.03 s, elapsed: 0.48 s.
INFO:  "tbl01": truncated 16667 to 8334 pages
DETAIL:  CPU: user: 0.00 s, system: 0.01 s, elapsed: 0.03 s
INFO:  vacuuming "pg_toast.pg_toast_16386"
INFO:  index "pg_toast_16386_index" now contains 0 row versions in 1 pages
DETAIL:  0 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
INFO:  "pg_toast_16386": found 0 removable, 0 nonremovable row versions in 0 out of 0 pages
DETAIL:  0 dead row versions cannot be removed yet, oldest xmin: 575
There were 0 unused item pointers.
Skipped 0 pages due to buffer pins, 0 frozen pages.
0 pages are entirely empty.
CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
VACUUM

tdb0529=> explain (analyze,verbose,timing,costs,buffers) select * from tbl01 where id=1;
                                                         QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------
 Index Scan using idx_tbl01_id on public.tbl01  (cost=0.42..8.44 rows=1 width=37) (actual time=0.012..0.012 rows=1 loops=1)
   Output: id, info
   Index Cond: (tbl01.id = 1)
   Buffers: shared hit=4
 Planning time: 0.135 ms
 Execution time: 0.031 ms

1.2 页面分析案例

代码语言:javascript复制
create table tbl02(id int, info text);
CREATE TABLE
alter table tbl02 set (autovacuum_enabled='off');
ALTER TABLE
-- superuser
create extension pageinspect;
CREATE EXTENSION

insert into tbl02 select id, repeat(md5(random()::text), 16) from generate_series(1,20) t(id);
INSERT 0 20

select pg_column_size(repeat(md5(random()::text), 16));
 pg_column_size
----------------
            516
  • lp指针分配到80
  • 数据的最低位点到576
  • 当前页面的free space=576-80=496
代码语言:javascript复制
SELECT * FROM page_header(get_raw_page('tbl02', 0));
    lsn     | checksum | flags | lower | upper | special | pagesize | version | prune_xid
------------ ---------- ------- ------- ------- --------- ---------- --------- -----------
 0/AC7E6400 |        0 |     0 |    80 |   576 |    8192 |     8192 |       4 |         0

查看具体某一条记录的偏移量

代码语言:javascript复制
select lp,lp_off from heap_page_items(get_raw_page('tbl02', 0));
 lp | lp_off
---- --------
  1 |   7648
  2 |   7104
  3 |   6560
  4 |   6016
  5 |   5472
  6 |   4928
  7 |   4384
  8 |   3840
  9 |   3296
 10 |   2752
 11 |   2208
 12 |   1664
 13 |   1120
 14 |    576
(14 rows)

查看靠近页面末尾的记录信息

代码语言:javascript复制
select * from heap_page_items(get_raw_page('tbl02', 0)) where lp=1;
-[ RECORD 1 ]--------------
lp          | 1
lp_off      | 7648
lp_flags    | 1
lp_len      | 544
t_xmin      | 579
t_xmax      | 0
t_field3    | 0
t_ctid      | (0,1)
t_infomask2 | 2
t_infomask  | 2050
t_hoff      | 24
t_bits      |
t_oid       |
t_data      | x01000000100800003435353366633031356233363938373961356537313761316461343337333634343535336663303135623336393837396135653731376131646134333733363434353533666330313562333639383739613565373137613164613433373336343435353366633031356233363938373961356537313761316461343337333634343535336663303135623336393837396135653731376131646134333733363434353533666330313562333639383739613565373137613164613433373336343435353366633031356233363938373961356537313761316461343337333634343535336663303135623336393837396135653731376131646134333733363434353533666330313562333639383739613565373137613164613433373336343435353366633031356233363938373961356537313761316461343337333634343535336663303135623336393837396135653731376131646134333733363434353533666330313562333639383739613565373137613164613433373336343435353366633031356233363938373961356537313761316461343337333634343535336663303135623336393837396135653731376131646134333733363434353533666330313562333639383739613565373137613164613433373336343435353366633031356233363938373961356537313761316461343337333634

间隔删除数据,使用ctid(页面号,lp号)作为条件,发现数据并没有真正的从页面中删除

代码语言:javascript复制
delete from tbl02 where ctid not in ('(0,1)','(0,3)','(0,5)','(0,7)','(0,9)','(0,11)','(0,13)');
DELETE 13

tdb0529=# select lp,lp_off,lp_flags,lp_len,t_xmin,t_xmax,t_field3,t_ctid,t_infomask2,t_infomask,t_hoff,t_bits,t_oid from heap_page_items(get_raw_page('tbl02', 0));
 lp | lp_off | lp_flags | lp_len | t_xmin | t_xmax | t_field3 | t_ctid | t_infomask2 | t_infomask | t_hoff | t_bits | t_oid
---- -------- ---------- -------- -------- -------- ---------- -------- ------------- ------------ -------- -------- -------
  1 |   7648 |        1 |    544 |    579 |      0 |        0 | (0,1)  |           2 |       2306 |     24 |        |
  2 |   7104 |        1 |    544 |    579 |    580 |        0 | (0,2)  |        8194 |        258 |     24 |        |
  3 |   6560 |        1 |    544 |    579 |      0 |        0 | (0,3)  |           2 |       2306 |     24 |        |
  4 |   6016 |        1 |    544 |    579 |    580 |        0 | (0,4)  |        8194 |        258 |     24 |        |
  5 |   5472 |        1 |    544 |    579 |      0 |        0 | (0,5)  |           2 |       2306 |     24 |        |
  6 |   4928 |        1 |    544 |    579 |    580 |        0 | (0,6)  |        8194 |        258 |     24 |        |
  7 |   4384 |        1 |    544 |    579 |      0 |        0 | (0,7)  |           2 |       2306 |     24 |        |
  8 |   3840 |        1 |    544 |    579 |    580 |        0 | (0,8)  |        8194 |        258 |     24 |        |
  9 |   3296 |        1 |    544 |    579 |      0 |        0 | (0,9)  |           2 |       2306 |     24 |        |
 10 |   2752 |        1 |    544 |    579 |    580 |        0 | (0,10) |        8194 |        258 |     24 |        |
 11 |   2208 |        1 |    544 |    579 |      0 |        0 | (0,11) |           2 |       2306 |     24 |        |
 12 |   1664 |        1 |    544 |    579 |    580 |        0 | (0,12) |        8194 |        258 |     24 |        |
 13 |   1120 |        1 |    544 |    579 |      0 |        0 | (0,13) |           2 |       2306 |     24 |        |
 14 |    576 |        1 |    544 |    579 |    580 |        0 | (0,14) |        8194 |        258 |     24 |        |

进行垃圾回收,发现lp并没有改变但是记录的空间已经被回收了

代码语言:javascript复制
tdb0529=# vacuum verbose tbl02;
INFO:  vacuuming "public.tbl02"
INFO:  "tbl02": removed 13 row versions in 2 pages
INFO:  "tbl02": found 13 removable, 7 nonremovable row versions in 2 out of 2 pages
DETAIL:  0 dead row versions cannot be removed yet, oldest xmin: 582
There were 0 unused item pointers.
Skipped 0 pages due to buffer pins, 0 frozen pages.
0 pages are entirely empty.
CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
INFO:  "tbl02": truncated 2 to 1 pages
DETAIL:  CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s
INFO:  vacuuming "pg_toast.pg_toast_16413"
INFO:  index "pg_toast_16413_index" now contains 0 row versions in 1 pages
DETAIL:  0 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
INFO:  "pg_toast_16413": found 0 removable, 0 nonremovable row versions in 0 out of 0 pages
DETAIL:  0 dead row versions cannot be removed yet, oldest xmin: 583
There were 0 unused item pointers.
Skipped 0 pages due to buffer pins, 0 frozen pages.
0 pages are entirely empty.
CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
VACUUM

tdb0529=# select lp,lp_off,lp_flags,lp_len,t_xmin,t_xmax,t_field3,t_ctid,t_infomask2,t_infomask,t_hoff,t_bits,t_oid from heap_page_items(get_raw_page('tbl02', 0));
 lp | lp_off | lp_flags | lp_len | t_xmin | t_xmax | t_field3 | t_ctid | t_infomask2 | t_infomask | t_hoff | t_bits | t_oid
---- -------- ---------- -------- -------- -------- ---------- -------- ------------- ------------ -------- -------- -------
  1 |   7648 |        1 |    544 |    579 |      0 |        0 | (0,1)  |           2 |       2306 |     24 |        |
  2 |      0 |        0 |      0 |        |        |          |        |             |            |        |        |
  3 |   7104 |        1 |    544 |    579 |      0 |        0 | (0,3)  |           2 |       2306 |     24 |        |
  4 |      0 |        0 |      0 |        |        |          |        |             |            |        |        |
  5 |   6560 |        1 |    544 |    579 |      0 |        0 | (0,5)  |           2 |       2306 |     24 |        |
  6 |      0 |        0 |      0 |        |        |          |        |             |            |        |        |
  7 |   6016 |        1 |    544 |    579 |      0 |        0 | (0,7)  |           2 |       2306 |     24 |        |
  8 |      0 |        0 |      0 |        |        |          |        |             |            |        |        |
  9 |   5472 |        1 |    544 |    579 |      0 |        0 | (0,9)  |           2 |       2306 |     24 |        |
 10 |      0 |        0 |      0 |        |        |          |        |             |            |        |        |
 11 |   4928 |        1 |    544 |    579 |      0 |        0 | (0,11) |           2 |       2306 |     24 |        |
 12 |      0 |        0 |      0 |        |        |          |        |             |            |        |        |
 13 |   4384 |        1 |    544 |    579 |      0 |        0 | (0,13) |           2 |       2306 |     24 |        |
 14 |      0 |        0 |      0 |        |        |          |        |             |            |        |        |

vacuum full一下

代码语言:javascript复制
vacuum full tbl02;
VACUUM

select lp,lp_off,lp_flags,lp_len,t_xmin,t_xmax,t_field3,t_ctid,t_infomask2,t_infomask,t_hoff,t_bits,t_oid from heap_page_items(get_raw_page('tbl02', 0));
 lp | lp_off | lp_flags | lp_len | t_xmin | t_xmax | t_field3 | t_ctid | t_infomask2 | t_infomask | t_hoff | t_bits | t_oid
---- -------- ---------- -------- -------- -------- ---------- -------- ------------- ------------ -------- -------- -------
  1 |   7648 |        1 |    544 |    579 |      0 |        0 | (0,1)  |           2 |       2818 |     24 |        |
  2 |   7104 |        1 |    544 |    579 |      0 |        0 | (0,2)  |           2 |       2818 |     24 |        |
  3 |   6560 |        1 |    544 |    579 |      0 |        0 | (0,3)  |           2 |       2818 |     24 |        |
  4 |   6016 |        1 |    544 |    579 |      0 |        0 | (0,4)  |           2 |       2818 |     24 |        |
  5 |   5472 |        1 |    544 |    579 |      0 |        0 | (0,5)  |           2 |       2818 |     24 |        |
  6 |   4928 |        1 |    544 |    579 |      0 |        0 | (0,6)  |           2 |       2818 |     24 |        |
  7 |   4384 |        1 |    544 |    579 |      0 |        0 | (0,7)  |           2 |       2818 |     24 |        |
  • 这里可以得出结论,垃圾回收并不会改变lp,这样索引就不需要改变了
  • 垃圾回收的记录会有页内offset
  • vacuum full会改变lp,重新组织页面结构

2 使用vacuum

vacuum提供了两种使用接口

代码语言:javascript复制
vacuum
vacuum full

两种方式的加锁级别不同

代码语言:javascript复制
vacuum:
SHARE UPDATE EXCLUSIVE

vacuum full:
ACCESS EXCLUSIVE

为便于记忆总结一下vacuum full加排他锁,最高级别的锁和所有其他锁冲突。 而vacuum冲突SQL语句为:

代码语言:javascript复制
-- http://www.postgres.cn/docs/10/explicit-locking.html
VACUUM
ANALYZE
CREATE INDEX CONCURRENTLY
CREATE STATISTICS
ALTER TABLE
CREATE INDEX
REFRESH MATERIALIZED VIEW CONCURRENTLY
ALTER TABLE
DROP TABLE
TRUNCATE
REINDEX
CLUSTER
VACUUM FULL
REFRESH MATERIALIZED VIEW

vacuum的使用除了上述手动执行外,还提供了守护进程autovacuum自动化执行的方式。

2 清理tuple

后面的内容需要了解底层存储结构,请先阅读《1 存储结构》

清理步骤

代码语言:javascript复制
(1)  FOR each table
(2)       Acquire ShareUpdateExclusiveLock lock for the target table

          第一步:
(3)       扫描页面,冻结记录
(4)       删除垃圾记录的索引记录

          第二步:
(5)       FOR each page of the table
(6)            移除垃圾记录,重新组织页面结构
(7)            更新FSM和VM
           END FOR

          第三步:
(8)       Truncate the last page if possible
(9)       Update both the statistics and system catalogs of the target table
           Release ShareUpdateExclusiveLock lock
       END FOR

        /* Post-processing */
(10)  Update statistics and system catalogs
(11)  Remove both unnecessary files and pages of the clog if possible

2.1 第一步

首先,PostgreSQL扫描目标表建立垃圾元组列表,可能的话也会冻结过旧元组。 (冻结请直接到后面第6部分) 列表存储在内存中,由maintenance_work_mem控制(调大可以提高vacuum性能)。

maintenance_work_mem (integer)s Specifies the maximum amount of memory to be used by maintenance operations, such as VACUUM, CREATE INDEX, and ALTER TABLE ADD FOREIGN KEY. It defaults to 64 megabytes (64MB). Since only one of these operations can be executed at a time by a database session, and an installation normally doesn’t have many of them running concurrently, it’s safe to set this value significantly larger than work_mem. Larger settings might improve performance for vacuuming and for restoring database dumps. https://www.postgresql.org/docs/10/runtime-config-resource.html#RUNTIME-CONFIG-RESOURCE-MEMORY

扫描完成后,PostgreSQL通过垃圾元组列表来删除对应的索引元组。 当maintenance_work_mem已满时,PostgreSQL继续进行下一个任务,即(4)到(7); 然后返回到(3)继续扫描。

2.2 第二步

从上面的“1.2 页面分析案例”我们可以看到,记录被删除后,空间并不会立即释放,实际上删除的动作只是标记记录为垃圾记录,真正回收页面空间的还是由vacuum来做的。

上图中发生了四件事情:

  1. 删除语句删掉了记录tuple1和记录tuple3,页面中会标记为垃圾元组,注意这里的行指针并不会发生变化,如果行指针变了,对应的索引块也必须跟着更新。
  2. vacuum整理数据空间,避免出现碎片。
  3. vacuum清理垃圾元组对应的索引块中的记录。
  4. vacuum更新FSM和VM。

2.3 第三步

  1. 更新统计信息和相关的系统表、视图。
  2. 如果最后一页没有记录,直接删除最后一个页面。

2.4 最后

可能会触发清理clog,参考第五部分“5 清理CLOG”。

3 Visibility Map

前面我们了解到,MVCC机制下元组的更新和删除并不会立即从文件中做物理删除,而是通过事务ID标记,vacuum经过判断后删除。

可见性映射表的设计目的是为了加快vacuum的速度,原理比较简单:每个表文件都会有一个可见性映射表,里面保存了表文件中每个页面的可见性(有垃圾元组的页面用1来表示,没有的页面用0来表示)

代码语言:javascript复制
cd $PGDATA/base/13213
ls | grep 3601
3601
3601_fsm
3601_vm

4 Freeze

为什么要FREEZE?"并发控制"中我们做过初步的分析,这里我们给出更详细的解读。

首先我们来看一下PG中事务ID的比较逻辑:

代码语言:javascript复制
/*
 * TransactionIdPrecedes --- is id1 logically < id2?
 */
bool
TransactionIdPrecedes(TransactionId id1, TransactionId id2)
{
	/*
	 * If either ID is a permanent XID then we can just do unsigned
	 * comparison.  If both are normal, do a modulo-2^32 comparison.
	 */
	int32		diff;

	if (!TransactionIdIsNormal(id1) || !TransactionIdIsNormal(id2))
		return (id1 < id2);

	diff = (int32) (id1 - id2);
	return (diff < 0);
}

TransactionIdIsNormal宏的作用是判断id1是否大于等于3。这里值得注意的是diff = (int32) (id1 - id2)。这里使用了一个编程技巧,比如发生了事务ID回卷:

代码语言:javascript复制
id1 = 4294967200
id2 = 100

// int32    -2147483648 ~ 2147483647 (32亿 )
// uint32   0 ~ 4294967295 (42亿 )

id1 - id2 = 4294967100这个值强转成(int32)类型后是一个负数,会发生return true,函数会认为id1 < id2,也就是说id2是更新的事务。

我们继续考虑下一个场景,id2继续增长到21亿多时,id1和id2的差值已经在int32的范围内了,所以diff会大于零,return false函数会认为id1= 42亿 > id2= 21亿 ,结论是id1是更新的事务!这里就出现问题了,一个老事务被判断成了新事务。

所以PostgreSQL必须保证一个数据库中两个有效的事务之间的年龄差最多是

2^{31}

,约为20亿。

4.1 Lazy Freeze

关注参数:

代码语言:javascript复制
vacuum_freeze_min_age = 50000000(5千万)

lazy freeze是autovacuum和vacuum都会做的动作,所以vm在这里也是有效的,只有vm中标记的页面会做lazy freeze。

触发条件:

xmin小与freeze_txid的元组都会被freeze。

代码语言:javascript复制
freeze_txid = current_oldest_xmin - vacuum_freeze_min_age
            = 当前活跃的最小的xid - 参数vacuum_freeze_min_age

例如当前活跃的最小xid=50005100,vacuum_freeze_min_age为默认值50000000,那么freeze_txid=5100,也就是说xmin小于5100的元组都会被freeze(前提是所在的页面会被扫描到)

这里补充一个长事务的例子:

数据库运行一个长事务,很久没有提交导致current_oldest_xmin一直不会超过vacuum_freeze_min_age,vacuum不会冻结任何元组。这样最低的xmin就和当前最新的xmin的距离越来越远,差值慢慢接近20亿,这时候数据库为保证数据不丢失,会有告警甚至宕机。

告警

代码语言:javascript复制
 WARNING:  database "mydb" must be vacuumed within 177009986 transactions
HINT:  To avoid a database shutdown, execute a database-wide VACUUM in "mydb".

宕机

代码语言:javascript复制
ERROR:  database is not accepting commands to avoid wraparound data loss in database "mydb"
HINT:  Stop the postmaster and vacuum that database in single-user mode.

4.2 Eager Freeze

关注参数:

代码语言:javascript复制
vacuum_freeze_table_age = 150000000(1亿5千万)

触发条件:

代码语言:javascript复制
pg_database.datfrozenxid < (OldestXmin−vacuum_freeze_table_age)

pg_database.datfrozenxid : 当前数据库被冻结的最大的事务ID,pg_database中可以查到

例如我们查询到当前的pg_database.datfrozenxid和vacuum_freeze_table_age的值为

代码语言:javascript复制
select oid,datname,datfrozenxid from pg_database where oid=13214;
  oid  | datname  | datfrozenxid
------- ---------- --------------
 13214 | postgres |          548
 
show vacuum_freeze_table_age;
 vacuum_freeze_table_age
-------------------------
 150000000

这种情况下当OldestXmin > 150000000 = 548也就是当前最小的活跃事务大于1亿4千万 的时候才会触发eager freeze。

所以触发eager freeze的条件可以这样理解,当前最小的活跃事务ID 与 数据库最大的FREEZE事务ID的差值超过了1亿5千万(vacuum_freeze_table_age),就会触发eager freeze。

5 vacuum full

通过上面介绍可以发现vacuum存在的问题,vacuum只会做页面内的数据整理,表的物理空间实际上是没有被系统回收的。

vacuum full弥补了vacuum的一些不足:除了对页面进行整理外,还会对页面进行更复杂的跨页面整理。

5.1 vacuum full测试

一个简单的实验,观察一下vacuum前后和vacuum full前后表文件的大小:

代码语言:javascript复制
create table t1(id int, info text, crt_time timestamp);
insert into t1 select generate_series(1,2000000),md5(random()::text),clock_timestamp();
create table t2(id int, info text, crt_time timestamp);
insert into t2 select generate_series(1,2000000),md5(random()::text),clock_timestamp();

select pg_size_pretty(pg_relation_size('t1'));
 pg_size_pretty
----------------
 146 MB
select pg_size_pretty(pg_relation_size('t2'));
 pg_size_pretty
----------------
 146 MB

我们删除两张表中的部分数据

代码语言:javascript复制
delete from t1 where id%6!=0;
DELETE 1666667

delete from t2 where id%6!=0;
DELETE 1666667

然后分别进行vacuum 和 vauum full

代码语言:javascript复制
vacuum t1;

vacuum full t2;

表的大小变为:

代码语言:javascript复制
select pg_size_pretty(pg_relation_size('t1'));
 pg_size_pretty
----------------
 146 MB

select pg_size_pretty(pg_relation_size('t2'));
 pg_size_pretty
----------------
 24 MB

可以看到vacuum full真正的回收了表空间,这是因为页面中就算只有一条记录,vacuum都不会回收这个页面,只会做页面内的数据整理。而vacuum会跨页整理,把数据聚集起来,浪费的页面空间全部会归还给系统。

5.2 vacuum full使用注意

*注意1* vacuum full会请求表上的ACCESS EXCLUSIVE锁(排他锁),这把锁会和一切操作冲突,所以注意在有操作的表上不要执行。

*注意2* vacuum full执行是会使用额外的磁盘空间(最大约为当前表的大小),所以要注意执行前预留足够的磁盘空间。

5.3 vacuum full执行流程

  1. 获取排他锁
  2. 创建新表文件
  3. copy记录到新表,需要的话顺便freeze
  4. 删除旧表
  5. 重建索引
  6. 更新FSM和VM
  7. 更新统计信息
  8. 释放排他锁
  9. 释放无效clog

0 人点赞