Postgresql源码(28)Btree索引分裂前后结构差异对比

2022-05-12 08:56:24 浏览数 (1)

阅读顺序

《Postgresql源码(30)Postgresql索引基础B-linked-tree》

《Postgresql源码(31)Btree索引相关系统表和整体结构》

《Postgresql源码(32)Btree索引分裂前后结构差异对比》

《Postgresql源码(33)Btree索引读——整体流程&_bt_first》

《Postgresql源码(34)Btree索引读——_bt_first搜索部分分析》

《Postgresql源码(36)Btree索引读——_bt_next搜索部分分析》

总结

分析流程在后面,这里总结便于查询

场景一:root分裂为branch的前后对比(level1–>2)

《level 1 到 2 的分裂过程》

场景二:root分裂为leaf的前后对比(level0–>1)

《level 0 到 1 的分裂过程》

  1. 分裂前后对比,root能保存407个元素,小于这个数字可以不需要leaf,root直接做leaf(level=0)
  2. 超出后root提升到level=1,两个leaf可以保存最大容量的90%(366个)。第一个leaf存到90%,第二个leaf保存剩下的10%和新值。

这里也解决了一个疑问,为什么新索引root节点经常在第三个页面:因为root永远在最后构造,第一次分裂leaf占据1、2页面,root就在第三位了。

场景三:leaf分裂前后对比(level1–>1)

leaf页面可以保存407个元素,分裂后保存90%的元素数量366,剩下的10%搬移到新leaf中。

代码语言:javascript复制
create table t0(id int primary key, info text);
insert into t0 select generate_series(1,10000), md5(random()::text);

postgres=# d t4
                 Table "public.t4"
 Column |  Type   | Collation | Nullable | Default 
-------- --------- ----------- ---------- ---------
 id     | integer |           | not null | 
 info   | text    |           |          | 
Indexes:
    "t4_pkey" PRIMARY KEY, btree (id)

10000条integer索引结构

代码语言:javascript复制
postgres=# insert into t4 select generate_series(1,10000), md5(random()::text);

(1)meta

level = 1 只有一层leaf

代码语言:javascript复制
postgres=# select * from bt_metap('t4_pkey');
 magic  | version | root | level | fastroot | fastlevel 
-------- --------- ------ ------- ---------- -----------
 340322 |       2 |    3 |     1 |        3 |         1

(2)root

btpo_flags = 2 说明是root btpo = 1 说明在第一层,0层是leaf

代码语言:javascript复制
postgres=# select * from bt_page_stats('t4_pkey',3);
 blkno | type | live_items | dead_items | avg_item_size | page_size | free_size | btpo_prev | btpo_next | btpo | btpo_flags 
------- ------ ------------ ------------ --------------- ----------- ----------- ----------- ----------- ------ ------------
     3 | r    |         28 |          0 |            15 |      8192 |      7596 |         0 |         0 |    1 |          2

root指向28个leaf页面

代码语言:javascript复制
postgres=# select * from bt_page_items('t4_pkey', 3);
 itemoffset |  ctid  | itemlen | nulls | vars |          data           
------------ -------- --------- ------- ------ -------------------------
          1 | (1,1)  |       8 | f     | f    | 
          2 | (2,1)  |      16 | f     | f    | 6f 01 00 00 00 00 00 00
          3 | (4,1)  |      16 | f     | f    | dd 02 00 00 00 00 00 00
          4 | (5,1)  |      16 | f     | f    | 4b 04 00 00 00 00 00 00
          5 | (6,1)  |      16 | f     | f    | b9 05 00 00 00 00 00 00
          6 | (7,1)  |      16 | f     | f    | 27 07 00 00 00 00 00 00
          7 | (8,1)  |      16 | f     | f    | 95 08 00 00 00 00 00 00
          8 | (9,1)  |      16 | f     | f    | 03 0a 00 00 00 00 00 00
          9 | (10,1) |      16 | f     | f    | 71 0b 00 00 00 00 00 00
         10 | (11,1) |      16 | f     | f    | df 0c 00 00 00 00 00 00
         11 | (12,1) |      16 | f     | f    | 4d 0e 00 00 00 00 00 00
         12 | (13,1) |      16 | f     | f    | bb 0f 00 00 00 00 00 00
         13 | (14,1) |      16 | f     | f    | 29 11 00 00 00 00 00 00
         14 | (15,1) |      16 | f     | f    | 97 12 00 00 00 00 00 00
         15 | (16,1) |      16 | f     | f    | 05 14 00 00 00 00 00 00
         16 | (17,1) |      16 | f     | f    | 73 15 00 00 00 00 00 00
         17 | (18,1) |      16 | f     | f    | e1 16 00 00 00 00 00 00
         18 | (19,1) |      16 | f     | f    | 4f 18 00 00 00 00 00 00
         19 | (20,1) |      16 | f     | f    | bd 19 00 00 00 00 00 00
         20 | (21,1) |      16 | f     | f    | 2b 1b 00 00 00 00 00 00
         21 | (22,1) |      16 | f     | f    | 99 1c 00 00 00 00 00 00
         22 | (23,1) |      16 | f     | f    | 07 1e 00 00 00 00 00 00
         23 | (24,1) |      16 | f     | f    | 75 1f 00 00 00 00 00 00
         24 | (25,1) |      16 | f     | f    | e3 20 00 00 00 00 00 00
         25 | (26,1) |      16 | f     | f    | 51 22 00 00 00 00 00 00
         26 | (27,1) |      16 | f     | f    | bf 23 00 00 00 00 00 00
         27 | (28,1) |      16 | f     | f    | 2d 25 00 00 00 00 00 00
         28 | (29,1) |      16 | f     | f    | 9b 26 00 00 00 00 00 00

(3)branch

(4)leaf

一个leaf页面存367-1=366条数据

代码语言:javascript复制
postgres=# select * from bt_page_items('t4_pkey', 1);
 itemoffset |  ctid   | itemlen | nulls | vars |          data           
------------ --------- --------- ------- ------ -------------------------
          1 | (3,7)   |      16 | f     | f    | 6f 01 00 00 00 00 00 00
          2 | (0,1)   |      16 | f     | f    | 01 00 00 00 00 00 00 00
          3 | (0,2)   |      16 | f     | f    | 02 00 00 00 00 00 00 00
          4 | (0,3)   |      16 | f     | f    | 03 00 00 00 00 00 00 00
          5 | (0,4)   |      16 | f     | f    | 04 00 00 00 00 00 00 00
          6 | (0,5)   |      16 | f     | f    | 05 00 00 00 00 00 00 00
          7 | (0,6)   |      16 | f     | f    | 06 00 00 00 00 00 00 00
          8 | (0,7)   |      16 | f     | f    | 07 00 00 00 00 00 00 00
          9 | (0,8)   |      16 | f     | f    | 08 00 00 00 00 00 00 00
...
        360 | (2,119) |      16 | f     | f    | 67 01 00 00 00 00 00 00
        361 | (2,120) |      16 | f     | f    | 68 01 00 00 00 00 00 00
        362 | (3,1)   |      16 | f     | f    | 69 01 00 00 00 00 00 00
        363 | (3,2)   |      16 | f     | f    | 6a 01 00 00 00 00 00 00
        364 | (3,3)   |      16 | f     | f    | 6b 01 00 00 00 00 00 00
        365 | (3,4)   |      16 | f     | f    | 6c 01 00 00 00 00 00 00
        366 | (3,5)   |      16 | f     | f    | 6d 01 00 00 00 00 00 00
        367 | (3,6)   |      16 | f     | f    | 6e 01 00 00 00 00 00 00

从root页面的数据来看,有28个leaf页面,前27个页面每个页面保存366条,第28个页面保存118行数据,正好10000条。

代码语言:javascript复制
27 x 366   118 = 10000

20000条integer索引结构

10000到20000页面结构的变化

代码语言:javascript复制
-- 10000条
root(3)
 |
leaf(1) | leaf(2) | leaf(4) | leaf(5) | leaf(6) | ... | leaf(29)

-- 20000条
root(412)                                   (btpo_flags=2)
 |
branch(3) |  beanch(411)                    (btpo_flags=0)
 |         
leaf(287) | leaf(1) | leaf(2) | ... | leaf(286) || leaf(287) | leaf(288) | ... | leaf(550)    (btpo_flags=1)

-- root分裂的两个branch元素数量:285个、262个

场景一:root分裂为branch的前后对比(level1–>2)

149370条记录时level从1到2,root分裂成branch。看下分裂前后树的情况对比。

代码语言:javascript复制
create table t7(id int primary key, info text);
truncate t7;
insert into t7 select generate_series(1,149369), md5(random()::text);

select * from bt_metap('t7_pkey');
 magic  | version | root | level | fastroot | fastlevel 
-------- --------- ------ ------- ---------- -----------
 340322 |       2 |    3 |     1 |        3 |         1


create table t8(id int primary key, info text);
truncate t8;
insert into t8 select generate_series(1,149370), md5(random()::text);
select * from bt_metap('t8_pkey');
 magic  | version | root | level | fastroot | fastlevel 
-------- --------- ------ ------- ---------- -----------
 340322 |       2 |  412 |     2 |      412 |         2

--------


postgres=# select * from bt_page_stats('t7_pkey',3);
 blkno | type | live_items | dead_items | avg_item_size | page_size | free_size | btpo_prev | btpo_next | btpo | btpo_flags 
------- ------ ------------ ------------ --------------- ----------- ----------- ----------- ----------- ------ ------------
     3 | r    |        408 |          0 |            15 |      8192 |         0 |         0 |         0 |    1 |          2
(1 row)


postgres=# select * from bt_page_stats('t8_pkey',412);
 blkno | type | live_items | dead_items | avg_item_size | page_size | free_size | btpo_prev | btpo_next | btpo | btpo_flags 
------- ------ ------------ ------------ --------------- ----------- ----------- ----------- ----------- ------ ------------
   412 | r    |          2 |          0 |            12 |      8192 |      8116 |         0 |         0 |    2 |          2


postgres=# select * from bt_page_items('t7_pkey', 3);
 itemoffset |  ctid   | itemlen | nulls | vars |          data           
------------ --------- --------- ------- ------ -------------------------
          1 | (1,1)   |       8 | f     | f    | 
          2 | (2,1)   |      16 | f     | f    | 6f 01 00 00 00 00 00 00
          3 | (4,1)   |      16 | f     | f    | dd 02 00 00 00 00 00 00
          4 | (5,1)   |      16 | f     | f    | 4b 04 00 00 00 00 00 00
          5 | (6,1)   |      16 | f     | f    | b9 05 00 00 00 00 00 00
          6 | (7,1)   |      16 | f     | f    | 27 07 00 00 00 00 00 00
          7 | (8,1)   |      16 | f     | f    | 95 08 00 00 00 00 00 00
          8 | (9,1)   |      16 | f     | f    | 03 0a 00 00 00 00 00 00
          9 | (10,1)  |      16 | f     | f    | 71 0b 00 00 00 00 00 00
         10 | (11,1)  |      16 | f     | f    | df 0c 00 00 00 00 00 00
...
        403 | (404,1) |      16 | f     | f    | bd 3e 02 00 00 00 00 00
        404 | (405,1) |      16 | f     | f    | 2b 40 02 00 00 00 00 00
        405 | (406,1) |      16 | f     | f    | 99 41 02 00 00 00 00 00
        406 | (407,1) |      16 | f     | f    | 07 43 02 00 00 00 00 00
        407 | (408,1) |      16 | f     | f    | 75 44 02 00 00 00 00 00
        408 | (409,1) |      16 | f     | f    | e3 45 02 00 00 00 00 00

postgres=# select * from bt_page_items('t8_pkey', 412);
 itemoffset |  ctid   | itemlen | nulls | vars |          data           
------------ --------- --------- ------- ------ -------------------------
          1 | (3,1)   |       8 | f     | f    | 
          2 | (411,1) |      16 | f     | f    | 77 97 01 00 00 00 00 00

结构差异

场景二:root分裂为leaf的前后对比(level0–>1)

代码语言:javascript复制
create table tx1(id int primary key, info text);
truncate tx1;
insert into tx1 select generate_series(1,407), md5(random()::text);
select * from bt_metap('tx1_pkey');
 magic  | version | root | level | fastroot | fastlevel 
-------- --------- ------ ------- ---------- -----------
 340322 |       2 |    1 |     0 |        1 |         0
 
select * from bt_page_items('tx1_pkey', 1);

create table tx2(id int primary key, info text);
truncate tx2;
insert into tx2 select generate_series(1,408), md5(random()::text);
select * from bt_metap('tx2_pkey');
 magic  | version | root | level | fastroot | fastlevel 
-------- --------- ------ ------- ---------- -----------
 340322 |       2 |    3 |     1 |        3 |         1
 
select * from bt_page_items('tx2_pkey', 3);

分裂前后对比,root能保存407个int元素,小于这个数字可以不需要leaf,root的就是level=0。
超出后root提升到level=1,两个leaf保存366个int元素。

这里也解决了一个疑问,为什么root节点经常在第三个页面。

场景三:leaf分裂前后对比(level1–>1)

代码语言:javascript复制
create table td1(id int primary key, info text);
truncate td1;
insert into td1 select generate_series(1,773), md5(random()::text);
select * from bt_page_items('td1_pkey', 3);

create table td2(id int primary key, info text);
truncate td2;
insert into td2 select generate_series(1,774), md5(random()::text);
select * from bt_page_items('td2_pkey', 3);

0 人点赞