Postgresql源码(57)HOT更新为什么性能差距那么大?

2022-06-30 15:19:30 浏览数 (1)

  • BUFFER不变
  • hot_attrs_checked:PAGE没满
  • modified_attrs:有索引列被修改了

TPS

HOT

6035.041159

NOHOT

2447.563057

IO状况截取

代码语言:javascript复制
nohot

transaction type: ./test_nohot.sql
scaling factor: 1
query mode: prepared
number of clients: 32
number of threads: 32
duration: 120 s
number of transactions actually processed: 293701
latency average = 13.073 ms
latency stddev = 31.841 ms
initial connection time = 5.961 ms
tps = 2447.563057 (without initial connection time)
statement latencies in milliseconds:
         0.001  set id random(1,10000000)  
        13.064  update t0 set n1 = random()*100 where id = :id;
代码语言:javascript复制
hot

transaction type: ./test_hot.sql
scaling factor: 1
query mode: prepared
number of clients: 32
number of threads: 32
duration: 120 s
number of transactions actually processed: 724192
latency average = 5.302 ms
latency stddev = 14.019 ms
initial connection time = 6.242 ms
tps = 6035.041159 (without initial connection time)
statement latencies in milliseconds:
         0.001  set id random(1,10000000)  
         5.321  update t0 set n2 = random()*100 where id = :id;

测试CASE

代码语言:javascript复制
drop table t0;
create table t0(id int primary key, n1 int, n2 int);
create index idx_t0_id_n1 on t0(id, n1);
insert into t0 select generate_series(1,10000000);


vi test_nohot.sql  
set id random(1,10000000)  
update t0 set n1 = random()*100 where id = :id;

pgbench -M prepared -n -r -P 1 -f ./test_nohot.sql -c 32 -j 32 -T 120  


vi test_hot.sql  
set id random(1,10000000)  
update t0 set n2 = random()*100 where id = :id;

pgbench -M prepared -n -r -P 1 -f ./test_hot.sql -c 32 -j 32 -T 120  

2 update流程

从ExecUpdate函数进入

  1. 更新前执行:触发器、外键触发器、FDW等
  2. 检查新元组约束ExecConstraints
  3. 执行更新table_tuple_update
  4. 插入新索引元组ExecInsertIndexTuples(HOT更新不需要更新索引)
代码语言:javascript复制
ExecUpdate
  /* 更新前触发器、外键触发器、FDW等 */
  /* BEFORE ROW UPDATE Triggers */
  /* INSTEAD OF ROW UPDATE Triggers */
  /* update in foreign table: let the FDW do it */
  ...
  ExecConstraints
  table_tuple_update
  /* 可能走HOT不需要新索引项 */
  ExecInsertIndexTuples
  /* AFTER ROW UPDATE Triggers */
  /* Process RETURNING if present */
  ...
  return

2.1 update流程细节(HOT)

再展开上面流程

  1. 更新前:在ExecutorState内存中重新组装元组ExecMaterializeSlot
  2. 更新前:拿到所有相关索引ExecOpenIndices
  3. 更新前:执行:触发器、外键触发器、FDW等
  4. 检查:新元组约束ExecConstraints
  5. 执行更新:table_tuple_update
    1. 返回是否需要更新索引,如果是HOT则不需要更新索引
    2. heap_update执行具体更新流程(下面继续展开分析)
  6. 插入新索引元组ExecInsertIndexTuples(HOT不需要更新索引)
代码语言:javascript复制
ExecUpdate
  ExecMaterializeSlot
    tts_buffer_heap_materialize
      // 切换到ExecutorState内存上下文
      // 在新内存池中把元组拼出来
      heap_form_tuple
  /* 更新前触发器、外键触发器、FDW等 */
  ...
  /* 拿到相关索引oid list */
  ExecOpenIndices
  /* BEFORE ROW UPDATE Triggers */
  /* INSTEAD OF ROW UPDATE Triggers */
  /* update in foreign table: let the FDW do it */
  ...
  ExecConstraints
  table_tuple_update
    heapam_tuple_update
      // 执行更新
      heap_update
      // 是否需要更新索引?HOT?
      *update_indexes = result == TM_Ok && !HeapTupleIsHeapOnly(tuple)
  /* 可能走HOT不需要新索引项 */
  if (... && update_indexes)
    ExecInsertIndexTuples
      // 对每个相关索引都需要构造新的index tuple
      for (i = 0; i < numIndices; i  )
        FormIndexDatum
        index_insert
        // checkUnique
  /* AFTER ROW UPDATE Triggers */
  /* Process RETURNING if present */
  ...
  return

2.2 heap_update函数执行分析(HOT)

在展开上面流程中的heap_update函数:

执行:

代码语言:javascript复制
drop table t0;
create table t0(id int primary key, n1 int, n2 int);
create index idx_t0_id_n1 on t0(id, n1);
insert into t0 select generate_series(1,2), generate_series(1,2) 100, 888;
update t0 set n2 = 0 where id = 2;

第一步:bitmap配置,找到所有索引列的位置

这里的位图是什么参考这一篇:《Postgresql源码(52)bitmapset分析RelationGetIndexAttrBitmap》

生成三个位图记录索引位置:hot_attrs、key_attrs、id_attrs

代码语言:javascript复制
          n1  id
            /
            ||
hot_attrs = 1100000000  :哪些列上有索引?
key_attrs =  100000000  :哪些列上有唯一索引?
id_attrs  =  100000000  :复制标识

第二步:如果PAGE没满,可能走HOT,构造bitmap interesting_attrs

配置:hot_attrs_checked=true

代码语言:javascript复制
                  n1  id
                    /
                    ||
interesting_attrs = 1100000000

第三步:锁页面

第四步:哪些索引列被修改了?

HeapDetermineColumnsInfo函数构造modified_attrs位图,这里没有索引列被修改

代码语言:javascript复制
modified_attrs = 0000000000

这个位图记录的是有哪些索引列被修改了,注意是索引列

第五步:旧元组是否可见?

HeapTupleSatisfiesUpdate(MVCC)

第六步:决定是否走HOT

满足三个条件:

  • BUFFER不变
  • hot_attrs_checked:上面第二步:PAGE没满
  • modified_attrs:上面第四步:有索引列被修改了
代码语言:javascript复制
if (newbuf == buffer)
    if (hot_attrs_checked && !bms_overlap(modified_attrs, hot_attrs))
        use_hot_update = true;

第七步:开始HOT更新

  • 一、加标志位:
    • 老元组:HEAP_HOT_UPDATED、HEAP_ONLY_TUPLE
    • 新元组:HEAP_ONLY_TUPLE
  • 二、配置旧元组的头
  • 三、插入新元组
代码语言:javascript复制
if (use_hot_update) 
    HeapTupleSetHotUpdated(&oldtup)
        (tup)->t_infomask2 |= HEAP_HOT_UPDATED
    HeapTupleSetHeapOnly(heaptup)
         (tup)->t_infomask2 |= HEAP_ONLY_TUPLE
    HeapTupleSetHeapOnly(newtup)
        (tup)->t_infomask2 |= HEAP_ONLY_TUPLE

第八步:记录XLOG

log_heap_update

0 人点赞