Postgresql源码(62)查询执行——子模块ProcessUtility

2022-07-14 13:51:12 浏览数 (1)

相关 《Postgresql源码(61)查询执行——最外层Portal模块》 《Postgresql源码(62)查询执行——子模块ProcessUtility》

接上篇 《Postgresql源码(61)查询执行——最外层Portal模块》

1 查询执行整体

PG中的SQL在经过语法解析、查询编译后,进入执行模块,整形模块的分三个子模块:

入口:portal子模块(下图蓝色)

处理DML的Executor子模块(下图绿色)

处理DDL的ProcessUtility子模块(下图橙色)

SQL会在查询编译阶段得到plantree_list,在portal模块启动时(函数PortalStart),根据plantree_list中具体情况(函数ChoosePortalStrategy),来决定PortalStrategy的值,后面执行根据PortalStrategy来决定进入Executor还是ProcessUtility。

本篇重点分析ProcessUtility子模块。

2 分析案例:create table

使用彭老师书中的CASE。

代码语言:javascript复制
create table course(
  no serial,
  name varchar,
  credit int,
  constraint con1 check(credit >=0 and name <> ''),
  primary key(no)
);

2.1 执行前的数据准备

按之前的分析,SQL执行时会先进入portal框架,进入前最主要的数据准备就是查询计划树。

我们看下create table的查询计划树长什么样子:

  1. plantree_list链表包含唯一节点utilityStmt
代码语言:javascript复制
plantree_list [List]
(node0) [PlannedStmt]
  { type = T_PlannedStmt, commandType = CMD_UTILITY, ... ,utilityStmt = 0x11bad68, stmt_location = 0, stmt_len = 135}
  1. utilityStmt保存了CreateStmt类型
代码语言:javascript复制
utilityStmt [CreateStmt]
  { type = T_CreateStmt, relation = 0x11ba028, tableElts = 0x11ba230, ...}
  1. CreateStmt类型中,记录了完整表名RangeVar
代码语言:javascript复制
relation [RangeVar]
  { type = T_RangeVar, catalogname = 0x0, schemaname = 0x0, relname = 0x11ba008 "course", inh = true, relpersistence = 112 'p', alias = 0x0, location = 13}
  1. CreateStmt类型中,记录了要创建的所有表项,包括三个列和两个约束
代码语言:javascript复制
tableElts [List]
(node0) [ColumnDef] 
  {type = T_ColumnDef, colname = 0x11ba080 "no", typeName = 0x11ba140, compression = 0x0, inhcount = 0, is_local = true, is_not_null = false, is_from_type = false, storage = 0 '00', raw_default = 0x0, cooked_default = 0x0, identity = 0 '00', identitySequence = 0x0, generated = 0 '00', collClause = 0x0, collOid = 0, constraints = 0x0, fdwoptions = 0x0, location = 23}
(node1) [ColumnDef] 
  {type = T_ColumnDef, colname = 0x11ba288 "name", typeName = 0x11ba350, compression = 0x0, inhcount = 0, is_local = true, is_not_null = false, is_from_type = false, storage = 0 '00', raw_default = 0x0, cooked_default = 0x0, identity = 0 '00', identitySequence = 0x0, generated = 0 '00', collClause = 0x0, collOid = 0, constraints = 0x0, fdwoptions = 0x0, location = 36}
(node2) [ColumnDef] 
  {type = T_ColumnDef, colname = 0x11ba440 "credit", typeName = 0x11ba508, compression = 0x0, inhcount = 0, is_local = true, is_not_null = false, is_from_type = false, storage = 0 '00', raw_default = 0x0, cooked_default = 0x0, identity = 0 '00', identitySequence = 0x0, generated = 0 '00', collClause = 0x0, collOid = 0, constraints = 0x0, fdwoptions = 0x0, location = 52}
(node3) [Constraint] 
  {type = T_Constraint, contype = CONSTR_CHECK, conname = 0x11ba5f8 "con1", deferrable = false, initdeferred = false, location = 66, is_no_inherit = false, raw_expr = 0x11baa60, cooked_expr = 0x0, generated_when = 0 '00', keys = 0x0, including = 0x0, exclusions = 0x0, options = 0x0, indexname = 0x0, indexspace = 0x0, reset_default_tblspc = false, access_method = 0x0, where_clause = 0x0, pktable = 0x0, fk_attrs = 0x0, pk_attrs = 0x0, fk_matchtype = 0 '00', fk_upd_action = 0 '00', fk_del_action = 0 '00', old_conpfeqop = 0x0, old_pktable_oid = 0, skip_validation = false, initially_valid = true}
(node4) [Constraint] 
  {type = T_Constraint, contype = CONSTR_PRIMARY, conname = 0x0, deferrable = false, initdeferred = false, location = 118, is_no_inherit = false, raw_expr = 0x0, cooked_expr = 0x0, generated_when = 0 '00', keys = 0x11babf8, including = 0x0, exclusions = 0x0, options = 0x0, indexname = 0x0, indexspace = 0x0, reset_default_tblspc = false, access_method = 0x0, where_clause = 0x0, pktable = 0x0, fk_attrs = 0x0, pk_attrs = 0x0, fk_matchtype = 0 '00', fk_upd_action = 0 '00', fk_del_action = 0 '00', old_conpfeqop = 0x0, old_pktable_oid = 0, skip_validation = false, initially_valid = false}
  • 可以看到对于DDL类型走ProcessUtility的执行计划树,一般都会把数据包装到PlannedStmt->utilityStmt(例如上面的例子中,utilityStmt就是一个CreateStmt),计划树的其他变量都是无效的。
  • 注意到CreateStmt类型中记录的表项tableElts,只有三个列和两个约束,那么两个约束这样的节点是无法被直接执行的(比如这里有一个主键约束,需要创建索引;还有一个序列类型,需要创建sequence),这样的plan在后续执行过程中,会有一个提前转换的过程,转换为可执行的计划。

2.2 执行过程

执行个过程总结

  1. 执行计划进入portal模块路由到ProcessUtilitySlow
  2. ProcessUtilitySlow执行建表计划,第一次执行时会有一次重要的转换,将(CreateStmt *) parsetree转换为stmts(例如上面测试SQL,这里会转化为四条语句:建序列、建表、建索引、alter序列)
  3. 四条语句在ProcessUtilitySlow的T_CreateStmt分支的循环中分别执行。

重点:transformCreateStmt一转四、建序列、建表、建索引、alter序列

代码语言:javascript复制
exec_simple_query
  |
  CreatePortal
  |
  PortalDefineQuery
  |
  PortalStart
  |
  PortalRun
  | |
  | PortalRunMulti
  |   |
  |   GetTransactionSnapshot
  |   ProcessUtility
  |     |
  |     standard_ProcessUtility
  |       |
  |       ProcessUtilitySlow
  |         |
  |         switch (nodeTag(parsetree))
  |           case T_CreateStmt:
  |             stmts = transformCreateStmt     // 重要:parsetree转化为执行链表stmts
  |             while (stmts != NIL)            // stmts存在四个节点
  |               (第一轮)ProcessUtility        // (第一轮)建序列
  |                   standard_ProcessUtility
  |                     ProcessUtilitySlow
  |                       DefineSequence
  |               (第二轮)DefineRelation        // (第二轮)建表  
  |               (第三轮)ProcessUtility        // (第三轮)建索引
  |                   standard_ProcessUtility
  |                     ProcessUtilitySlow
  |                       transformIndexStmt
  |                       DefineIndex
  |               (第四轮)ProcessUtility        // (第四轮)alter序列
  |                   standard_ProcessUtility
  |                     ProcessUtilitySlow
  |                       AlterSequence
  |
  PortalDrop

2.3 transformCreateStmt一转四

当前tableElts的状态,具体见2.1。

代码语言:javascript复制
tableElts [List]
(node0) [ColumnDef]  {type = T_ColumnDef, colname = 0x11ba080 "no" ...
(node1) [ColumnDef]  {type = T_ColumnDef, colname = 0x11ba288 "name", ...
(node2) [ColumnDef]  {type = T_ColumnDef, colname = 0x11ba440 "credit", ...
(node3) [Constraint] {type = T_Constraint, contype = CONSTR_CHECK, conname = 0x11ba5f8 "con1" ...
(node4) [Constraint] {type = T_Constraint, contype = CONSTR_PRIMARY, conname = 0x0, ...

执行时遍历tableElts每个节点分别转换

代码语言:javascript复制
transformCreateStmt
  foreach(elements, stmt->tableElts)
    switch (nodeTag(element))
      case T_ColumnDef
        transformColumnDefinition
          // 【1】如果列名是(smallserial、serial2)、(serial、serial4)、(bigserial、serial8)列名转换为INT2OID)、(INT4OID)、(INT8OID)
          // 【2】如果列名是上面任何一种,调用generateSerialExtraStmts生成新的stmt
      case T_Constraint:
        transformTableConstraint
          // 【1】约束配置到CreateStmtContext->ckconstraints
          // 【2】约束配置到CreateStmtContext->ixconstraints
  ...
  ...

输出

代码语言:javascript复制
(gdb) plist result
$126 = 4
$127 = {ptr_value = 0x12c3468, int_value = 19674216, oid_value = 19674216}
$128 = {ptr_value = 0x11bad68, int_value = 18591080, oid_value = 18591080}
$129 = {ptr_value = 0x134c340, int_value = 20235072, oid_value = 20235072}
$130 = {ptr_value = 0x12c3600, int_value = 19674624, oid_value = 19674624}

转换后:
$132 = {type = T_CreateSeqStmt, sequence = 0x134cc28, options = 0x12c3550, ownerId = 0, for_identity = false, if_not_exists = false}
$134 = {type = T_CreateStmt, relation = 0x11ba028, tableElts = 0x134cbd0, inhRelations = 0x0, partbound = 0x0, partspec = 0x0, ofTypename = 0x0, constraints = 0x12c1060, options = 0x0, oncommit = ONCOMMIT_NOOP, tablespacename = 0x0, accessMethod = 0x0, if_not_exists = false}
$146 = {type = T_IndexStmt, idxname = 0x0, relation = 0x11ba028, accessMethod = 0xc5461d "btree", tableSpace = 0x0,indexParams = 0x12c1188, indexIncludingParams = 0x0, options = 0x0, whereClause = 0x0, excludeOpNames = 0x0, idxcomment = 0x0,indexOid = 0, oldNode = 0, oldCreateSubid = 0, oldFirstRelfilenodeSubid = 0, unique = true, primary = true, isconstraint = true, deferrable = false, initdeferred = false, transformed = false, concurrent = false, if_not_exists = false,reset_default_tblspc = false}
$138 = {type = T_AlterSeqStmt, sequence = 0x12c3638, options = 0x12c3768, for_identity = false, missing_ok = false}

2.4 建序列

执行$132 = {type = T_CreateSeqStmt, sequence = 0x134cc28, options = 0x12c3550, ownerId = 0, for_identity = false, if_not_exists = false}

代码语言:javascript复制
DefineSequence
  // 创建序列表
  DefineRelation
  // 为序列表插入一条数据
  fill_seq_with_data
  // 插入pg_sequence系统表
  table_open(pg_sequence)
  // 完事

2.5 建表

执行$134 = {type = T_CreateStmt, relation = 0x11ba028, tableElts = 0x134cbd0, inhRelations = 0x0, partbound = 0x0, partspec = 0x0, ofTypename = 0x0, constraints = 0x12c1060, options = 0x0, oncommit = ONCOMMIT_NOOP, tablespacename = 0x0, accessMethod = 0x0, if_not_exists = false}

from《PostgreSQL数据库内核分析》

2.6 建索引

执行$146 = {type = T_IndexStmt, idxname = 0x0, relation = 0x11ba028, accessMethod = 0xc5461d "btree", tableSpace = 0x0,indexParams = 0x12c1188, indexIncludingParams = 0x0, options = 0x0, whereClause = 0x0, excludeOpNames = 0x0, idxcomment = 0x0,indexOid = 0, oldNode = 0, oldCreateSubid = 0, oldFirstRelfilenodeSubid = 0, unique = true, primary = true, isconstraint = true, deferrable = false, initdeferred = false, transformed = false, concurrent = false, if_not_exists = false,reset_default_tblspc = false}

代码语言:javascript复制
DefineIndex
  makeIndexInfo
  ComputeIndexAttrs
  index_check_primary_key
  index_create
    // 构造索引Descriptor
    ConstructTupleDescriptor
    // 建索引
    heap_create
    // 更新pg_class
    InsertPgClassTuple
    // 更新pg_index
    UpdateIndexRelation

2.7 alter序列

执行$138 = {type = T_AlterSeqStmt, sequence = 0x12c3638, options = 0x12c3768, for_identity = false, missing_ok = false}

代码语言:javascript复制
AlterSequence
  // 打开索引表
  table_open(SequenceRelationId...
  // 读取旧的
  read_seq_tuple
  // 拼新的
  // 写新的
  CatalogTupleUpdate
  // 完事
  table_close

0 人点赞