Postgresql源码(63)查询执行——子模块Executor(1)

2022-07-14 13:49:59 浏览数 (1)

相关 《Postgresql源码(61)查询执行——最外层Portal模块》 《Postgresql源码(62)查询执行——子模块ProcessUtility》 《Postgresql源码(63)查询执行——子模块Executor(1)》

因为时间关系Executor分析拆成几篇,这是第一篇。

1 查询执行整体

PG中的SQL在经过语法解析、查询编译后,进入执行模块,整形模块的分三个子模块:

入口:portal子模块(下图蓝色)

处理DML的Executor子模块(下图绿色)

处理DDL的ProcessUtility子模块(下图橙色)

SQL会在查询编译阶段得到plantree_list,在portal模块启动时(函数PortalStart),根据plantree_list中具体情况(函数ChoosePortalStrategy),来决定PortalStrategy的值,后面执行根据PortalStrategy来决定进入Executor还是ProcessUtility。

本篇重点分析Executor子模块。

2 分析案例

测试SQL

代码语言:javascript复制
drop table course;drop table teacher;drop table teach_course;
create table course(no serial, name varchar, credit int,primary key(no));
insert into course(name, credit) values('Natural', 50);
insert into course(name, credit) values('Math', 30);
insert into course(name, credit) values('Database System', 20);

create table teacher(no serial, name varchar, sex char(1), age int);
insert into teacher(name, sex, age) values('Jack', 'm', 33);
insert into teacher(name, sex, age) values('Jennifer', 'f', 30);

create table teach_course(tno int, cno int, stu_num int);
insert into teach_course(tno, cno, stu_num) values(1, 2, 60);
insert into teach_course(tno, cno, stu_num) values(2, 3, 50);



select t.name, c.name, stu_num
from course as c, teach_course as tc, teacher as t
where c.no = tc.cno and tc.tno = t.no and c.name = 'Database System' and t.name = 'Jennifer';

执行计划

代码语言:javascript复制
explain select t.name, c.name, stu_num
from course as c, teach_course as tc, teacher as t
where c.no = tc.cno and tc.tno = t.no and c.name = 'Database System' and t.name = 'Jennifer';
                                    QUERY PLAN                                     
-----------------------------------------------------------------------------------
 Nested Loop  (cost=24.10..74.58 rows=1 width=68)
   ->  Hash Join  (cost=23.95..62.61 rows=61 width=40)
         Hash Cond: (tc.tno = t.no)
         ->  Seq Scan on teach_course tc  (cost=0.00..30.40 rows=2040 width=12)
         ->  Hash  (cost=23.88..23.88 rows=6 width=36)
               ->  Seq Scan on teacher t  (cost=0.00..23.88 rows=6 width=36)
                     Filter: ((name)::text = 'Jennifer'::text)
   ->  Index Scan using course_pkey on course c  (cost=0.15..0.20 rows=1 width=36)
         Index Cond: (no = tc.cno)
         Filter: ((name)::text = 'Database System'::text)

3 Portal驱动Executor

Executor执行由完成Portal模块拉动,核心函数:

  • ExecutorStart:初始化,构造运行状态存储结构estate
  • ExecutorRun:执行,调用ExecInitNode(函数)、ExecProcNode(函数指针)、ExecEndNode(函数)
  • ExecutorEnd:清理

需要的全部数据全部封装在QueryDesc中,包括计划树。

4 Executor驱动ExecProcNode

我们先看下执行计划和下图中的node树:

代码语言:javascript复制
                                    QUERY PLAN                                     
-----------------------------------------------------------------------------------
 Nested Loop  (cost=24.10..74.58 rows=1 width=68)
   ->  Hash Join  (cost=23.95..62.61 rows=61 width=40)
         Hash Cond: (tc.tno = t.no)
         ->  Seq Scan on teach_course tc  (cost=0.00..30.40 rows=2040 width=12)
         ->  Hash  (cost=23.88..23.88 rows=6 width=36)
               ->  Seq Scan on teacher t  (cost=0.00..23.88 rows=6 width=36)
                     Filter: ((name)::text = 'Jennifer'::text)
   ->  Index Scan using course_pkey on course c  (cost=0.15..0.20 rows=1 width=36)
         Index Cond: (no = tc.cno)
         Filter: ((name)::text = 'Database System'::text)
  • Executor执行时,在ExecutePlan内循环调用ExecProcNode,每一次拿出来一行元组,直到拿到所有所需元组位置。
  • ExecProcNode每次调用时:
    • 【1】首先拉动语法树根节点:例如上面例子中,会先执行ExecNestLoop,在执行ExecNestLoop时,该节点因为缺数据无法循环嵌套链接,所以肯定要拉动outter plan(hash join)和inner plan(index scan)把需要循环嵌套连接的两个数据拿回来,才能执行连接。
    • 【2】然后hash join在执行时,又会拉动seqscan节点去扫描拿到元组。
  • 所以就是这样由根节点驱动,逐层返回数据,最终拼出一条结果返回给ExecProcNode,ExecProcNode拿到一条结果后,ExecutePlan继续循环调用ExecutePlan拿到后面的结果。

5 总结

我们发现PG执行计划每个节点都是由两个子节点返回数据的(实际上计划树的每个node都是0-2进1出的结构)。在拿到一条执行计划后,直观上可以理解为上层节点(Nested Loop)首先执行,执行是通过自己的两个子节点(Hash Join、Index Scan)拿到数据,子节点又通过自己的子节点拿到数据。这样层层驱动整个计划树的运行。

代码语言:javascript复制
                                    QUERY PLAN                                     
-----------------------------------------------------------------------------------
 Nested Loop  (cost=24.10..74.58 rows=1 width=68)
   ->  Hash Join  (cost=23.95..62.61 rows=61 width=40)
         Hash Cond: (tc.tno = t.no)
         ->  Seq Scan on teach_course tc  (cost=0.00..30.40 rows=2040 width=12)
         ->  Hash  (cost=23.88..23.88 rows=6 width=36)
               ->  Seq Scan on teacher t  (cost=0.00..23.88 rows=6 width=36)
                     Filter: ((name)::text = 'Jennifer'::text)
   ->  Index Scan using course_pkey on course c  (cost=0.15..0.20 rows=1 width=36)
         Index Cond: (no = tc.cno)
         Filter: ((name)::text = 'Database System'::text)

0 人点赞