跟我一起读postgresql源码(十)——Executor(查询执行模块之——Scan节点(下))

日期：2022-04-07 栏目：程序人生浏览：次

接前文跟我一起读postgresql源码(九)——Executor(查询执行模块之——Scan节点(上)) ,本篇把剩下的七个Scan节点结束掉。

T_SubqueryScanState, T_FunctionScanState, T_ValuesScanState, T_CteScanState, T_WorkTableScanState, T_ForeignScanState, T_CustomScanState, 8.SubqueryScan 节点

SubqueryScan节点的作用是以另一个査询计划树（子计划）为扫描对象进行元组的扫描，其扫描过程最终被转换为子计划的执行。

Postgres子查询主要包含如下几个关键字: EXISTS, IN, NOT IN, ANY/SOME, ALL，详细介绍可以看看：

举例子：

postgres=# explain select id from test_new where exists (select id from test_dm); QUERY PLAN ------------------------------------------------------------------------- Result (cost=0.02..35.52 rows=2550 width=4) One-Time Filter: $0 InitPlan 1 (returns $0) -> Seq Scan on test_dm (cost=0.00..22346.00 rows=1000000 width=0) -> Seq Scan on test_new (cost=0.00..35.50 rows=2550 width=4) (5 行)

下面这个查询虽然也是子查询，但是在查询编译阶段被优化了(提升子连接，主要是把ANY和EXIST子句转换为半连接)

postgres=# explain select id from test_new where exists (select id from test_dm where id = test_new.id); QUERY PLAN ----------------------------------------------------------------------------- Hash Semi Join (cost=38753.00..42736.38 rows=1275 width=4) Hash Cond: (test_new.id = test_dm.id) -> Seq Scan on test_new (cost=0.00..35.50 rows=2550 width=4) -> Hash (cost=22346.00..22346.00 rows=1000000 width=4) -> Seq Scan on test_dm (cost=0.00..22346.00 rows=1000000 width=4) (5 行)

有关内容，这里有一篇讲得很好：PostgreSQL查询优化之子查询优化

SubqueryScan节点在Scan节点之上扩展定义了子计划的根节点指针（subplan字段），而subrtable字段是査询编译器使用的结构，执行器运行时其值为空。

typedef struct SubqueryScan { Scan scan; Plan *subplan; } SubqueryScan;

显然，SubqueryScan节点的初始化过程（ExecInitSubqueryScan函数）会使用ExecInitNode处理SubqueryScan的subplan字段指向的子计划树，并将子计划的PlanStale树根节点指针賦值给SubqueryScanState 的subplan字段。

typedef struct SubqueryScanState { ScanState ss; /* its first field is NodeTag */ PlanState *subplan; } SubqueryScanState;

我认为SubqueryScan节点其实就是一个壳子，为什么这么说呢？因为SubqueryScan节点的执行(ExecSubqueryScan 函数)通过将SubqueryNext 传递给 ExecScan函数处理来实现的。SubqueryNext实际则是调用ExecProcNode处理subplan来获得元组。也就是说，这里SubqueryScan是运行了一个独立的查询计划，然后获取它的结果，而不是自己去扫描表。因此recheck工作就在独立的查询计划里做过了，SubqueryScan节点不必再做。

所以我们可以看到：

static bool SubqueryRecheck(SubqueryScanState *node, TupleTableSlot *slot) { /* nothing to check */ return true; }

上面说了在执行时调用了ExecProcNode处理subplan，那么在清理过程中，很显然需要额外调用ExecEndNode来清理子计划。

9.FuncitonScan 节点

二话不说先上例子：

postgres=# CREATE FUNCTION dup(int) RETURNS TABLE(f1 int, f2 text) postgres-# AS $$ SELECT $1, CAST($1 AS text) || ' is text' $$ postgres-# LANGUAGE SQL; CREATE FUNCTION postgres=# explain SELECT * FROM dup(42); QUERY PLAN ------------------------------------------------------------- Function Scan on dup (cost=0.25..10.25 rows=1000 width=36) (1 行)

在PostgreSQL中，有一些函数可以返回元组的集合，为了能从这些函数的返回值中获取元组，PostgreSQL定义了 FunctionScan节点，其扫描对象为返回元组集的函数。FunctionScan节点在Scan的基础上扩展定义了:

functions列表字段，里面存放了FuncitonScan涉及的函数；

以及funcordinality字段(是否给返回结果加上序号列)。

When a function in the FROM clause is suffixed by WITH ORDINALITY, a bigint column is appended to the output which starts from 1 and increments by 1 for each row of the function's output. This is most useful in the case of set returning functions such as unnest()

详细看这里：

typedef struct FunctionScan { Scan scan; List *functions; /* list of RangeTblFunction nodes */ bool funcordinality; /* WITH ORDINALITY */ } FunctionScan;

FunctionScan 节点的初始化过程（ExecInitFunctionScan 函数）会初始化 FunctionScanState 结构，然后根据FunctionScan的字段functions，对每个函数构造运行时的状态节点FunctionScanPerFuncState，如下：

typedef struct FunctionScanPerFuncState { ExprState *funcexpr; /* state of the expression being evaluated */ TupleDesc tupdesc; /* desc of the function result type */ int colcount; /* expected number of result columns */ Tuplestorestate *tstore; /* holds the function result set */ int64 rowcount; /* # of rows in result set, -1 if not known */ TupleTableSlot *func_slot; /* function result slot (or NULL) */ } FunctionScanPerFuncState;

转载注明出处：https://www.heiqu.com/zzggxd.html

跟我一起读postgresql源码(十)——Executor(查询执行模块之——Scan节点(下))

相关推荐