接前文跟我一起读postgresql源码(九)——Executor(查询执行模块之——Scan节点(上)) ,本篇把剩下的七个Scan节点结束掉。
T_SubqueryScanState, T_FunctionScanState, T_ValuesScanState, T_CteScanState, T_WorkTableScanState, T_ForeignScanState, T_CustomScanState, 8.SubqueryScan 节点SubqueryScan节点的作用是以另一个査询计划树(子计划)为扫描对象进行元组的扫描,其扫描过程最终被转换为子计划的执行。
Postgres子查询主要包含如下几个关键字: EXISTS, IN, NOT IN, ANY/SOME, ALL,详细介绍可以看看:
举例子:
postgres=# explain select id from test_new where exists (select id from test_dm); QUERY PLAN ------------------------------------------------------------------------- Result (cost=0.02..35.52 rows=2550 width=4) One-Time Filter: $0 InitPlan 1 (returns $0) -> Seq Scan on test_dm (cost=0.00..22346.00 rows=1000000 width=0) -> Seq Scan on test_new (cost=0.00..35.50 rows=2550 width=4) (5 行)下面这个查询虽然也是子查询,但是在查询编译阶段被优化了(提升子连接,主要是把ANY和EXIST子句转换为半连接)
postgres=# explain select id from test_new where exists (select id from test_dm where id = test_new.id); QUERY PLAN ----------------------------------------------------------------------------- Hash Semi Join (cost=38753.00..42736.38 rows=1275 width=4) Hash Cond: (test_new.id = test_dm.id) -> Seq Scan on test_new (cost=0.00..35.50 rows=2550 width=4) -> Hash (cost=22346.00..22346.00 rows=1000000 width=4) -> Seq Scan on test_dm (cost=0.00..22346.00 rows=1000000 width=4) (5 行)有关内容,这里有一篇讲得很好:PostgreSQL查询优化之子查询优化
SubqueryScan节点在Scan节点之上扩展定义了子计划的根节点指针(subplan字段),而subrtable字段是査询编译器使用的结构,执行器运行时其值为空。
typedef struct SubqueryScan { Scan scan; Plan *subplan; } SubqueryScan;显然,SubqueryScan节点的初始化过程(ExecInitSubqueryScan函数)会使用ExecInitNode处理SubqueryScan的subplan字段指向的子计划树,并将子计划的PlanStale树根节点指针賦值给SubqueryScanState 的subplan字段。
typedef struct SubqueryScanState { ScanState ss; /* its first field is NodeTag */ PlanState *subplan; } SubqueryScanState;我认为SubqueryScan节点其实就是一个壳子,为什么这么说呢?因为SubqueryScan节点的执行(ExecSubqueryScan 函数)通过将SubqueryNext 传递给 ExecScan函数处理来实现的。SubqueryNext实际则是调用ExecProcNode处理subplan来获得元组。也就是说,这里SubqueryScan是运行了一个独立的查询计划,然后获取它的结果,而不是自己去扫描表。因此recheck工作就在独立的查询计划里做过了,SubqueryScan节点不必再做。
所以我们可以看到:
static bool SubqueryRecheck(SubqueryScanState *node, TupleTableSlot *slot) { /* nothing to check */ return true; }上面说了在执行时调用了ExecProcNode处理subplan,那么在清理过程中,很显然需要额外调用ExecEndNode来清理子计划。
9.FuncitonScan 节点二话不说先上例子:
postgres=# CREATE FUNCTION dup(int) RETURNS TABLE(f1 int, f2 text) postgres-# AS $$ SELECT $1, CAST($1 AS text) || ' is text' $$ postgres-# LANGUAGE SQL; CREATE FUNCTION postgres=# explain SELECT * FROM dup(42); QUERY PLAN ------------------------------------------------------------- Function Scan on dup (cost=0.25..10.25 rows=1000 width=36) (1 行)在PostgreSQL中,有一些函数可以返回元组的集合,为了能从这些函数的返回值中获取元组,PostgreSQL定义了 FunctionScan节点,其扫描对象为返回元组集的函数。FunctionScan节点在Scan的基础上扩展定义了:
functions列表字段,里面存放了FuncitonScan涉及的函数;
以及funcordinality字段(是否给返回结果加上序号列)。
When a function in the FROM clause is suffixed by WITH ORDINALITY, a bigint column is appended to the output which starts from 1 and increments by 1 for each row of the function's output. This is most useful in the case of set returning functions such as unnest()详细看这里:
typedef struct FunctionScan { Scan scan; List *functions; /* list of RangeTblFunction nodes */ bool funcordinality; /* WITH ORDINALITY */ } FunctionScan;FunctionScan 节点的初始化过程(ExecInitFunctionScan 函数)会初始化 FunctionScanState 结构,然后根据FunctionScan的字段functions,对每个函数构造运行时的状态节点FunctionScanPerFuncState,如下:
typedef struct FunctionScanPerFuncState { ExprState *funcexpr; /* state of the expression being evaluated */ TupleDesc tupdesc; /* desc of the function result type */ int colcount; /* expected number of result columns */ Tuplestorestate *tstore; /* holds the function result set */ int64 rowcount; /* # of rows in result set, -1 if not known */ TupleTableSlot *func_slot; /* function result slot (or NULL) */ } FunctionScanPerFuncState;