Postgresql源码(89)column definition list语义解析流程分析

2022-11-30 16:10:36 浏览数 (1)

0 总结

如果调用函数时同时满足以下几种情况

  1. 在from后面。
  2. 返回值为RECORD(或者是anyelement表示的RECORD)(anyelement的实际类型由入参决定,入参是RECORD,返回就是RECORD)。
  3. 返回值被判定为TYPEFUNC_RECORD(普通的RECORD,没有行描述符)。
  4. 函数后面不带列定义(column definition list)。

就会报错:ERROR: a column definition list is required for functions returning "record"

所以一个返回RECORD类型的函数:

  • 要么自己返回带格式的record(TYPEFUNC_COMPOSITE)
  • 要么在from后面加上列定义例如from func(1,2,3) as q(a int, b int)(as后面就是column definition list)

主流程总结?

代码语言:javascript复制
-- SQL1:报错
SELECT * FROM json_populate_record(
	null::record, 
	'{"name": "Tom", "age": 20, "hobbies": ["sports", "cars"], "address": {"country": "CN", "city": "BeiJing"}}');
	
-- SQL2:列定义从column definition list获取
SELECT * FROM json_populate_record(
	null::record, 
	'{"name": "Tom", "age": 20, "hobbies": ["sports", "cars"], "address": {"country": "CN", "city": "BeiJing"}}')
as q(name TEXT, age INT, hobbies TEXT[], address address);

-- SQL3:列定义从null::person获取
SELECT * FROM json_populate_record(
	null::person, 
	'{"name": "Tom", "age": 20, "hobbies": ["sports", "cars"], "address": {"country": "CN", "city": "BeiJing"}}');

语义分析transformRangeFunction中

  1. 对于函数表达式的解析transformExpr给出的结果中,可以发现SQL1、2的函数返回值是record、SQL3的返回值是person
  2. 继续构造行描述符TupleDesc,来源有两个地方:函数返回TupleDesc或SQL中有column definition list。SQL1都没有直接报错。
  3. SQL2的TupleDesc来自于column definition list;SQL3的来自于get_expr_result_type返回的tupdesc。

get_expr_result_type对于SQL2给出TYPEFUNC_RECORD的结果,表示缺失描述符 get_expr_result_type对于SQL3给出TYPEFUNC_COMPOSITE的结果,表示存在描述符,并返回tupdesc

get_expr_result_type是如何判断的?

1、基于transformExpr返回的FuncExpr里面存放的返回值类型。 2、FuncExpr里面存放的返回值类型的判断逻辑是:如果是多态函数(有anyelement的参数),那么anyelement传入的实际类型是什么,返回值就是什么。


1 案例

json_populate_record函数功能

json_populate_record ( base anyelement, from_json json ) → anyelement

  • 按base提供的record模式,from_json提供的数据,拼接元组并返回。(需要object形式的json;按key与列名匹配的规则填充数据)
  • 如果json中字段不全,使用base提供的数据填充。
代码语言:javascript复制
CREATE TYPE address as (country TEXT, city TEXT);
CREATE TYPE person as (name TEXT, age INT, hobbies TEXT[], address address);

场景一:base提供列定义、json提供全量数据

代码语言:javascript复制
SELECT * FROM json_populate_record(
	null::person, 
	'{"name": "Tom", "age": 20, "hobbies": ["sports", "cars"], "address": {"country": "CN", "city": "BeiJing"}}');

 name | age |    hobbies    |   address    
------ ----- --------------- --------------
 Tom  |  20 | {sports,cars} | (CN,BeiJing)

场景二:base提供列定义、json提供部分数据、base补全剩余数据

代码语言:javascript复制
SELECT * FROM json_populate_record(
	('x', 0, ARRAY['sports'], ('CN', 'BeiJing'))::person,
	'{"name": "Tom", "age": 20, "hobbies": ["sports", "cars"]}');

 name | age |    hobbies    |   address    
------ ----- --------------- --------------
 Tom  |  20 | {sports,cars} | (CN,BeiJing)

场景三:column definition list提供列定义、json提供全部数据

代码语言:javascript复制
SELECT * FROM json_populate_record(
	null::record, 
	'{"name": "Tom", "age": 20, "hobbies": ["sports", "cars"], "address": {"country": "CN", "city": "BeiJing"}}')
as q(name TEXT, age INT, hobbies TEXT[], address address);

 name | age |    hobbies    |   address    
------ ----- --------------- --------------
 Tom  |  20 | {sports,cars} | (CN,BeiJing)

场景四:column definition list提供列定义、base提供全部数据(非预期)

代码语言:javascript复制
SELECT * FROM json_populate_record(
	('x'::text, 0, ARRAY['sports'], ('CN', 'BeiJing')::address)::record, 
	'{"name": "Tom", "age": 20, "hobbies": ["sports", "cars"], "address": {"country": "CN", "city": "BeiJing"}}')
as q(name TEXT, age INT, hobbies TEXT[], address address);

 name | age | hobbies  |   address    
------ ----- ---------- --------------
 x    |   0 | {sports} | (CN,BeiJing)

2 column definition list是什么?

从上面案例场景一、三中可以发现,列定义有两种方式提供:

1、从函数参数来:null::person

代码语言:javascript复制
SELECT * FROM json_populate_record(
	null::person, 
	'{"name": "Tom", "age": 20, "hobbies": ["sports", "cars"], "address": {"country": "CN", "city": "BeiJing"}}');
	
-- 返回值类型:person
select pg_typeof(json_populate_record(null::person, '{"name": "Tom", "age": 20, "hobbies": ["sports", "cars"], "address": {"country": "CN", "city": "BeiJing"}}'));
 pg_typeof 
-----------
 person

2、从column definition list来:as q(name TEXT, age INT, hobbies TEXT[], address address)

代码语言:javascript复制
SELECT * FROM json_populate_record(
	null::record, 
	'{"name": "Tom", "age": 20, "hobbies": ["sports", "cars"], "address": {"country": "CN", "city": "BeiJing"}}')
as q(name TEXT, age INT, hobbies TEXT[], address address);

3 column definition list流程分析

下面对语义分析中,“from后面的函数” 的处理流程展开分析:transformRangeFunction

下面三个SQL执行进入transformRangeFunction时,参数有所区别:

代码语言:javascript复制
-- SQL1:报错
SELECT * FROM json_populate_record(
	null::record, 
	'{"name": "Tom", "age": 20, "hobbies": ["sports", "cars"], "address": {"country": "CN", "city": "BeiJing"}}');
	
-- SQL2:列定义从column definition list获取
SELECT * FROM json_populate_record(
	null::record, 
	'{"name": "Tom", "age": 20, "hobbies": ["sports", "cars"], "address": {"country": "CN", "city": "BeiJing"}}')
as q(name TEXT, age INT, hobbies TEXT[], address address);

-- SQL3:列定义从null::person获取
SELECT * FROM json_populate_record(
	null::person, 
	'{"name": "Tom", "age": 20, "hobbies": ["sports", "cars"], "address": {"country": "CN", "city": "BeiJing"}}');

入参:

可以看到SQL1(报错)和SQL3在入参中都没有coldeflist,但SQL1会报错,SQL3解析成功,原因需要分析下transformRangeFunction的逻辑。

3.1 transformRangeFunction:SQL123解析差异(主流程)

代码语言:javascript复制
static ParseNamespaceItem *
transformRangeFunction(ParseState *pstate, RangeFunction *r)
{
	...
	foreach(lc, r->functions)
	{
		List	   *pair = (List *) lfirst(lc);
		Node	   *fexpr;
		List	   *coldeflist;
		Node	   *newfexpr;
		Node	   *last_srf;

		/* Disassemble the function-call/column-def-list pairs */
		Assert(list_length(pair) == 2);
		fexpr = (Node *) linitial(pair);
		coldeflist = (List *) lsecond(pair);

		...// 处理unnest

		/* normal case ... */
		newfexpr = transformExpr(pstate, fexpr,
								 EXPR_KIND_FROM_FUNCTION);

注意:transformExpr出来的结果中SQL2与SQL3的funcresulttype不同!

transformExpr最后展开讲。

  • SQL1:{xpr = {type = T_FuncExpr}, funcid = 3960, funcresulttype = 2249, funcretset = false, funcvariadic = false, funcformat = COERCE_EXPLICIT_CALL, funccollid = 0, inputcollid = 0, args = 0x29f2aa0, location = 14}
  • SQL2:{xpr = {type = T_FuncExpr}, funcid = 3960, funcresulttype = 2249, funcretset = false, funcvariadic = false, funcformat = COERCE_EXPLICIT_CALL, funccollid = 0, inputcollid = 0, args = 0x29f31a8, location = 14}
  • SQL3:{xpr = {type = T_FuncExpr}, funcid = 3960, funcresulttype = 16424, funcretset = false, funcvariadic = false, funcformat = COERCE_EXPLICIT_CALL, funccollid = 0, inputcollid = 0, args = 0x29f2aa0, location = 14}
代码语言:javascript复制
		funcexprs = lappend(funcexprs, newfexpr);
		funcnames = lappend(funcnames,
							FigureColname(fexpr));
		coldeflists = lappend(coldeflists, coldeflist);
	}

coldeflists为空List进入addRangeTableEntryForFunction

代码语言:javascript复制
	return addRangeTableEntryForFunction(pstate,
										 funcnames, funcexprs, coldeflists,
										 r, is_lateral, true);
}

进入addRangeTableEntryForFunction,开始构造:RangeTblEntry

代码语言:javascript复制
ParseNamespaceItem *
addRangeTableEntryForFunction(ParseState *pstate,
							  List *funcnames,
							  List *funcexprs,
							  List *coldeflists,
							  RangeFunction *rangefunc,
							  bool lateral,
							  bool inFromCl)
{
	...

	rte->rtekind = RTE_FUNCTION;
	rte->relid = InvalidOid;
	rte->subquery = NULL;
	rte->functions = NIL;		/* we'll fill this list below */
	rte->funcordinality = rangefunc->ordinality;
	rte->alias = alias;

记录别名,只有SQL2的别名:q

代码语言:javascript复制
	if (alias)
		aliasname = alias->aliasname;
	else
		aliasname = linitial(funcnames);

	eref = makeAlias(aliasname, NIL);
	rte->eref = eref;

准备TupleDesc,元组描述符 就是 列定义。

代码语言:javascript复制
	/* Process each function ... */
	functupdescs = (TupleDesc *) palloc(nfuncs * sizeof(TupleDesc));

	totalatts = 0;
	funcno = 0;
	forthree(lc1, funcexprs, lc2, funcnames, lc3, coldeflists)
	{
		Node	   *funcexpr = (Node *) lfirst(lc1);
		char	   *funcname = (char *) lfirst(lc2);
		List	   *coldeflist = (List *) lfirst(lc3);
		RangeTblFunction *rtfunc = makeNode(RangeTblFunction);
		TypeFuncClass functypclass;
		Oid			funcrettype;

		/* Initialize RangeTblFunction node */
		rtfunc->funcexpr = funcexpr;
		rtfunc->funccolnames = NIL;
		rtfunc->funccoltypes = NIL;
		rtfunc->funccoltypmods = NIL;
		rtfunc->funccolcollations = NIL;
		rtfunc->funcparams = NULL;	/* not set until planning */

get_expr_result_type拿到的结果不同

  • SQL1:functypclass=TYPEFUNC_RECORD
  • SQL2:functypclass=TYPEFUNC_RECORD
  • SQL3:functypclass=TYPEFUNC_COMPOSITE
代码语言:javascript复制
		functypclass = get_expr_result_type(funcexpr,
											&funcrettype,
											&tupdesc);

		if (coldeflist != NIL)
		{
			...
		}
		else
		{

【SQL1】【SQL1】【SQL1】 SQL1只能到这里了,因为SQL1的类型为TYPEFUNC_RECORD、而且没有coldeflist。

代码语言:javascript复制
			if (functypclass == TYPEFUNC_RECORD)
				ereport(ERROR,
						(errcode(ERRCODE_SYNTAX_ERROR),
						 errmsg("a column definition list is required for functions returning "record""),
						 parser_errposition(pstate, exprLocation(funcexpr))));
		}

		if (functypclass == TYPEFUNC_COMPOSITE ||
			functypclass == TYPEFUNC_COMPOSITE_DOMAIN)
		{

【SQL3】【SQL3】【SQL3】 SQL3返回确定性的结果TYPEFUNC_COMPOSITE,进入这个分支。

TupleDesc由get_expr_result_type内部拼接好直接返回的,不用SQL2一样进入CreateTemplateTupleDesc拼接。

代码语言:javascript复制
			/* Composite data type, e.g. a table's row type */
			Assert(tupdesc);
		}
		else if (functypclass == TYPEFUNC_SCALAR)
		{
			/* Base data type, i.e. scalar */
			tupdesc = CreateTemplateTupleDesc(1);
			TupleDescInitEntry(tupdesc,
							   (AttrNumber) 1,
							   chooseScalarFunctionAlias(funcexpr, funcname,
														 alias, nfuncs),
							   funcrettype,
							   exprTypmod(funcexpr),
							   0);
			TupleDescInitEntryCollation(tupdesc,
										(AttrNumber) 1,
										exprCollation(funcexpr));
		}

【SQL2】【SQL2】【SQL2】 SQL2返回不确定record:TYPEFUNC_RECORD,from后带列定义,进入这个分支。

代码语言:javascript复制
		else if (functypclass == TYPEFUNC_RECORD)
		{
			ListCell   *col;

用列定义创建临时的元组描述符。

代码语言:javascript复制
			tupdesc = CreateTemplateTupleDesc(list_length(coldeflist));
			i = 1;
			foreach(col, coldeflist)
			{
				ColumnDef  *n = (ColumnDef *) lfirst(col);
				char	   *attrname;
				Oid			attrtype;
				int32		attrtypmod;
				Oid			attrcollation;

				attrname = n->colname;
				if (n->typeName->setof)
					ereport(ERROR,
							(errcode(ERRCODE_INVALID_TABLE_DEFINITION),
							 errmsg("column "%s" cannot be declared SETOF",
									attrname),
							 parser_errposition(pstate, n->location)));
				typenameTypeIdAndMod(pstate, n->typeName,
									 &attrtype, &attrtypmod);
				attrcollation = GetColumnDefCollation(pstate, n, attrtype);

初始化描述符的列属性。

代码语言:javascript复制
				TupleDescInitEntry(tupdesc,
								   (AttrNumber) i,
								   attrname,
								   attrtype,
								   attrtypmod,
								   0);
				TupleDescInitEntryCollation(tupdesc,
											(AttrNumber) i,
											attrcollation);
				rtfunc->funccolnames = lappend(rtfunc->funccolnames,
											   makeString(pstrdup(attrname)));
				rtfunc->funccoltypes = lappend_oid(rtfunc->funccoltypes,
												   attrtype);
				rtfunc->funccoltypmods = lappend_int(rtfunc->funccoltypmods,
													 attrtypmod);
				rtfunc->funccolcollations = lappend_oid(rtfunc->funccolcollations,
														attrcollation);

				i  ;
			}
			CheckAttributeNamesTypes(tupdesc, RELKIND_COMPOSITE_TYPE,
									 CHKATYPE_ANYRECORD);
		}

		/* Finish off the RangeTblFunction and add it to the RTE's list */
		rtfunc->funccolcount = tupdesc->natts;
		rte->functions = lappend(rte->functions, rtfunc);

		/* Save the tupdesc for use below */
		functupdescs[funcno] = tupdesc;
		totalatts  = tupdesc->natts;
		funcno  ;
	}
	...
	return buildNSItemFromTupleDesc(rte, list_length(pstate->p_rtable),
									tupdesc);
}

3.2 get_expr_result_type:如何判断函数返回值(分支流程)

代码语言:javascript复制
-- SQL2:列定义从column definition list获取
SELECT * FROM json_populate_record(
	null::record, 
	'{"name": "Tom", "age": 20, "hobbies": ["sports", "cars"], "address": {"country": "CN", "city": "BeiJing"}}')
as q(name TEXT, age INT, hobbies TEXT[], address address);

-- SQL3:列定义从null::person获取
SELECT * FROM json_populate_record(
	null::person, 
	'{"name": "Tom", "age": 20, "hobbies": ["sports", "cars"], "address": {"country": "CN", "city": "BeiJing"}}');

为什么get_expr_result_type拿到的结果不同?

  • SQL2:functypclass=TYPEFUNC_RECORD
  • SQL3:functypclass=TYPEFUNC_COMPOSITE

get_expr_result_type

代码语言:javascript复制
get_expr_result_type
  internal_get_result_type
    tp = SearchSysCache1(PROCOID, ObjectIdGetDatum(funcid))
    procform = (Form_pg_proc) GETSTRUCT(tp)
    rettype = procform->prorettype                         -- 【SQL2】【SQL3】2283:anyelement
    tupdesc = build_function_result_tupdesc_t              -- 【SQL2】【SQL3】返回类型不是RECORDOID就会返回NULL

	if (IsPolymorphicType(rettype))    -- 开始解析返回值
		Oid newrettype = exprType(call_expr)  -- 【SQL2】从expr->funcresulttype拿真正返回值2249:record
		                                      -- 【SQL3】从expr->funcresulttype拿真正返回值16424:person
		rettype = newrettype;

	if (resultTypeId)
		*resultTypeId = rettype;	  -- 2249
	if (resultTupleDesc)
		*resultTupleDesc = NULL;

	result = get_type_func_class(rettype, &base_rettype)
		switch get_typtype(typid)      // 【SQL2】找到2249的基础类型:typtype='p'=TYPTYPE_PSEUDO
		                               // 【SQL3】找到16424的基础类型:typtype='c'=TYPTYPE_COMPOSITE
			case TYPTYPE_PSEUDO:
				if (typid == RECORDOID)
					return TYPEFUNC_RECORD;  // 【SQL2】
			case TYPTYPE_COMPOSITE:
				return TYPEFUNC_COMPOSITE;   // 【SQL3】
	
	switch result 
		case TYPEFUNC_RECORD:                //【SQL2】
			break;
		case TYPEFUNC_COMPOSITE_DOMAIN:
			*resultTupleDesc = lookup_rowtype_tupdesc_copy(base_rettype, -1)  // 【SQL3】base_rettype=16424
	
	return result;  // 【SQL2】TYPEFUNC_RECORD
	                // 【SQL3】TYPTYPE_COMPOSITE

从上述分析中可以看出根因差别在FuncExpr的不同:

代码语言:javascript复制
SQL2
FuncExpr = {xpr = {type = T_FuncExpr}, funcid = 3960, funcresulttype = 2249, funcretset = false, funcvariadic = false, funcformat = COERCE_EXPLICIT_CALL, funccollid = 0, inputcollid = 0, args = 0x29f31a8, location = 14}

SQL3
FuncExpr = {xpr = {type = T_FuncExpr}, funcid = 3960, funcresulttype = 16424, funcretset = false, funcvariadic = false, funcformat = COERCE_EXPLICIT_CALL, funccollid = 0, inputcollid = 0, args = 0x29f2aa0, location = 14}

继续往前追溯FuncExpr是在哪拼出来的:transformExpr

  • func_get_detail:处理实际入参、处理默认参数;SQL2、3相同。
  • enforce_generic_type_consistency:
代码语言:javascript复制
transformExpr
  transformExprRecurse
    transformFuncCall
      ParseFuncOrColumn
        ...
		
		//【SQL3】{16424,705,16}     【SQL2】{2249, 705, 16}
		foreach(l, fargs)
			Node *arg = lfirst(l);
			Oid argtype = exprType(arg);
			actual_arg_types[nargs  ] = argtype;
		
        //【SQL2】fdresult = FUNCDETAIL_NORMAL  rettype = 2283(anyelement)
        //【SQL3】fdresult = FUNCDETAIL_NORMAL  rettype = 2283(anyelement)
        fdresult = func_get_detail(&rettype) // 返回rettype
  		
  		// enforce_generic_type_consistency
  		//   直接返回actual_arg_types多态参数位置(第一个参数)
        //【SQL2】返回值rettype = 2249(record)
        //【SQL3】返回值rettype = 16424(person)
        rettype = enforce_generic_type_consistency( // 入参
        	actual_arg_types,		//【SQL3】{16424,705,16}     【SQL2】{2249, 705, 16}
        	declared_arg_types,     //【SQL3】{2283,114,16}      【SQL2】{2283,114,16}
        	nargsplusdefs,          //【SQL3】3                  【SQL2】3
        	rettype,                //【SQL3】2283(anyelement)   【SQL2】2283(anyelement)
        	false)

0 人点赞