sql: internal error when using UDFs with OUT parameters #120942

yuzefovich · 2024-03-23T20:27:13Z

This issue tracks addressing a few TODOs with commented-out test cases where we currently encounter an internal error with UDFs and SPs (stored procedures) that have OUT parameters. For example,

CREATE FUNCTION f(INOUT param1 INT, OUT param2 INT) RETURNS RECORD AS $$
BEGIN
  param2 := 2;
  RAISE NOTICE '%', param2;
END
$$ LANGUAGE PLpgSQL;

SELECT * FROM f(3);

results in

NOTICE: 2
ERROR: internal error: invalid datum type given: RECORD, expected INT8
SQLSTATE: XX000
DETAIL: stack trace:
github.com/cockroachdb/cockroach/pkg/sql/rowenc/encoded_datum.go:195: DatumToEncDatum()
github.com/cockroachdb/cockroach/pkg/sql/rowexec/project_set.go:333: toEncDatum()
github.com/cockroachdb/cockroach/pkg/sql/rowexec/project_set.go:240: nextGeneratorValues()
github.com/cockroachdb/cockroach/pkg/sql/rowexec/project_set.go:312: Next()
github.com/cockroachdb/cockroach/pkg/sql/colexec/columnarizer.go:239: Next()
github.com/cockroachdb/cockroach/pkg/sql/colflow/stats.go:118: next()
...

It is likely to have the same root cause as #113186 and #114846.

Jira issue: CRDB-36960

121163: rowexec: prevent panics with UDFs in edge cases r=yuzefovich a=yuzefovich

This commit makes it so that we don't panic with an assertion failure in the projectSet processor whenever the datum is of an unexpected type. We've seen this scenario happen in a few different UDFs / SPs, and normally it would lead to an internal error, but if the vectorized engine is disabled, it would crash the node. This patch makes it so that it's always an internal error.

Informs: #113186.
Informs: #114846.
Informs: #120942.

Epic: None

Release note: None

Co-authored-by: Yahor Yuzefovich

119616: opt: correctly reconcile column definition list with RECORD-returning UDFs r=DrewKimball a=DrewKimball

#### sql: remove usages of `types.IsRecordType`

Previously, the `types.IsRecordType` function was used in different contexts (with vs without OUT-params, function params vs return type). This made it difficult to determine whether a particular usage was correct, and led to a few bugs in cases where additional checks were necessary.

This commit replaces usages of `types.IsRecordType` with either:

1. `typ.Identical(types.AnyTuple)`, or
2. `typ.Oid() == oid.T_record`

The former should be used for a RECORD-returning routine with no OUT-parameters, as well as for a RECORD-typed variable. The latter should be used to match either a RECORD-returning routine, or one with multiple OUT-parameters.

Informs #114846

Release note: None

#### optbuilder: use actual arg types when building routine with wildcard types

Previously, we would always pass the static parameter types when building the routine. However, in some cases the static type is a wildcard, so we need to use the actual argument type. Note that always using the actual argument type can be incorrect (e.g. we'd lose the tuple labels present in the static type).

Release note: None

#### opt: refactor optbuild paths for routines and generator functions

This commit heavily refactors the type-handling logic for routines and generator functions. The hope is to make the code more readable, and also make the changes in the next commit easier.

Informs #114846

Release note: None

#### opt: add assignment casts for UDFs used as a data source

Previously, attempting to use a RECORD-returning UDF as a data source (e.g. `SELECT * FROM` syntax) would result in an internal error if the column definition list types didn't match the columns of the last statement. This commit fixes that by adding validation that the types are either identical or can be assignment-casted, and adding assignment casts if necessary.

Fixes #114846
Fixes #113186

Release note (bug fix): Fixed a bug that could cause an internal error of the form `invalid datum type given: ..., expected ...` when a RECORD-returning UDF used as a data source was supplied a column definition list with mismatched types. This bug has existed since v23.1.0.

#### opt/optbuilder: check for coercibility instead of tuple types

This commit changes the handling of tuple-returning routines to mirror that of postgres. In particular, when the routine return type is a tuple, postgres first attempts to coerce result columns to the return type of the routine. Only if that attempt fails, postgres wraps the result column in a tuple, and again attempts the coercion.

This change affects the handling of routines that return (for example) a single composite-typed column. For example, the following two logic tests should produce the same result:

```
statement ok
CREATE TYPE two_typ AS (x INT, y INT);
CREATE FUNCTION f() RETURNS two_typ LANGUAGE SQL AS $$ SELECT 1, 2; $$;

query T
SELECT f();
----
(1,2)
```

vs

```
statement ok
CREATE TYPE two_typ AS (x INT, y INT);
CREATE FUNCTION f() RETURNS two_typ LANGUAGE SQL AS $$ SELECT ROW(1, 2); $$;

query T
SELECT f();
----
(1,2)
```

There is no release note, since this shouldn't affect versions prior to 24.1.

Fixes #120942

Release note: None

Co-authored-by: Drew Kimball
Co-authored-by: Yahor Yuzefovich
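For readers unfamiliar with the column definition list case that the `opt: add assignment casts for UDFs used as a data source` commit in #119616 describes, here is a minimal sketch. The function name `g`, the alias `t`, and the column names are hypothetical and do not come from the PR. No output is shown, since the behavior depends on the version being run.

```sql
-- Hypothetical RECORD-returning UDF; its last statement yields (INT, TEXT).
CREATE FUNCTION g() RETURNS RECORD LANGUAGE SQL AS $$ SELECT 1::INT, 'a'::TEXT; $$;

-- The column definition list declares x as FLOAT, which differs from the INT
-- produced by the function body. Per the commit description, the types must be
-- identical or assignment-castable, and an assignment cast is added if needed.
SELECT * FROM g() AS t(x FLOAT, y TEXT);
```

This is the mismatched-types shape that the release note for #114846 and #113186 refers to. It is not the OUT-parameter reproduction from this issue.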

cockroach-teamcity added this to SQL Queries Mar 23, 2024

github-project-automation bot moved this to Triage in SQL Queries Mar 23, 2024

mgartner moved this from Triage to 24.1 Release in SQL Queries Mar 26, 2024

yuzefovich mentioned this issue Mar 26, 2024

rowexec: prevent panics with UDFs in edge cases #121163

Merged

blathers-crl bot mentioned this issue Mar 27, 2024

release-23.2: rowexec: prevent panics with UDFs in edge cases #121172

Merged

DrewKimball self-assigned this Apr 3, 2024

DrewKimball moved this from 24.1 Release to Active in SQL Queries Apr 3, 2024

yuzefovich mentioned this issue Apr 4, 2024

opt: correctly reconcile column definition list with RECORD-returning UDFs #119616

Merged

craig bot closed this as completed in 24614e1 Apr 12, 2024

github-project-automation bot moved this from Active to Done in SQL Queries Apr 12, 2024

blathers-crl bot mentioned this issue Apr 12, 2024

release-24.1: opt: correctly reconcile column definition list with RECORD-returning UDFs #122305

Merged
