sql: internal error when using UDFs with OUT parameters #120942

Closed
yuzefovich opened this issue Mar 23, 2024 · 0 comments · Fixed by #119616
Labels
A-sql-routine UDFs and Stored Procedures branch-release-24.1 Used to mark GA and release blockers, technical advisories, and bugs for 24.1 C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. GA-blocker T-sql-queries SQL Queries Team

yuzefovich commented Mar 23, 2024

This issue tracks a few TODOs with commented-out test cases where UDFs and stored procedures (SPs) with OUT parameters currently hit an internal error. For example,

CREATE FUNCTION f(INOUT param1 INT, OUT param2 INT) RETURNS RECORD AS $$
BEGIN
  param2 := 2;
  RAISE NOTICE '%', param2;
END
$$ LANGUAGE PLpgSQL;

SELECT * FROM f(3);

results in

NOTICE: 2
ERROR: internal error: invalid datum type given: RECORD, expected INT8
SQLSTATE: XX000
DETAIL: stack trace:
github.com/cockroachdb/cockroach/pkg/sql/rowenc/encoded_datum.go:195: DatumToEncDatum()
github.com/cockroachdb/cockroach/pkg/sql/rowexec/project_set.go:333: toEncDatum()
github.com/cockroachdb/cockroach/pkg/sql/rowexec/project_set.go:240: nextGeneratorValues()
github.com/cockroachdb/cockroach/pkg/sql/rowexec/project_set.go:312: Next()
github.com/cockroachdb/cockroach/pkg/sql/colexec/columnarizer.go:239: Next()
github.com/cockroachdb/cockroach/pkg/sql/colflow/stats.go:118: next()
...

It is likely to have the same root cause as #113186 and #114846.

Jira issue: CRDB-36960

@yuzefovich yuzefovich added C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. GA-blocker T-sql-queries SQL Queries Team A-sql-routine UDFs and Stored Procedures branch-release-24.1 Used to mark GA and release blockers, technical advisories, and bugs for 24.1 labels Mar 23, 2024
@github-project-automation github-project-automation bot moved this to Triage in SQL Queries Mar 23, 2024
@mgartner mgartner moved this from Triage to 24.1 Release in SQL Queries Mar 26, 2024
craig bot pushed a commit that referenced this issue Mar 27, 2024
121163: rowexec: prevent panics with UDFs in edge cases r=yuzefovich a=yuzefovich

This commit makes it so that we don't panic with an assertion failure in the projectSet processor whenever the datum is of an unexpected type. We've seen this scenario happen in a few different UDFs / SPs, and normally it would lead to an internal error, but if the vectorized engine is disabled, it would crash the node. This patch makes it so that it's always an internal error.
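
As a rough illustration of the idea (not the actual projectSet code), a type check that used to be an assertion can return an error instead, so a mismatch fails the query without taking down the process. The `datum` type, `toEncoded` function, and `errInternal` value below are hypothetical names chosen for this sketch.

```go
package main

import (
	"errors"
	"fmt"
)

// datum is a hypothetical stand-in for a SQL value tagged with its type name.
type datum struct {
	typ string
	val any
}

// errInternal marks errors that should surface to the client as internal errors.
var errInternal = errors.New("internal error")

// toEncoded checks the datum against the expected column type. On a mismatch
// it returns an error rather than panicking, so the caller can fail the query
// while the process keeps running.
func toEncoded(d datum, expected string) (string, error) {
	if d.typ != expected {
		return "", fmt.Errorf("%w: invalid datum type given: %s, expected %s",
			errInternal, d.typ, expected)
	}
	return fmt.Sprintf("%s:%v", d.typ, d.val), nil
}

func main() {
	_, err := toEncoded(datum{typ: "RECORD", val: "(3,2)"}, "INT8")
	fmt.Println(err)
}
```

The error text here imitates the message from the issue report only to make the connection visible; the real encoding path is more involved.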

Informs: #113186.
Informs: #114846.
Informs: #120942.

Epic: None

Release note: None

Co-authored-by: Yahor Yuzefovich <[email protected]>
@DrewKimball DrewKimball self-assigned this Apr 3, 2024
@DrewKimball DrewKimball moved this from 24.1 Release to Active in SQL Queries Apr 3, 2024
DrewKimball added a commit to DrewKimball/cockroach that referenced this issue Apr 5, 2024
This commit changes the handling of tuple-returning routines to mirror
that of postgres. In particular, when the routine return type is a tuple,
postgres first attempts to coerce result columns to the return type of the
routine. Only if that attempt fails, postgres wraps the result column in
a tuple, and again attempts the coercion. This change affects the handling
of routines that return (for example) a single composite-typed column. For
example, the following two logic tests should produce the same result:
```
statement ok
CREATE TYPE two_typ AS (x INT, y INT);
CREATE FUNCTION f() RETURNS two_typ LANGUAGE SQL AS $$ SELECT 1, 2; $$;

query T
SELECT f();
----
(1,2)
```
vs
```
statement ok
CREATE TYPE two_typ AS (x INT, y INT);
CREATE FUNCTION f() RETURNS two_typ LANGUAGE SQL AS $$ SELECT ROW(1, 2); $$;

query T
SELECT f();
----
(1,2)
```
There is no release note, since this shouldn't affect versions prior to 24.1.
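
The two-attempt coercion described above can be sketched with a toy type model. Everything here is hypothetical (`typ`, `canCoerce`, `resolveReturn`), the coercion rule is deliberately simplistic, and the sketch assumes the first attempt applies to a single result column; it only shows the order of the two attempts, not the optbuilder implementation.

```go
package main

import "fmt"

// typ is a hypothetical, simplified SQL type: a scalar has a name only,
// a tuple has tuple set to true and lists its element types.
type typ struct {
	name  string
	tuple bool
	elems []typ
}

// canCoerce is a toy coercion rule: scalars must have the same name, and
// tuples must have the same length with pairwise coercible elements.
func canCoerce(from, to typ) bool {
	if from.tuple != to.tuple {
		return false
	}
	if !from.tuple {
		return from.name == to.name
	}
	if len(from.elems) != len(to.elems) {
		return false
	}
	for i := range from.elems {
		if !canCoerce(from.elems[i], to.elems[i]) {
			return false
		}
	}
	return true
}

// resolveReturn tries the result as-is first and wraps the columns in a tuple
// only if that fails. It returns which attempt succeeded, or "" if none did.
func resolveReturn(cols []typ, ret typ) string {
	if len(cols) == 1 && canCoerce(cols[0], ret) {
		return "direct"
	}
	if canCoerce(typ{tuple: true, elems: cols}, ret) {
		return "wrapped"
	}
	return ""
}

func main() {
	intT := typ{name: "INT"}
	twoTyp := typ{tuple: true, elems: []typ{intT, intT}}
	fmt.Println(resolveReturn([]typ{intT, intT}, twoTyp)) // models SELECT 1, 2
	fmt.Println(resolveReturn([]typ{twoTyp}, twoTyp))     // models SELECT ROW(1, 2)
}
```

In this model both bodies resolve to the same return type by different routes, which is why the two logic tests are expected to print the same row.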

Fixes cockroachdb#120942

Release note: None
DrewKimball added a commit to DrewKimball/cockroach that referenced this issue Apr 9, 2024
craig bot pushed a commit that referenced this issue Apr 12, 2024
119616: opt: correctly reconcile column definition list with RECORD-returning UDFs r=DrewKimball a=DrewKimball

#### sql: remove usages of `types.IsRecordType`

Previously, the `types.IsRecordType` function was used in different
contexts (with vs without OUT-params, function params vs return type).
This made it difficult to determine whether a particular usage was
correct, and led to a few bugs in cases where additional checks
were necessary.

This commit replaces usages of `types.IsRecordType` with either:
1. `typ.Identical(types.AnyTuple)`, or
2. `typ.Oid() == oid.T_record`

The former should be used for a RECORD-returning routine with no
OUT-parameters, as well as for a RECORD-typed variable. The latter
should be used to match either a RECORD-returning routine, or one
with multiple OUT-parameters.
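
To make the difference between the two checks concrete, here is a toy model. The `typ` struct and both helper functions are hypothetical and only imitate the intent of `typ.Identical(types.AnyTuple)` and `typ.Oid() == oid.T_record`; they are not the CockroachDB type system.

```go
package main

import "fmt"

// oidRecord is the Postgres OID of the RECORD pseudo-type.
const oidRecord = 2249

// typ is a hypothetical, simplified type descriptor.
type typ struct {
	oid      int
	wildcard bool     // true only for the "any tuple" wildcard
	elems    []string // element types, when they are known
}

// anyTuple models the wildcard RECORD type with unknown contents.
var anyTuple = typ{oid: oidRecord, wildcard: true}

// isAnyTuple is the narrow check: only the contents-unknown wildcard matches.
func isAnyTuple(t typ) bool {
	return t.oid == oidRecord && t.wildcard && len(t.elems) == 0
}

// hasRecordOid is the broad check: any type carrying the RECORD OID matches,
// including a tuple built from multiple OUT parameters.
func hasRecordOid(t typ) bool {
	return t.oid == oidRecord
}

func main() {
	// Models the return type of a routine with two OUT parameters.
	outParams := typ{oid: oidRecord, elems: []string{"INT", "INT"}}
	fmt.Println(isAnyTuple(anyTuple), hasRecordOid(anyTuple))
	fmt.Println(isAnyTuple(outParams), hasRecordOid(outParams))
}
```

The narrow check separates "RECORD with unknown columns" from "RECORD whose columns come from OUT parameters", which is the distinction the single `IsRecordType` helper blurred.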

Informs #114846

Release note: None

#### optbuilder: use actual arg types when building routine with wildcard types

Previously, we would always pass the static parameter types when
building the routine. However, in some cases the static type is
a wildcard, so we need to use the actual argument type instead. Note
that always using the actual argument type can be incorrect (e.g. we'd
lose the tuple labels present in the static type).
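
A minimal sketch of that selection rule, using hypothetical names (`typ`, `paramTypes`) and type names as plain strings: the static type wins unless it is a wildcard, in which case the actual argument type is used.

```go
package main

import "fmt"

// typ is a hypothetical type descriptor; wildcard marks a polymorphic
// static parameter type whose concrete type is only known from the argument.
type typ struct {
	name     string
	wildcard bool
}

// paramTypes keeps each static type unless it is a wildcard, in which case
// the actual argument type is substituted. Non-wildcard static types are
// kept so that details such as tuple labels are not lost.
func paramTypes(static, actual []typ) []typ {
	out := make([]typ, len(static))
	for i, s := range static {
		if s.wildcard && i < len(actual) {
			out[i] = actual[i]
		} else {
			out[i] = s
		}
	}
	return out
}

func main() {
	static := []typ{{name: "ANYELEMENT", wildcard: true}, {name: "(x INT, y INT)"}}
	actual := []typ{{name: "INT"}, {name: "(INT, INT)"}}
	for _, t := range paramTypes(static, actual) {
		fmt.Println(t.name)
	}
}
```

The second parameter shows why the actual type is not used unconditionally: the unlabeled `(INT, INT)` argument type would drop the `x` and `y` labels carried by the static type.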

Release note: None

#### opt: refactor optbuild paths for routines and generator functions

This commit heavily refactors the type-handling logic for routines and
generator functions. The hope is to make the code more readable, and also
make the changes in the next commit easier.

Informs #114846

Release note: None

#### opt: add assignment casts for UDFs used as a data source

Previously, attempting to use a RECORD-returning UDF as a data source
(e.g. `SELECT * FROM` syntax) would result in an internal error if the
column definition list types didn't match the columns of the last
statement. This commit fixes that by adding validation that the types
are either identical or can be assignment-casted, and adding assignment
casts if necessary.
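
The validation step can be sketched as follows. The cast table is a hypothetical, deliberately tiny stand-in for real assignment-cast rules, and `reconcile` is an illustrative name, not an optbuilder function.

```go
package main

import "fmt"

// assignmentCasts is a hypothetical toy table of allowed assignment casts,
// keyed by {from, to}. Real cast rules are far more extensive.
var assignmentCasts = map[[2]string]bool{
	{"INT8", "FLOAT8"}: true,
	{"INT8", "TEXT"}:   true,
}

// reconcile compares the column definition list with the types produced by
// the routine's last statement. For each column it reports whether an
// assignment cast must be added, and returns an error if the types are
// neither identical nor assignment-castable.
func reconcile(defined, actual []string) ([]bool, error) {
	if len(defined) != len(actual) {
		return nil, fmt.Errorf("expected %d columns, got %d", len(defined), len(actual))
	}
	needsCast := make([]bool, len(defined))
	for i := range defined {
		switch {
		case actual[i] == defined[i]:
			// Identical types: nothing to do.
		case assignmentCasts[[2]string{actual[i], defined[i]}]:
			needsCast[i] = true
		default:
			return nil, fmt.Errorf("column %d: cannot cast %s to %s",
				i+1, actual[i], defined[i])
		}
	}
	return needsCast, nil
}

func main() {
	casts, err := reconcile([]string{"INT8", "TEXT"}, []string{"INT8", "INT8"})
	fmt.Println(casts, err)
	_, err = reconcile([]string{"INT8"}, []string{"BOOL"})
	fmt.Println(err)
}
```

The point of the sketch is that a mismatch is caught while building the query, as an ordinary error or an added cast, instead of surfacing later as an `invalid datum type given` internal error at execution time.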

Fixes #114846
Fixes #113186

Release note (bug fix): Fixed a bug that could cause an internal error of
the form `invalid datum type given: ..., expected ...` when a RECORD-returning
UDF used as a data source was supplied a column definition list with
mismatched types. This bug has existed since v23.1.0.

#### opt/optbuilder: check for coercibility instead of tuple types

This commit changes the handling of tuple-returning routines to mirror
that of postgres. In particular, when the routine return type is a tuple,
postgres first attempts to coerce result columns to the return type of the
routine. Only if that attempt fails, postgres wraps the result column in
a tuple, and again attempts the coercion. This change affects the handling
of routines that return (for example) a single composite-typed column. For
example, the following two logic tests should produce the same result:
```
statement ok
CREATE TYPE two_typ AS (x INT, y INT);
CREATE FUNCTION f() RETURNS two_typ LANGUAGE SQL AS $$ SELECT 1, 2; $$;

query T
SELECT f();
----
(1,2)
```
vs
```
statement ok
CREATE TYPE two_typ AS (x INT, y INT);
CREATE FUNCTION f() RETURNS two_typ LANGUAGE SQL AS $$ SELECT ROW(1, 2); $$;

query T
SELECT f();
----
(1,2)
```
There is no release note, since this shouldn't affect versions prior to 24.1.

Fixes #120942

Release note: None

Co-authored-by: Drew Kimball <[email protected]>
Co-authored-by: Yahor Yuzefovich <[email protected]>
@craig craig bot closed this as completed in 24614e1 Apr 12, 2024
@github-project-automation github-project-automation bot moved this from Active to Done in SQL Queries Apr 12, 2024
blathers-crl bot pushed a commit that referenced this issue Apr 12, 2024
DrewKimball added a commit that referenced this issue Apr 24, 2024