Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sql: support implicit record arg and return types in UDFs #87275

Merged
merged 5 commits into from
Sep 8, 2022

Conversation

mgartner
Copy link
Collaborator

@mgartner mgartner commented Sep 1, 2022

opt: wrap last statement in a UDF with LIMIT 1

A UDF always returns a single row from the last statement in the UDF
(unless it is a set returning UDF, but we do not currently support
those). So, we can wrap the last statement in a LIMIT 1 expression
which may allow additional optimizations to trigger.

Release justification: This is a minor change to a new feature.

Release note: None

opt: add implicit record types to opt test catalog

Release justification: This is a test-only change.

Release note: None

sql: support implicit record arg and return types in UDFs

Implicit record types are now supported as argument types and return
types of user-defined functions. For example:

CREATE TABLE t (a INT PRIMARY KEY, b INT);
CREATE FUNCTION ret_t() RETURNS t LANGUAGE SQL AS $$
  SELECT * FROM t
$$;
CREATE FUNCTION t_a(some_t t) RETURNS INT LANGUAGE SQL AS $$
  SELECT (some_t).a
$$;

Fixes #86393

Release justification: This fixes a limitation of a newly added feature.

Release note: None

sql: propagate UDF errors correctly

This commit fixes a bug where an error that occurred during evaluation
of a UDF was not correctly propagated.

Release justification: This fixes a critical bug with a new feature.

Release note: None

sql: cast UDF result to function return type

The last statement of a UDF may have a resulting type, R, that does
not match the function's given return type, F. This is allowed if a
cast from R to F is valid in an implicit or assignment context.

This commit adds an assignment cast operator to the output column of the
final statement in the function body, if necessary.

Release justification: This fixes a minor bug with a new feature.

Release Note: None

@mgartner mgartner requested a review from a team as a code owner September 1, 2022 15:22
@cockroach-teamcity
Copy link
Member

This change is Reviewable

Copy link
Collaborator

@DrewKimball DrewKimball left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work! After you've figured out whatever's going on with the record typing, :lgtm:

Reviewed 5 of 5 files at r1, 2 of 2 files at r2, 4 of 4 files at r3, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @chengxiong-ruan, @mgartner, and @msirek)


pkg/sql/logictest/testdata/logic_test/udf line 1560 at r3 (raw file):

$$

# TODO(mgartner): The type of imp_const_tup() should be imp, not record.

[nit] The type is alread imp on this one.


pkg/sql/opt/optbuilder/scalar.go line 667 at r1 (raw file):

		// rows after the first can simply be ignored. The limit could be
		// beneficial because it could allow additional optimization.
		if lastStmt := i == len(stmts)-1; lastStmt {

[very nit] I feel like this is more confusing than just using i == len(stmts)-1, or pulling the assignment out to the previous line.

Copy link
Contributor

@msirek msirek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @chengxiong-ruan and @mgartner)


pkg/sql/opt/testutils/testcat/types.go line 71 at r2 (raw file):

			contents := make([]*types.T, 0, tab.ColumnCount())
			labels := make([]string, 0, tab.ColumnCount())
			for i, n := 0, tab.ColumnCount(); i < n; i++ {

Since we're only building a tuple in the multicolumn case, maybe we shouldn't be trying to execute this code when tab.ColumnCount() is 1, but break out and return the error below.

Copy link
Collaborator Author

@mgartner mgartner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To fix the type issues I had to implement assignment-casting of return values. I also caught a bug where errors were not being correctly propagated. PTAL.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @chengxiong-ruan, @DrewKimball, and @msirek)


pkg/sql/logictest/testdata/logic_test/udf line 1560 at r3 (raw file):

Previously, DrewKimball (Drew Kimball) wrote…

[nit] The type is alread imp on this one.

Done.


pkg/sql/opt/optbuilder/scalar.go line 667 at r1 (raw file):

Previously, DrewKimball (Drew Kimball) wrote…

[very nit] I feel like this is more confusing than just using i == len(stmts)-1, or pulling the assignment out to the previous line.

Done.


pkg/sql/opt/testutils/testcat/types.go line 71 at r2 (raw file):

Previously, msirek (Mark Sirek) wrote…

Since we're only building a tuple in the multicolumn case, maybe we shouldn't be trying to execute this code when tab.ColumnCount() is 1, but break out and return the error below.

Implicit record types can have a single column - if the table that they represent has a single column. For example:

defaultdb> CREATE TABLE t (a INT);
CREATE TABLE

defaultdb> SELECT row(1)::t, (row(1)::t).a;
  row | a
------+----
  (1) | 1
(1 row)

I've added a test to make sure the test catalog can handle this case correctly.

@mgartner mgartner requested a review from DrewKimball September 2, 2022 19:30
Copy link
Collaborator

@DrewKimball DrewKimball left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 6 of 6 files at r4, 2 of 2 files at r5, 4 of 4 files at r6, 2 of 2 files at r7, 4 of 4 files at r8, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @chengxiong-ruan, @DrewKimball, @mgartner, and @msirek)


pkg/sql/routine.go line 58 at r7 (raw file):

			_ = p.Txn().ConfigureStepping(ctx, prevSteppingMode)
			seqErr := txn.SetReadSeqNum(prevSeqNum)
			// Propagate the original error, if there is one.

Should this logic even be executed if there was already an error?


pkg/sql/logictest/testdata/logic_test/udf line 1710 at r8 (raw file):

SELECT stoc('a'), pg_typeof(stoc('a'))
----
a  text

I'd think the type would be character - is there some bug here (maybe not related to udfs) or is this expected behavior?

postgres=# select pg_typeof('b'::CHAR);
 pg_typeof
-----------
 character
(1 row)

vs

root@localhost:26257/defaultdb> select pg_typeof('b'::CHAR);

  pg_typeof
-------------
  text
(1 row)

Copy link
Contributor

@msirek msirek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 3 of 5 files at r1, 6 of 6 files at r4, 2 of 2 files at r5, 4 of 4 files at r6, 2 of 2 files at r7, 4 of 4 files at r8, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @chengxiong-ruan, @DrewKimball, and @mgartner)


pkg/sql/opt/testutils/testcat/types.go line 71 at r2 (raw file):

Previously, mgartner (Marcus Gartner) wrote…

Implicit record types can have a single column - if the table that they represent has a single column. For example:

defaultdb> CREATE TABLE t (a INT);
CREATE TABLE

defaultdb> SELECT row(1)::t, (row(1)::t).a;
  row | a
------+----
  (1) | 1
(1 row)

I've added a test to make sure the test catalog can handle this case correctly.

OK, thanks. Do we want to qualify the name of the object type, including the schema (if it differs from the current schema) as in postgres?

CREATE TABLE t (a INT);
CREATE SCHEMA mws;
CREATE TABLE mws.t (a char(10));
SELECT pg_typeof(row(1)::t), pg_typeof(row(1)::mws.t);

crdb:
SELECT pg_typeof(row(1)::t), pg_typeof(row(1)::mws.t);
  pg_typeof | pg_typeof
------------+------------
  t         | t
(1 row)

postgres:
postgres=# SELECT pg_typeof(row(1)::t), pg_typeof(row(1)::mws.t);
 pg_typeof | pg_typeof 
-----------+-----------
 t         | mws.t

postgres=# set schema 'mws';
SET
postgres=# SELECT pg_typeof(row(1)::t), pg_typeof(row(1)::mws.t);
 pg_typeof | pg_typeof 
-----------+-----------
 t         | t

I'm not sure if we'd want to qualify database name too. Postgres doesn't support such references:

implicit=# SELECT pg_typeof(row(1)::postgres.mws.t);
ERROR:  cross-database references are not implemented: postgres.mws.t

(1 row)

A UDF always returns a single row from the last statement in the UDF
(unless it is a set returning UDF, but we do not currently support
those). So, we can wrap the last statement in a `LIMIT 1` expression
which may allow additional optimizations to trigger.

Release justification: This is a minor change to a new feature.

Release note: None
Release justification: This is a test-only change.

Release note: None
Implicit record types are now supported as argument types and return
types of user-defined functions. For example:

    CREATE TABLE t (a INT PRIMARY KEY, b INT);
    CREATE FUNCTION ret_t() RETURNS t LANGUAGE SQL AS $$
      SELECT * FROM t
    $$;
    CREATE FUNCTION t_a(some_t t) RETURNS INT LANGUAGE SQL AS $$
      SELECT (some_t).a
    $$;

Fixes cockroachdb#86393

Release justification: This fixes a limitation of a newly added feature.

Release note: None
This commit fixes a bug where an error that occurred during evaluation
of a UDF was not correctly propagated.

Release justification: This fixes a critical bug with a new feature.

Release note: None
The last statement of a UDF may have a resulting type, `R`, that does
not match the function's given return type, `F`. This is allowed if a
cast from `R` to `F` is valid in an implicit or assignment context.

This commit adds an assignment cast operator to the output column of the
final statement in the function body, if necessary.

Release justification: This fixes a minor bug with a new feature.

Release Note: None
Copy link
Collaborator Author

@mgartner mgartner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @chengxiong-ruan and @DrewKimball)


pkg/sql/routine.go line 58 at r7 (raw file):

Previously, DrewKimball (Drew Kimball) wrote…

Should this logic even be executed if there was already an error?

That's a good point. If the UDF errors, it should abort the txn, so I don't think we need to do this. I've updated it.


pkg/sql/logictest/testdata/logic_test/udf line 1710 at r8 (raw file):

Previously, DrewKimball (Drew Kimball) wrote…

I'd think the type would be character - is there some bug here (maybe not related to udfs) or is this expected behavior?

postgres=# select pg_typeof('b'::CHAR);
 pg_typeof
-----------
 character
(1 row)

vs

root@localhost:26257/defaultdb> select pg_typeof('b'::CHAR);

  pg_typeof
-------------
  text
(1 row)

It's a known issue tracked by #58252.


pkg/sql/opt/testutils/testcat/types.go line 71 at r2 (raw file):

Previously, msirek (Mark Sirek) wrote…

OK, thanks. Do we want to qualify the name of the object type, including the schema (if it differs from the current schema) as in postgres?

CREATE TABLE t (a INT);
CREATE SCHEMA mws;
CREATE TABLE mws.t (a char(10));
SELECT pg_typeof(row(1)::t), pg_typeof(row(1)::mws.t);

crdb:
SELECT pg_typeof(row(1)::t), pg_typeof(row(1)::mws.t);
  pg_typeof | pg_typeof
------------+------------
  t         | t
(1 row)

postgres:
postgres=# SELECT pg_typeof(row(1)::t), pg_typeof(row(1)::mws.t);
 pg_typeof | pg_typeof 
-----------+-----------
 t         | mws.t

postgres=# set schema 'mws';
SET
postgres=# SELECT pg_typeof(row(1)::t), pg_typeof(row(1)::mws.t);
 pg_typeof | pg_typeof 
-----------+-----------
 t         | t

I'm not sure if we'd want to qualify database name too. Postgres doesn't support such references:

implicit=# SELECT pg_typeof(row(1)::postgres.mws.t);
ERROR:  cross-database references are not implemented: postgres.mws.t

(1 row)

We could qualify the names and make this more robust, but we generally only add the bare minimum functionality to the test catalog to keep it simple. Since I don't need to test qualified names in the optbuilder tests I added, I left it out.

Copy link
Collaborator

@DrewKimball DrewKimball left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 8 of 8 files at r9, 2 of 2 files at r10, 4 of 4 files at r11, 2 of 2 files at r12.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @chengxiong-ruan and @DrewKimball)

@mgartner
Copy link
Collaborator Author

mgartner commented Sep 8, 2022

TFTRs!

bors r+

@craig
Copy link
Contributor

craig bot commented Sep 8, 2022

Build succeeded:

@craig craig bot merged commit a3dee50 into cockroachdb:master Sep 8, 2022
@mgartner mgartner deleted the udf-limit branch September 9, 2022 16:52
@rafiss
Copy link
Collaborator

rafiss commented Sep 25, 2022

blathers backport 22.2

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

sql: udf to support implicit record argument/return type
6 participants