Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sql: enable some mutations in udfs #102773

Merged
merged 4 commits into from
May 17, 2023

Conversation

rharding6373
Copy link
Collaborator

@rharding6373 rharding6373 commented May 4, 2023

sql: enable insert in udf bodies

This PR enables INSERT in UDFs and adds some logic test coverage.

There are currently some known limitations to mutations in UDFs, namely
foreign key cascades and some constraint checking which will be
addressed in future PRs.

Epic: CRDB-19255
Informs: #87289

Release note (sql change): Enables INSERT commands in UDF statement
bodies.

sql: enable update and upsert in udf bodies

This PR enables UPDATE in UDFs and adds some logic test coverage. Since
both INSERT and UPDATE are supported, this also enabled UPSERT.

Epic: CRDB-19255
Informs: #87289

Release note (sql change): Enables UPDATE and UPSERT commands in UDF
statement bodies.

sql: add upsert tests for udfs

This PR adds logic tests for UPSERTs in UDF bodies.

Epic: CRDB-19255
Informs: #87289

Release note: None

sql: fix record-returning udfs with no result

Previously, record-returning UDFs whose last SQL statement returned no
result (i.e., not a SELECT or RETURNING statement) would succeed. They
now return a return type mismatch error, as expected.

Epic: CRDB-19255
Informs: #87289

Release note: None

@cockroach-teamcity
Copy link
Member

This change is Reviewable

@rharding6373 rharding6373 force-pushed the 20230419_udf_mutations_87289 branch 3 times, most recently from 6b163be to 09e361c Compare May 10, 2023 05:40
@rharding6373 rharding6373 changed the title sql: enable insert in udfs sql: enable some mutations in udfs May 10, 2023
@rharding6373 rharding6373 marked this pull request as ready for review May 10, 2023 05:42
@rharding6373 rharding6373 requested a review from a team as a code owner May 10, 2023 05:42
@rharding6373 rharding6373 force-pushed the 20230419_udf_mutations_87289 branch from 09e361c to 3c16454 Compare May 10, 2023 16:22
Copy link
Collaborator

@mgartner mgartner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very cool! I've only reviewed the first commit so far.

Reviewed 14 of 14 files at r4.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @DrewKimball and @rharding6373)


pkg/sql/opt/optbuilder/builder.go line 321 at r4 (raw file):

		case *tree.Delete:
			panic(unimplemented.NewWithIssuef(87289, "%s usage inside a function definition", stmt.StatementTag()))
		case *tree.Insert:

nit: you can combine this case with the *tree.Select and tree.SelectStatement cases: case *tree.Select, tree.SelectStatement, *tree.Insert:.


pkg/sql/logictest/testdata/logic_test/udf line 74 at r4 (raw file):


statement error pgcode 42P01 pq: relation "dne" does not exist
CREATE FUNCTION err() RETURNS INT LANGUAGE SQL AS 'SELECT a FROM dne'

Where did this test come from?


pkg/sql/logictest/testdata/logic_test/udf_insert line 5 at r4 (raw file):


statement ok
CREATE FUNCTION f_should_err() RETURNS RECORD AS

This should error because the return type doesn't match, correct? Can you leave a TODO to explain that?


pkg/sql/logictest/testdata/logic_test/udf_insert line 11 at r4 (raw file):


statement error pq: at or near "as": syntax error
CREATE FUNCTION f_err() AS

nit: syntax errors probably don't need to be tested here, unless there was something specific to INSERTs about this case


pkg/sql/logictest/testdata/logic_test/udf_insert line 17 at r4 (raw file):


statement error missing "a" primary key column
CREATE FUNCTION f_err(b INT) RETURNS RECORD AS

Nice test case!


pkg/sql/logictest/testdata/logic_test/udf_insert line 36 at r4 (raw file):


query T
SELECT f_insert(3,4);

Can you add a test that calls f_insert again with the same value for a to ensure there is a duplicate key error?

Some other ideas of interesting tests:

  1. Errors for STABLE and IMMUTABLE functions that contain INSERT. Postgres errors lazily at runtime of the function, but maybe we want to error during creation of the function since that's a better user experience?

  2. Ensure that later calls to a function can see inserts from prior invocations:

CREATE TABLE t (a int);

CREATE FUNCTION f(i INT) RETURNS SETOF INT LANGUAGE SQL AS $$
  INSERT INTO t VALUES (i);
  SELECT a FROM t;
$$;

-- The first call, f(1), sees its own insert. The second call, f(2), sees both inserted values.
SELECT * FROM f(1)
UNION ALL
SELECT * FROM f(2);
 f
---
 1
 1
 2
(3 rows)

pkg/sql/logictest/testdata/logic_test/udf_insert line 188 at r4 (raw file):

(5,100,101)

# TODO(102771): Expected error due to f_drop's dependency on c.

nit: add # in front of issue number

@rharding6373 rharding6373 force-pushed the 20230419_udf_mutations_87289 branch from 3c16454 to d078a37 Compare May 10, 2023 23:41
Copy link
Collaborator Author

@rharding6373 rharding6373 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @DrewKimball and @mgartner)


pkg/sql/logictest/testdata/logic_test/udf line 74 at r4 (raw file):

Previously, mgartner (Marcus Gartner) wrote…

Where did this test come from?

This is actually from the first test in insert: https://github.com/cockroachdb/cockroach/blob/master/pkg/sql/logictest/testdata/logic_test/insert#L1

I saw that we didn't test that a table exists, only columns, in UDFs, and this seemed like a better place to cover it.


pkg/sql/logictest/testdata/logic_test/udf_insert line 5 at r4 (raw file):

Previously, mgartner (Marcus Gartner) wrote…

This should error because the return type doesn't match, correct? Can you leave a TODO to explain that?

Added (Though I fix this in the last commit...maybe the commits are suboptimally reordered).


pkg/sql/logictest/testdata/logic_test/udf_insert line 11 at r4 (raw file):

Previously, mgartner (Marcus Gartner) wrote…

nit: syntax errors probably don't need to be tested here, unless there was something specific to INSERTs about this case

I added it as a follow-on to the previous test. We should always return something from a UDF. But you're right we cover this in other non-logic tests.


pkg/sql/logictest/testdata/logic_test/udf_insert line 36 at r4 (raw file):

Previously, mgartner (Marcus Gartner) wrote…

Can you add a test that calls f_insert again with the same value for a to ensure there is a duplicate key error?

Some other ideas of interesting tests:

  1. Errors for STABLE and IMMUTABLE functions that contain INSERT. Postgres errors lazily at runtime of the function, but maybe we want to error during creation of the function since that's a better user experience?

  2. Ensure that later calls to a function can see inserts from prior invocations:

CREATE TABLE t (a int);

CREATE FUNCTION f(i INT) RETURNS SETOF INT LANGUAGE SQL AS $$
  INSERT INTO t VALUES (i);
  SELECT a FROM t;
$$;

-- The first call, f(1), sees its own insert. The second call, f(2), sees both inserted values.
SELECT * FROM f(1)
UNION ALL
SELECT * FROM f(2);
 f
---
 1
 1
 2
(3 rows)

Done, good ideas!

I'm surprised that the last one worked, though. I would have expected it to fail due to #70731. I'm going to take some time to understand this issue in more detail.

Copy link
Collaborator

@DrewKimball DrewKimball left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 14 of 14 files at r8, 11 of 11 files at r9, 9 of 9 files at r10, 2 of 2 files at r11, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @mgartner and @rharding6373)


pkg/sql/logictest/testdata/logic_test/udf_insert line 36 at r4 (raw file):

I'm surprised that the last one worked, though. I would have expected it to fail due to #70731.

Is it because we configure txn stepping for volatile UDFs?

Another interesting test case might be a single query that tries to insert the same row twice, so the duplicate key comes from the same txn.


pkg/sql/logictest/testdata/logic_test/udf_insert line 5 at r11 (raw file):


statement error pq: return type mismatch in function declared to return record\nDETAIL: Function's final statement must be SELECT or INSERT/UPDATE/DELETE RETURNING.
CREATE FUNCTION f_err() RETURNS RECORD AS

Could we have a similar function that returns VOID so it succeeds despite the lack of RETURNING clause?

@rharding6373 rharding6373 force-pushed the 20230419_udf_mutations_87289 branch from d078a37 to 88215ef Compare May 15, 2023 23:43
Copy link
Collaborator Author

@rharding6373 rharding6373 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TFTR!

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @DrewKimball and @mgartner)


pkg/sql/logictest/testdata/logic_test/udf_insert line 36 at r4 (raw file):

Previously, DrewKimball (Drew Kimball) wrote…

I'm surprised that the last one worked, though. I would have expected it to fail due to #70731.

Is it because we configure txn stepping for volatile UDFs?

Another interesting test case might be a single query that tries to insert the same row twice, so the duplicate key comes from the same txn.

If I disable stepping we don't get any result rows (because the statements can't see updated table values), but it doesn't return an error. I think I need to play with some test cases that try to repro the CTE inconsistency with UDFs instead of or inside of CTEs, but I'll add that test suite in another PR.

Added a couple tests with duplicate keys.


pkg/sql/logictest/testdata/logic_test/udf_insert line 5 at r11 (raw file):

Previously, DrewKimball (Drew Kimball) wrote…

Could we have a similar function that returns VOID so it succeeds despite the lack of RETURNING clause?

Done. Found a bug, too.

@rharding6373
Copy link
Collaborator Author

TFTR! I'm going to merge this PR to unblock some of my other work this week. If you have any other suggestions about mutation functionality we want to test (my list currently includes foreign key cascades and CTEs with UDFs), please let me know and I can add them in separate PRs.

bors r+

@craig
Copy link
Contributor

craig bot commented May 16, 2023

Build failed:

This PR enables INSERT in UDFs and adds some logic test coverage.

There are currently some known limitations to mutations in UDFs, namely
foreign key cascades and some constraint checking which will be
addressed in future PRs.

Epic: CRDB-19255
Informs: cockroachdb#87289

Release note (sql change): Enables INSERT commands in UDF statement
bodies.
@rharding6373 rharding6373 force-pushed the 20230419_udf_mutations_87289 branch from 88215ef to e8f3fda Compare May 16, 2023 18:58
@rharding6373
Copy link
Collaborator Author

bors r+

@craig
Copy link
Contributor

craig bot commented May 16, 2023

Build failed (retrying...):

@rharding6373
Copy link
Collaborator Author

bors cancel

@craig
Copy link
Contributor

craig bot commented May 16, 2023

Canceled.

@yuzefovich
Copy link
Member

I think the CI failure is a flake being introduced in this PR.

=== RUN   TestLogic_udf_upsert/on_conflict_do_nothing
    logic.go:2879: 
         
        /home/roach/.cache/bazel/_bazel_roach/c5a4e7d36696d9cd970af2045211a7df/sandbox/processwrapper-sandbox/7550/execroot/com_github_cockroachdb_cockroach/bazel-out/k8-fastbuild/bin/pkg/sql/logictest/tests/fakedist/fakedist_test_/fakedist_test.runfiles/com_github_cockroachdb_cockroach/pkg/sql/logictest/testdata/logic_test/udf_upsert:45: SELECT * FROM t_ocdn;
        expected:
            1 1 1
            5 5 5
            
        but found (query options: "") :
            5  5  5
            1  1  1

bors r-

@rharding6373
Copy link
Collaborator Author

Need to do a more thorough audit for rowsort opportunities. Will retry after stressing more locally. Sorry about slowing the merge queue.

This PR enables UPDATE in UDFs and adds some logic test coverage. Since
both INSERT and UPDATE are supported, this also enabled UPSERT.

Epic: CRDB-19255
Informs: cockroachdb#87289

Release note (sql change): Enables UPDATE and UPSERT commands in UDF
statement bodies.
This PR adds logic tests for UPSERTs in UDF bodies.

Epic: CRDB-19255
Informs: cockroachdb#87289

Release note: None
Previously, record-returning UDFs whose last SQL statement returned no
result (i.e., not a SELECT or RETURNING statement) would succeed. They
now return a return type mismatch error, as expected.

Epic: CRDB-19255
Informs: cockroachdb#87289

Release note: None
@rharding6373 rharding6373 force-pushed the 20230419_udf_mutations_87289 branch from e8f3fda to 6857822 Compare May 16, 2023 22:38
@rharding6373
Copy link
Collaborator Author

I think I fixed all the potential flakes due to row order. Let's try again.

bors r+

@craig
Copy link
Contributor

craig bot commented May 17, 2023

Build succeeded:

@craig craig bot merged commit 7607881 into cockroachdb:master May 17, 2023
@rharding6373 rharding6373 deleted the 20230419_udf_mutations_87289 branch May 17, 2023 00:49
Copy link
Collaborator

@mgartner mgartner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very nicely done! Sorry for the late drive-by - I have just a few questions.

IIUC the last commit doesn't need to be backported to 23.1 because it's only possible if non-SELECT statements are allowed in UDFs, correct? Can it reproduce with a SETOF UDF with an empty body?

Reviewed 1 of 15 files at r12, 13 of 14 files at r16, 2 of 11 files at r17, 11 of 12 files at r20, 8 of 9 files at r21, 2 of 2 files at r22, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale)


pkg/sql/logictest/testdata/logic_test/udf_insert line 36 at r4 (raw file):

Previously, rharding6373 (Rachael Harding) wrote…

If I disable stepping we don't get any result rows (because the statements can't see updated table values), but it doesn't return an error. I think I need to play with some test cases that try to repro the CTE inconsistency with UDFs instead of or inside of CTEs, but I'll add that test suite in another PR.

Added a couple tests with duplicate keys.

I don't think we'd expect an error due to #70731, we'd expect index corruption. I forgot about that issue. It may be a concern even for a simple UDF with INSERTS/UPSERTS that is invoked multiple times. I'll try to come up with a reproduction of this issue.

Copy link
Collaborator

@mgartner mgartner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, leaving a few more questions. Reviewable lost them the first time.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale)


pkg/sql/opt/optbuilder/scalar.go line 793 at r16 (raw file):

			// column is already a tuple.
			cols = physProps.Presentation
			if len(cols) > 0 {

Does this fix need to be backported?


pkg/sql/logictest/testdata/logic_test/udf_update line 72 at r20 (raw file):


statement error pq: multiple modification subqueries of the same table \"t\" are not supported
SELECT f2(5,9), f2(7,11);

Should we get this error in a case where f2 is referenced once but is called multiple times, e.g.:

SELECT f2(x, y) FROM (VALUES (1, 2), (3, 4)) v(x, y)

pkg/sql/logictest/testdata/logic_test/udf_upsert line 33 at r21 (raw file):


statement error pq: multiple modification subqueries of the same table \"t_ocdn\" are not supported
SELECT f_ocdn(1,1,1), f_ocdn(3,2,2), f_ocdn(6,6,2), f_ocdn(2,1,1);

I have the same question as the one in the update tests. Do we expect an error for:

SELECT f_ocdn(x, y, z) FROM (VALUES (1, 1, 1), (3, 2, 2)) v(x, y, z)

Copy link
Collaborator Author

@rharding6373 rharding6373 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale)


pkg/sql/opt/optbuilder/scalar.go line 793 at r16 (raw file):

Previously, mgartner (Marcus Gartner) wrote…

Does this fix need to be backported?

I don't think we can hit this bug before 23.2, since we can only hit it if we return 0 columns, and before 23.2 we only support SELECT which must always return at least 1 column.


pkg/sql/logictest/testdata/logic_test/udf_insert line 36 at r4 (raw file):

Previously, mgartner (Marcus Gartner) wrote…

I don't think we'd expect an error due to #70731, we'd expect index corruption. I forgot about that issue. It may be a concern even for a simple UDF with INSERTS/UPSERTS that is invoked multiple times. I'll try to come up with a reproduction of this issue.

It's possible the guard rails added in #70976 need to be expanded. We have another test case in this file that does fail due to the guard rail when we call something like SELECT f(), f(); where f() contains a mutation. Or we need to prioritize a fix for #70731.

Thanks for trying to repro, let me know if you find anything. I started some of my own investigation here: https://gist.github.com/rharding6373/bb2a81f55e8fceaa6b536cd86d67b937


pkg/sql/logictest/testdata/logic_test/udf_update line 72 at r20 (raw file):

Previously, mgartner (Marcus Gartner) wrote…

Should we get this error in a case where f2 is referenced once but is called multiple times, e.g.:

SELECT f2(x, y) FROM (VALUES (1, 2), (3, 4)) v(x, y)

This (and the other query below) is successful. I tested it out and there does not appear to be any index corruption. I'll add this test in a follow up PR. The explain output is not currently helpful in seeing the plan (there's already an issue open for this). Notes are here: https://gist.github.com/rharding6373/bb2a81f55e8fceaa6b536cd86d67b937.


pkg/sql/logictest/testdata/logic_test/udf_upsert line 33 at r21 (raw file):

Previously, mgartner (Marcus Gartner) wrote…

I have the same question as the one in the update tests. Do we expect an error for:

SELECT f_ocdn(x, y, z) FROM (VALUES (1, 1, 1), (3, 2, 2)) v(x, y, z)

See above.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants