Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

opt: build ANY expressions as regular subqueries within UDFs #98375

Merged
merged 2 commits into from
Mar 16, 2023

Conversation

mgartner
Copy link
Collaborator

@mgartner mgartner commented Mar 10, 2023

opt: build ANY expressions as regular subqueries within UDFs

To support execution of ANY expressions within UDFs, the optimizer
builds them as subqueries with GroupBy expressions instead. The
transformation simulates the peculiar behavior of ANY expressions. See
CustomFuncs.ConstructGroupByAny for more details on this
transformation.

By constructing ANY expressions within UDFs as subqueries, we avoid
adding complexity to the execution engine, which is not currently
capable of evaluating ANY expressions within the context of a UDF.

Fixes #87291

Release note: None

opt: fix GenerateIndexScans comment

Release note: None

@mgartner mgartner requested a review from a team as a code owner March 10, 2023 16:04
@cockroach-teamcity
Copy link
Member

This change is Reviewable

@DrewKimball
Copy link
Collaborator

I haven't looked into the specifics yet, but a quick comment based on the PR message - agg filters can prevent decorrelation, so I think we're better off using something like bool_or(col IS NULL) AND NULL

Copy link
Collaborator

@rharding6373 rharding6373 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm: Nice work!

Reviewed 1 of 1 files at r1, 5 of 5 files at r2, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @DrewKimball and @mgartner)


pkg/sql/opt/optbuilder/builder.go line 133 at r2 (raw file):

	// udfDepth is greater than zero, then the builder is currently building
	// expressions within one or more UDFs.
	udfDepth int

It seems like we only need to know if we're inside a UDF, so would a bool be sufficient to follow YAGNI principles?


pkg/sql/opt/optbuilder/subquery.go line 376 at r2 (raw file):

	if b.udfDepth > 0 {
		// Any expressions are cannot be built by the optimizer within a UDF, so

nit: Any expressions cannot

Copy link
Collaborator

@DrewKimball DrewKimball left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm: Nice work!

Reviewed 1 of 1 files at r1, 4 of 5 files at r2, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained

Copy link
Collaborator Author

@mgartner mgartner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't looked into the specifics yet, but a quick comment based on the PR message - agg filters can prevent decorrelation, so I think we're better off using something like bool_or(col IS NULL) AND NULL

Done.

It seems like we only need to know if we're inside a UDF, so would a bool be sufficient to follow YAGNI principles?

Ok, switched to a boolean for now. We'll have to use an integer once we support UDFs that call other UDFs.

nit: Any expressions cannot

Fixed.

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @DrewKimball)

@DrewKimball
Copy link
Collaborator

Just found constructGroupByAny in decorrelate_funcs.go - were you aware of this?

@mgartner
Copy link
Collaborator Author

Just found constructGroupByAny in decorrelate_funcs.go - were you aware of this?

Heh, no. Looks like we reinvented the wheel here. And a worse version - constructGroupByAny is thorough in constructing an expression that can be decorrelated and avoid apply-joins. I'll see if I can reuse the logic there.

@mgartner
Copy link
Collaborator Author

This was a great find @DrewKimball! It should be easy to reuse the same code as constructGroupByAny AND I found a bug in that code which should be an easy fix, #98691. I'm hoping to update the PR shortly.

@mgartner
Copy link
Collaborator Author

Ok this is ready for another look. The first commit is from #98700.

Copy link
Collaborator

@DrewKimball DrewKimball left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm: Nice refactoring!

Reviewed 9 of 9 files at r5, 6 of 6 files at r6, 1 of 1 files at r7, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @mgartner)


pkg/sql/opt/optbuilder/subquery.go line 377 at r6 (raw file):

	if b.insideUDF {
		// Any expressions cannot be built by the optimizer within a UDF, so
		// build them as subqueries with GroupBy expressions instead.

[nit] GroupBy->ScalarGroupBy

Copy link
Collaborator

@rharding6373 rharding6373 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1 of 1 files at r1, 5 of 5 files at r2.
Reviewable status: :shipit: complete! 2 of 0 LGTMs obtained (waiting on @mgartner)

Copy link
Collaborator Author

@mgartner mgartner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 2 of 0 LGTMs obtained (waiting on @DrewKimball)


pkg/sql/opt/optbuilder/subquery.go line 377 at r6 (raw file):

Previously, DrewKimball (Drew Kimball) wrote…

[nit] GroupBy->ScalarGroupBy

Done.

To support execution of ANY expressions within UDFs, the optimizer
builds them as subqueries with GroupBy expressions instead. The
transformation simulates the peculiar behavior of ANY expressions. See
`CustomFuncs.ConstructGroupByAny` for more details on this
transformation.

By constructing ANY expressions within UDFs as subqueries, we avoid
adding complexity to the execution engine, which is not currently
capable of evaluating ANY expressions within the context of a UDF.

Fixes cockroachdb#87291

Release note: None
@mgartner
Copy link
Collaborator Author

TFTRs!

bors r+

@craig
Copy link
Contributor

craig bot commented Mar 16, 2023

Build succeeded:

@craig craig bot merged commit 8c4018a into cockroachdb:master Mar 16, 2023
@mgartner mgartner deleted the any-udf branch March 16, 2023 20:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

sql: allow subqueries in UDFs
4 participants