Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

opt: inline UDFs as subqueries #92955

Merged
merged 1 commit into from
Jan 4, 2023
Merged

Conversation

mgartner
Copy link
Collaborator

@mgartner mgartner commented Dec 2, 2022

UDFs are now inlined as subqueries by a normalization rule when possible, speeding up their evaluation.

Epic: CRDB-20370

Release note (performance improvement): Some types of user-defined functions are now inlined in query plans as relation expressions, which speeds up their evaluation. UDFs must be non-volatile and have a single statement in the function body to be inlined.

@mgartner mgartner requested a review from a team as a code owner December 2, 2022 20:50
@cockroach-teamcity
Copy link
Member

This change is Reviewable

Copy link
Collaborator

@DrewKimball DrewKimball left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm: Nice!

Reviewed 3 of 3 files at r1, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @mgartner and @rharding6373)


pkg/sql/opt/norm/inline_funcs.go line 440 at r1 (raw file):

	// argForParam returns the argument that can be substituted for the given
	// column, if the column is a parameter of the UDF. It returns ok=false if
	// the column is not a UDF parameter.

Nice, your work to disambiguate parameters and arguments is paying off here.


pkg/sql/opt/norm/inline_funcs.go line 481 at r1 (raw file):

		&memo.SubqueryPrivate{
			OriginalExpr: nil,
			Ordering:     ordering,

[nit] Is it necessary to propagate an ordering here, since the input is guaranteed to return at most one row? Any required ordering should already be preserved by the limit, right?

[also nit] why explicitly set OriginalExpr here?


pkg/sql/opt/norm/testdata/rules/inline line 1149 at r1 (raw file):

 │         └── "?column?":13 = 100 [outer=(13), constraints=(/13: [/100 - /100]; tight), fd=()-->(13)]
 └── projections
      └── (i:2 + 20) + 10 [as=add:20, outer=(2), immutable]

Why isn't this folded into i + 30? Just don't have a rule for it I guess?

Copy link
Collaborator

@rharding6373 rharding6373 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewable status: :shipit: complete! 2 of 0 LGTMs obtained (waiting on @mgartner)

Copy link
Collaborator Author

@mgartner mgartner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TFTRs!

bors r+

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 2 stale) (waiting on @DrewKimball)


pkg/sql/opt/norm/inline_funcs.go line 440 at r1 (raw file):

Previously, DrewKimball (Drew Kimball) wrote…

Nice, your work to disambiguate parameters and arguments is paying off here.

🎉


pkg/sql/opt/norm/inline_funcs.go line 481 at r1 (raw file):
You're right, the Limit maintains the ordering. This might change when we support set-returning UDFs, so I've left some TODOs for now so we don't forget.

[also nit] why explicitly set OriginalExpr here?

Good catch. That was just leftover; I've removed it.


pkg/sql/opt/norm/testdata/rules/inline line 1149 at r1 (raw file):

Previously, DrewKimball (Drew Kimball) wrote…

Why isn't this folded into i + 30? Just don't have a rule for it I guess?

Ya, there's not a rule for it:

defaultdb> EXPLAIN (OPT) SELECT i+10+20 FROM generate_series(1, 5) g(i);
                  info
-----------------------------------------
  project
   ├── project-set
   │    ├── values
   │    │    └── ()
   │    └── zip
   │         └── generate_series(1, 5)
   └── projections
        └── (generate_series + 10) + 20
(8 rows)

add(10, add(i, 200)) = 100 would be normalized (in a project or filter) though because we have NormalizeCmpPlusConstwhich would fire twice.

@craig
Copy link
Contributor

craig bot commented Dec 12, 2022

Build failed (retrying...):

@yuzefovich
Copy link
Member

I think this PR is introducing a flake in TestLogic_udf/volatility.

bors r-

@craig
Copy link
Contributor

craig bot commented Dec 12, 2022

Canceled.

@mgartner
Copy link
Collaborator Author

@yuzefovich where do you see that?

@yuzefovich
Copy link
Member

The bors batch failed and then Bazel Essential CI on this PR is red for the same reason on fakedist-disk config, and I haven't seen this flake anywhere else, so I'm assuming it's this PR that is exposing the flake (or introducing it). It could be a coincidence though.

UDFs are now inlined as subqueries by a normalization rule when
possible, speeding up their evaluation.

Release note (performance improvement): Some types of user-defined
functions are now inlined in query plans as relation expressions, which
speeds up their evaluation. UDFs must be non-volatile and have a single
statement in the function body to be inlined.
@mgartner
Copy link
Collaborator Author

mgartner commented Jan 4, 2023

Fixed the flake.

bors r+

@craig
Copy link
Contributor

craig bot commented Jan 4, 2023

Build succeeded:

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants