opt: hoist uncorrelated subqueries at most once #115142
Nice fix! Is this also a problem for UDFs? Also, what's stopping it from being a problem for correlated subqueries as well?
Reviewed 3 of 3 files at r1, all commit messages.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @mgartner)
-- commits line 24 at r1:
[nit] ideas -> ids
It is indeed, and this PR doesn't fix it. I'll probably modify this PR to address it.
There's nothing fundamental preventing this for correlated subqueries. In the test cases I've added, cockroach/pkg/sql/opt/norm/join_funcs.go Lines 294 to 296 in 70ba6b9
So it's possible that we duplicate a correlated subquery in some other way, and arrive at the same problem. However, we've not witnessed an occurrence of this error with a correlated subquery (to my knowledge), so I'm hesitant to add the same limitation.
> Is this also a problem for UDFs?
>
> It is indeed, and this PR doesn't fix it. I'll probably modify this PR to address it.
Nevermind, I was confused. I think a query with UDFs could run into a similar issue when the UDF is inlined as a subquery, but this PR should prevent that. Is there a specific case you were thinking of that I should test?
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @mgartner)
Force-pushed: bf39fbf to a221080
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @DrewKimball)
Previously, DrewKimball (Drew Kimball) wrote…
> [nit] ideas -> ids
Done.
> Nevermind, I was confused. I think a query with UDFs could run into a similar issue when the UDF is inlined as a subquery, but this PR should prevent that. Is there a specific case you were thinking of that I should test?
Good point, I forgot that inlining a UDF just creates a subquery. It might be nice to have a simple test with a no-arg UDF that gets inlined.
Reviewed all commit messages.
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @mgartner)
Force-pushed: a221080 to 4c4e5a4
Done.
TFTRs! bors r+
Build succeeded:
Since #100881, the optimizer has hoisted uncorrelated subqueries used in
an equality expression (only when the
`optimizer_hoist_uncorrelated_equality_subqueries` session setting is
enabled). This can cause problems when the hoisted subquery has been
duplicated in the expression tree, e.g., when pushing a filter into both
sides of a join.
A subquery is a scalar expression, so the columns of its child
expression are never emitted from the subquery. This makes it safe to
duplicate a subquery in an expression tree. However, when a subquery is
hoisted, it is transformed into a join which can produce the columns of
the child expression. Hoisting the same subquery multiple times can
produce query plans with duplicate column IDs in two logically different
expressions. This can lead to incorrect query plans (see the comment for
`opt.Metadata`), as well as produce expressions with children that have
intersecting column IDs (after additional normalization rules fire).
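
To make the hazard concrete, here is a toy Go sketch. It is not the optimizer's code: `ColSet`, `Subquery`, and `Hoist` are hypothetical stand-ins invented for illustration. It only models the point made above: a scalar subquery hides its child's columns, while each hoist exposes them, so hoisting the same subquery twice yields two inputs with intersecting column IDs.

```go
package main

import "fmt"

// ColSet is a toy stand-in for a set of column IDs (hypothetical; not the
// optimizer's real column-set type).
type ColSet map[int]bool

// Intersects reports whether the two sets share at least one column ID.
func (s ColSet) Intersects(o ColSet) bool {
	for c := range s {
		if o[c] {
			return true
		}
	}
	return false
}

// Subquery is a toy scalar subquery. As a scalar expression it emits no
// columns; its child's columns stay hidden until it is hoisted.
type Subquery struct {
	ChildCols ColSet
}

// Hoist models turning the subquery into a join input, which exposes the
// child's columns as output columns of that input. No new column IDs are
// generated, which is the assumption that makes repeated hoisting unsafe.
func Hoist(s *Subquery) ColSet {
	out := ColSet{}
	for c := range s.ChildCols {
		out[c] = true
	}
	return out
}

func main() {
	// One subquery, duplicated in the tree (e.g. by pushing a filter into
	// both sides of a join) and then hoisted at both sites.
	sub := &Subquery{ChildCols: ColSet{7: true}}
	left := Hoist(sub)
	right := Hoist(sub)
	fmt.Println("column IDs intersect:", left.Intersects(right))
}
```

In this model both hoisted inputs carry column ID 7, which is the kind of overlap the description warns about.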
To avoid these dangers, this commit ensures that each unique subquery is
hoisted at most once. This will prevent bad plans, but it may
inhibit the optimizer from finding optimal plans. In the future, it may
be possible to lift this restriction by generating new column IDs for
uncorrelated subqueries each time they are hoisted.
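
The "hoist at most once" rule can be sketched as a small guard. This is a minimal illustration under assumptions, not the change made in this PR: `SubqueryID`, `Hoister`, and `TryHoist` are hypothetical names, and the real implementation may identify and track subqueries differently.

```go
package main

import "fmt"

// SubqueryID identifies a unique subquery (hypothetical identifier).
type SubqueryID int

// Hoister remembers which subqueries have already been hoisted.
type Hoister struct {
	hoisted map[SubqueryID]bool
}

// NewHoister returns a Hoister with no subqueries recorded yet.
func NewHoister() *Hoister {
	return &Hoister{hoisted: map[SubqueryID]bool{}}
}

// TryHoist returns true only the first time it is called for a given ID.
// Later calls return false, so duplicates of the subquery would be left in
// place as scalar subqueries instead of being hoisted again.
func (h *Hoister) TryHoist(id SubqueryID) bool {
	if h.hoisted[id] {
		return false
	}
	h.hoisted[id] = true
	return true
}

func main() {
	h := NewHoister()
	// Subquery 1 appears twice, e.g. after a filter was pushed into both
	// sides of a join; subquery 2 appears once.
	for _, id := range []SubqueryID{1, 1, 2} {
		fmt.Printf("subquery %d hoisted: %t\n", id, h.TryHoist(id))
	}
}
```

Here the second occurrence of subquery 1 is not hoisted, so its child columns are exposed by only one join. The alternative mentioned above, generating new column IDs on each hoist, would remove the need for this guard.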
Fixes #114703
There is no release note because the session setting enabling this bug
is disabled by default, and because the possible correctness bug is
theoretical - we have not found a reproduction of a correctness bug, but
it could exist in theory.
Release note: None