-
Notifications
You must be signed in to change notification settings - Fork 3.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
opt: do not generate unnecessary cross-joins on join input #79389
Conversation
e9b2de5
to
f01e2f9
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 9 of 9 files at r1, all commit messages.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @cucaroach)
For cases (2) and (3) above -- could those also come from a check constraint? |
This commit fixes a bug that caused unnecessary cross-joins on the input of lookup joins, causing both suboptimal query plans and incorrect query results. The bug only affected lookup joins with lookup expressions. Fixes cockroachdb#79384 Release note (bug fix): A bug has been fixed that caused the optimizer to generate query plans with logically incorrect lookup joins. The bug can only occur in queries with an inner join, e.g., `t1 JOIN t2`, if all of the following are true: 1. The join contains an equality condition between columns of both tables, e.g., `t1.a = t2.a`. 2. A query filter or `CHECK` constraint constrains a column to a set of specific values, e.g., `t2.b IN (1, 2, 3)`. In the case of a `CHECK` constraint, the column must be `NOT NULL`. 3. A query filter or `CHECK` constraint constrains a column to a range, e.g., `t2.c > 0`. In the case of a `CHECK` constraint, the column must be `NOT NULL`. 4. An index contains a column from each of the criteria above, e.g., `INDEX t2(a, b, c)`. This bug has been present since version 21.2.0.
f01e2f9
to
3bf035f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For cases (2) and (3) above -- could those also come from a check constraint?
Yes, if the column is NOT NULL
. Great catch. I've updated the commit and PR description.
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @cucaroach)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I realized that #78685 fixed #78681 for GenerateLookupJoins, but not for GenerateInvertedJoins. I've added another commit to this PR that fixes that issue.
Good catch! In that case will we also need to backport this to 21.1, though? And maybe worth updating the other TA to include this case?
Reviewed 3 of 3 files at r3, all commit messages.
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @cucaroach and @mgartner)
-- commits, line 31 at r3:
nit: address -> addresses
pkg/sql/opt/xform/testdata/rules/join, line 9046 at r3 (raw file):
# GenerateInvertedJoins fires in this case because the semi join is converted to # an inner join by ConvertSemiToInnerJoin. opt
nit: should you add expect=GenerateInvertedJoins
?
In cockroachdb#78685, we prevented `GenerateLookupJoins` from incorrect creating a cross-join on the input of a semi-join, addressing cockroachdb#78681. This commit addresses the same issue with `GenerateInvertedJoins`, which we originally forgot to fix. Informs cockroachdb#78681 Release note (bug fix): A bug has been fixed which caused the optimizer to generate invalid query plans which could result in incorrect query results. The bug, which has been present since version 21.1.0, can appear if all of the following conditions are true: 1. The query contains a semi-join, such as queries in the form `SELECT * FROM a WHERE EXISTS (SELECT * FROM b WHERE a.a @> b.b)`. 2. The inner table has a multi-column inverted index containing the inverted column in the filter. 3. The index prefix columns are constrained to a set of values via the filter or a `CHECK` constraint, e.g., with an `IN` operator. In the case of a `CHECK` constraint, the column is `NOT NULL`.
e1c9619
to
f6ea408
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In that case will we also need to backport this to 21.1, though?
Yes, just the second commit though. I've added the label.
And maybe worth updating the other TA to include this case?
Can we just update a TA? My concern is that the TA was already pushed out to customers and if we update it then some customers won't see the changes.
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @cucaroach and @rytaft)
pkg/sql/opt/xform/testdata/rules/join, line 9046 at r3 (raw file):
Previously, rytaft (Rebecca Taft) wrote…
nit: should you add
expect=GenerateInvertedJoins
?
We don't really care whether GenerateInvertedJoins
fires for the converted inner-join or not, only that it doesn't fire for the semi-join. So I've disabled ConvertSemiToInnerJoin
and added expect-not=GenerateInvertedJoins
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, just the second commit though. I've added the label.
Blathers will try to backport both commits for 21.1 (although I'm guessing it will fail on that branch anyway)
Can we just update a TA? My concern is that the TA was already pushed out to customers and if we update it then some customers won't see the changes.
Sounds like you've resolved this offline, but let me know otherwise.
Reviewed 1 of 1 files at r4, all commit messages.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @cucaroach)
I'll manually backport if that happens. TFTR! bors r+ |
Build succeeded: |
Encountered an error creating backports. Some common things that can go wrong:
You might need to create your backport manually using the backport tool. error creating merge commit from 3bf035f to blathers/backport-release-21.1-79389: POST https://api.github.com/repos/cockroachdb/cockroach/merges: 409 Merge conflict [] you may need to manually resolve merge conflicts with the backport tool. Backport to branch 21.1.x failed. See errors above. error creating merge commit from 3bf035f to blathers/backport-release-21.2-79389: POST https://api.github.com/repos/cockroachdb/cockroach/merges: 409 Merge conflict [] you may need to manually resolve merge conflicts with the backport tool. Backport to branch 21.2.x failed. See errors above. error creating merge commit from 3bf035f to blathers/backport-release-22.1-79389: POST https://api.github.com/repos/cockroachdb/cockroach/merges: 409 Merge conflict [] you may need to manually resolve merge conflicts with the backport tool. Backport to branch 22.1.x failed. See errors above. 🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is otan. |
opt: do not generate unnecessary cross-joins on lookup join input
This commit fixes a bug that caused unnecessary cross-joins on the input
of lookup joins, causing both suboptimal query plans and incorrect query
results. The bug only affected lookup joins with lookup expressions.
Fixes #79384
Release note (bug fix): A bug has been fixed that caused the optimizer
to generate query plans with logically incorrect lookup joins. The bug
can only occur in queries with an inner join, e.g.,
t1 JOIN t2
, if allof the following are true:
tables, e.g.,
t1.a = t2.a
.CHECK
constraint constrains a column to a setof specific values, e.g.,
t2.b IN (1, 2, 3)
. In the case of aCHECK
constraint, the column must beNOT NULL
.CHECK
constraint constrains a column to arange, e.g.,
t2.c > 0
. In the case of aCHECK
constraint, thecolumn must be
NOT NULL
.INDEX t2(a, b, c)
.This bug has been present since version 21.2.0.
opt: do not cross-join input of inverted semi-join
In #78685, we prevented
GenerateLookupJoins
from incorrect creating across-join on the input of a semi-join, addressing #78681. This commit
addresses the same issue with
GenerateInvertedJoins
, which weoriginally forgot to fix.
Informs #78681
Release note (bug fix): A bug has been fixed which caused the optimizer
to generate invalid query plans which could result in incorrect query
results. The bug, which has been present since version 21.1.0, can
appear if all of the following conditions are true:
SELECT * FROM a WHERE EXISTS (SELECT * FROM b WHERE a.a @> b.b)
.inverted column in the filter.
filter or a
CHECK
constraint, e.g., with anIN
operator. In thecase of a
CHECK
constraint, the column isNOT NULL
.