Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

opt: do not generate unnecessary cross-joins on join input #79389

Merged
merged 2 commits into from
Apr 5, 2022

Conversation

mgartner
Copy link
Collaborator

@mgartner mgartner commented Apr 4, 2022

opt: do not generate unnecessary cross-joins on lookup join input

This commit fixes a bug that caused unnecessary cross-joins on the input
of lookup joins, causing both suboptimal query plans and incorrect query
results. The bug only affected lookup joins with lookup expressions.

Fixes #79384

Release note (bug fix): A bug has been fixed that caused the optimizer
to generate query plans with logically incorrect lookup joins. The bug
can only occur in queries with an inner join, e.g., t1 JOIN t2, if all
of the following are true:

  1. The join contains an equality condition between columns of both
    tables, e.g., t1.a = t2.a.
  2. A query filter or CHECK constraint constrains a column to a set
    of specific values, e.g., t2.b IN (1, 2, 3). In the case of a
    CHECK constraint, the column must be NOT NULL.
  3. A query filter or CHECK constraint constrains a column to a
    range, e.g., t2.c > 0. In the case of a CHECK constraint, the
    column must be NOT NULL.
  4. An index contains a column from each of the criteria above, e.g.,
    INDEX t2(a, b, c).
    This bug has been present since version 21.2.0.

opt: do not cross-join input of inverted semi-join

In #78685, we prevented GenerateLookupJoins from incorrect creating a
cross-join on the input of a semi-join, addressing #78681. This commit
addresses the same issue with GenerateInvertedJoins, which we
originally forgot to fix.

Informs #78681

Release note (bug fix): A bug has been fixed which caused the optimizer
to generate invalid query plans which could result in incorrect query
results. The bug, which has been present since version 21.1.0, can
appear if all of the following conditions are true:

  1. The query contains a semi-join, such as queries in the form
    SELECT * FROM a WHERE EXISTS (SELECT * FROM b WHERE a.a @> b.b).
  2. The inner table has a multi-column inverted index containing the
    inverted column in the filter.
  3. The index prefix columns are constrained to a set of values via the
    filter or a CHECK constraint, e.g., with an IN operator. In the
    case of a CHECK constraint, the column is NOT NULL.

@mgartner mgartner requested review from rytaft, cucaroach and a team April 4, 2022 21:46
@mgartner mgartner requested a review from a team as a code owner April 4, 2022 21:46
@cockroach-teamcity
Copy link
Member

This change is Reviewable

@mgartner mgartner force-pushed the 79384-fix-lookup-join-expr branch from e9b2de5 to f01e2f9 Compare April 4, 2022 21:48
Copy link
Collaborator

@rytaft rytaft left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 9 of 9 files at r1, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @cucaroach)

@rytaft
Copy link
Collaborator

rytaft commented Apr 4, 2022

For cases (2) and (3) above -- could those also come from a check constraint?

This commit fixes a bug that caused unnecessary cross-joins on the input
of lookup joins, causing both suboptimal query plans and incorrect query
results. The bug only affected lookup joins with lookup expressions.

Fixes cockroachdb#79384

Release note (bug fix): A bug has been fixed that caused the optimizer
to generate query plans with logically incorrect lookup joins. The bug
can only occur in queries with an inner join, e.g., `t1 JOIN t2`, if all
of the following are true:
  1. The join contains an equality condition between columns of both
     tables, e.g., `t1.a = t2.a`.
  2. A query filter or `CHECK` constraint constrains a column to a set
     of specific values, e.g., `t2.b IN (1, 2, 3)`. In the case of a
     `CHECK` constraint, the column must be `NOT NULL`.
  3. A query filter or `CHECK` constraint constrains a column to a
     range, e.g., `t2.c > 0`. In the case of a `CHECK` constraint, the
     column must be `NOT NULL`.
  4. An index contains a column from each of the criteria above, e.g.,
     `INDEX t2(a, b, c)`.
This bug has been present since version 21.2.0.
@mgartner mgartner force-pushed the 79384-fix-lookup-join-expr branch from f01e2f9 to 3bf035f Compare April 5, 2022 13:09
Copy link
Collaborator Author

@mgartner mgartner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For cases (2) and (3) above -- could those also come from a check constraint?

Yes, if the column is NOT NULL. Great catch. I've updated the commit and PR description.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @cucaroach)

@mgartner mgartner changed the title opt: do not generate unnecessary cross-joins on lookup join input opt: do not generate unnecessary cross-joins on join input Apr 5, 2022
@mgartner
Copy link
Collaborator Author

mgartner commented Apr 5, 2022

I realized that #78685 fixed #78681 for GenerateLookupJoins, but not for GenerateInvertedJoins. I've added another commit to this PR that fixes that issue.

Copy link
Collaborator

@rytaft rytaft left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I realized that #78685 fixed #78681 for GenerateLookupJoins, but not for GenerateInvertedJoins. I've added another commit to this PR that fixes that issue.

Good catch! In that case will we also need to backport this to 21.1, though? And maybe worth updating the other TA to include this case?

Reviewed 3 of 3 files at r3, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @cucaroach and @mgartner)


-- commits, line 31 at r3:
nit: address -> addresses


pkg/sql/opt/xform/testdata/rules/join, line 9046 at r3 (raw file):

# GenerateInvertedJoins fires in this case because the semi join is converted to
# an inner join by ConvertSemiToInnerJoin.
opt

nit: should you add expect=GenerateInvertedJoins?

In cockroachdb#78685, we prevented `GenerateLookupJoins` from incorrect creating a
cross-join on the input of a semi-join, addressing cockroachdb#78681. This commit
addresses the same issue with `GenerateInvertedJoins`, which we
originally forgot to fix.

Informs cockroachdb#78681

Release note (bug fix): A bug has been fixed which caused the optimizer
to generate invalid query plans which could result in incorrect query
results. The bug, which has been present since version 21.1.0, can
appear if all of the following conditions are true:
  1. The query contains a semi-join, such as queries in the form
     `SELECT * FROM a WHERE EXISTS (SELECT * FROM b WHERE a.a @> b.b)`.
  2. The inner table has a multi-column inverted index containing the
     inverted column in the filter.
  3. The index prefix columns are constrained to a set of values via the
     filter or a `CHECK` constraint, e.g., with an `IN` operator. In the
     case of a `CHECK` constraint, the column is `NOT NULL`.
@mgartner mgartner force-pushed the 79384-fix-lookup-join-expr branch from e1c9619 to f6ea408 Compare April 5, 2022 20:00
Copy link
Collaborator Author

@mgartner mgartner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In that case will we also need to backport this to 21.1, though?

Yes, just the second commit though. I've added the label.

And maybe worth updating the other TA to include this case?

Can we just update a TA? My concern is that the TA was already pushed out to customers and if we update it then some customers won't see the changes.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @cucaroach and @rytaft)


pkg/sql/opt/xform/testdata/rules/join, line 9046 at r3 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

nit: should you add expect=GenerateInvertedJoins?

We don't really care whether GenerateInvertedJoins fires for the converted inner-join or not, only that it doesn't fire for the semi-join. So I've disabled ConvertSemiToInnerJoin and added expect-not=GenerateInvertedJoins.

Copy link
Collaborator

@rytaft rytaft left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Yes, just the second commit though. I've added the label.

Blathers will try to backport both commits for 21.1 (although I'm guessing it will fail on that branch anyway)

Can we just update a TA? My concern is that the TA was already pushed out to customers and if we update it then some customers won't see the changes.

Sounds like you've resolved this offline, but let me know otherwise.

Reviewed 1 of 1 files at r4, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @cucaroach)

@mgartner
Copy link
Collaborator Author

mgartner commented Apr 5, 2022

Blathers will try to backport both commits for 21.1 (although I'm guessing it will fail on that branch anyway)

I'll manually backport if that happens. TFTR!

bors r+

@craig
Copy link
Contributor

craig bot commented Apr 5, 2022

Build succeeded:

@craig craig bot merged commit b84398d into cockroachdb:master Apr 5, 2022
@blathers-crl
Copy link

blathers-crl bot commented Apr 5, 2022

Encountered an error creating backports. Some common things that can go wrong:

  1. The backport branch might have already existed.
  2. There was a merge conflict.
  3. The backport branch contained merge commits.

You might need to create your backport manually using the backport tool.


error creating merge commit from 3bf035f to blathers/backport-release-21.1-79389: POST https://api.github.com/repos/cockroachdb/cockroach/merges: 409 Merge conflict []

you may need to manually resolve merge conflicts with the backport tool.

Backport to branch 21.1.x failed. See errors above.


error creating merge commit from 3bf035f to blathers/backport-release-21.2-79389: POST https://api.github.com/repos/cockroachdb/cockroach/merges: 409 Merge conflict []

you may need to manually resolve merge conflicts with the backport tool.

Backport to branch 21.2.x failed. See errors above.


error creating merge commit from 3bf035f to blathers/backport-release-22.1-79389: POST https://api.github.com/repos/cockroachdb/cockroach/merges: 409 Merge conflict []

you may need to manually resolve merge conflicts with the backport tool.

Backport to branch 22.1.x failed. See errors above.


🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is otan.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

opt: unnecessary cross-joins with when generating a lookup join with a lookup expression
3 participants