-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
opt: do not plan unnecessary paired semi- and anti- lookup joins #88491
Conversation
// anti-joins because they never include columns from the right side | ||
// in their output columns. Other joins include columns from the | ||
// right side in their output columns, so even if the ON filters no | ||
// longer reference an un-covered column, they must be fetched (case | ||
// 2, see function comment). | ||
filterColsFromRight := rightCols.Intersection(onFilters.OuterCols()) | ||
remainingFilterCols := rightKeyCols.Union(lookupJoin.On.OuterCols()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, I think I might need to include outer columns of the lookup expression here too...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is right... only the ON condition should need to see the right-side rows, while the lookup condition is only used to generate spans.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The idea makes sense, but your comment below is undermining my confidence a bit haha 🙂
Reviewed 2 of 3 files at r1, all commit messages.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @mgartner and @rytaft)
pkg/sql/opt/xform/join_funcs.go
line 544 at r1 (raw file):
Previously, mgartner (Marcus Gartner) wrote…
Hmm, I think I might need to include outer columns of the lookup expression here too...
I'm having trouble understanding the various columnsets in play here, and this comment is making me a little scared 🙂
- Is
rightKeyCols
the columns from the join conditions that are part of the lookup index key? - What are the outer columns of the lookup expression? Are those different from
lookupJoin.On.OuterCols()
? - What is
onFilters.OuterCols()
and how is it different fromlookupJoin.On.OuterCols()
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 3 of 3 files at r1, all commit messages.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @DrewKimball, @mgartner, and @michae2)
pkg/sql/opt/xform/join_funcs.go
line 538 at r1 (raw file):
// the secondary index not covering y. // // Note that these are special cases that only works for semi- and
nit: only works -> only work
pkg/sql/opt/xform/join_funcs.go
line 544 at r1 (raw file):
Previously, DrewKimball (Drew Kimball) wrote…
I think this is right... only the ON condition should need to see the right-side rows, while the lookup condition is only used to generate spans.
You have this comment above RightSideCols
:
// RightSideCols is an ordered list of prefix index columns that are
// constrained by this constraint. It corresponds 1:1 with the columns in
// KeyCols if KeyCols is non-nil. Otherwise, it includes the prefix of index
// columns constrained by LookupExpr.
Based on the last sentence there, I think this code is correct as-is (what @DrewKimball said).
This commit fixes an issue where the optimizer would plan a paired semi or anti lookup join in cases when a single lookup join would suffice. This only occurred in rare cases when the join filter contained a tautology or contradiction that could not be normalized to true or false in the canonical query plan, but could be eliminated from the filters when building a lookup join. If the tautology or contradiction referenced a column not covered by the lookup index, the optimizer mistakenly assumed that the index was not covering and planned a paired join. Now the optimizer can recognize that the index is actually covering, because the referenced column is not needed to evaluate the filters, and a single lookup join is planned. Fixes cockroachdb#87306 Release note (performance improvement): The optimizer now explores plans with a single lookup join expressions in rare cases where it previously planned two lookup join expressions.
27302db
to
7af0002
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @DrewKimball, @michae2, and @rytaft)
pkg/sql/opt/xform/join_funcs.go
line 538 at r1 (raw file):
Previously, rytaft (Rebecca Taft) wrote…
nit: only works -> only work
Done.
pkg/sql/opt/xform/join_funcs.go
line 544 at r1 (raw file):
Ok good, I agree that this looks fine as-is.
Is rightKeyCols the columns from the join conditions that are part of the lookup index key?
rightKeyCols
contains the columns in the lookup index that are constrained by either the key columns or the lookup expression.
What are the outer columns of the lookup expression? Are those different from lookupJoin.On.OuterCols()?
Outer columns of the lookup expression all come from the left side of the join. As Drew pointed out, the column values of each input row are substituted in the lookup expression to generate lookup spans.
What is onFilters.OuterCols() and how is it different from lookupJoin.On.OuterCols()?
This one is tricky. There's 3 versions of the on filters, each one slightly different:
on
- These are the original ON filters passed as an argument to the function.onFilters
- These are the same ason
unless the lookup index is a partial index. In that case,onFilters
may be a subset ofon
which has been reduced because they match the partial index predicate. For example, if the original ON condition isON t1.id = t2.id AND t2.deleted = false
and there existsINDEX t2(id) WHERE deleted = false
, thenon
would bet1.id = t2.id AND t2.deleted = false
andonFilters
would bet1.id = t2.id
. Thet2.deleted = false
no longer needs to be applied explicitly because a lookup into the partial index implicitly applies that filter.lookupJoin.On
- These are the remaining filters after the lookup constraint has been built. They are the same aslookupConstraint.RemainingFilters
:
// RemainingFilters contains explicit ON filters that are not represented by
// KeyCols or LookupExpr. These filters must be included as ON filters in
// the lookup join.
In order words, these filters could not be part of the lookup spans themselves, so they must be applied after the lookup is performed.
It might be good to backport this into v22.2.1 so I've added the label. |
TFTRs! bors r+ |
Build succeeded: |
This commit fixes an issue where the optimizer would plan a paired semi
or anti lookup join in cases when a single lookup join would suffice.
This only occurred in rare cases when the join filter contained a
tautology or contradiction that could not be normalized to true or false
in the canonical query plan, but could be eliminated from the filters
when building a lookup join. If the tautology or contradiction
referenced a column not covered by the lookup index, the optimizer
mistakenly assumed that the index was not covering and planned a paired
join. Now the optimizer can recognize that the index is actually
covering, because the referenced column is not needed to evaluate the
filters, and a single lookup join is planned.
Fixes #87306
Release note (performance improvement): The optimizer now explores plans
with a single lookup join expressions in rare cases where it previously
planned two lookup join expressions.