opt: do not cross-join input of semi-join #78685
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @mgartner and @rytaft)
pkg/sql/opt/xform/join_funcs.go, line 461 at r1 (raw file):
    if joinType == opt.LeftJoinOp || joinType == opt.SemiJoinOp || joinType == opt.AntiJoinOp {
        // We cannot use the method constructJoinWithConstants to create a cross
        // join for left or anti joins, because constructing a cross join with
nit: also add semi join to this comment
Code quote:
// join for left or anti joins, because constructing a cross join with
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @mgartner and @rytaft)
nit: formatting of the release note is a little messed up (missing backtick maybe?)
Reviewable status: complete! 2 of 0 LGTMs obtained (waiting on @mgartner and @rytaft)
68d87ef to 60cc2a4
nit: formatting of the release note is a little messed up (missing backtick maybe?)
Fixed.
I also added a logic test.
Reviewable status: complete! 0 of 0 LGTMs obtained (and 2 stale) (waiting on @rytaft)
pkg/sql/opt/xform/join_funcs.go, line 461 at r1 (raw file):
Previously, michae2 (Michael Erickson) wrote…
nit: also add semi join to this comment
Done.
60cc2a4 to 98a4d31
Reviewed 1 of 2 files at r1, 4 of 4 files at r2, all commit messages.
Reviewable status: complete! 1 of 0 LGTMs obtained (and 2 stale) (waiting on @mgartner)
-- commits, line 19 at r2:
nit: this makes it sound a bit like it can happen if any of them are true. I'd say "can appear if all of the following conditions are true"
-- commits, line 21 at r2:
nit: t2.a`); -> t2.a);` (fix in the PR description too)
This commit fixes a logical correctness bug caused when `GenerateLookupJoins` cross-joins the input of a semi-join with a set of constant values to constrain the prefix columns of the lookup index. The cross-join is an invalid transformation because it increases the size of the join's input and can increase the size of the join's output. We already avoid these cross-joins for left and anti-joins (see cockroachdb#59646). When addressing those cases, the semi-join case was incorrectly assumed to be safe. Fixes cockroachdb#78681 Release note (bug fix): A bug has been fixed which caused the optimizer to generate invalid query plans which could result in incorrect query results. The bug, which has been present since version 21.1.0, can appear if all of the following conditions are true: 1) the query contains a semi-join, such as queries in the form: `SELECT * FROM t1 WHERE EXISTS (SELECT * FROM t2 WHERE t1.a = t2.a);`, 2) the inner table has an index containing the equality column, like `t2.a` in the example query, 3) the index contains one or more columns that prefix the equality column, and 4) the prefix columns are `NOT NULL` and are constrained to a set of constant values via a `CHECK` constraint or an `IN` condition in the filter.
98a4d31 to 1d7811d
Reviewable status: complete! 0 of 0 LGTMs obtained (and 3 stale) (waiting on @mgartner)
Previously, rytaft (Rebecca Taft) wrote…
nit: this makes it sound a bit like it can happen if any of them are true. I'd say "can appear if all of the following conditions are true"
Done.
Previously, rytaft (Rebecca Taft) wrote…
nit: t2.a`); -> t2.a);` (fix in the PR description too)
Done.
Reviewed all commit messages.
Reviewable status: complete! 0 of 0 LGTMs obtained (and 3 stale) (waiting on @mgartner)
TFTRs!
bors r+
PR cockroachdb#78685 changes the query plan of one query in a logic test in a way that makes the test flakey. This commit guarantees that the test cannot be flakey, regardless of the query plan. Release note: None
Build succeeded:
Encountered an error creating backports. Some common things that can go wrong:
You might need to create your backport manually using the backport tool.
error creating merge commit from 1d7811d to blathers/backport-release-21.1-78685: POST https://api.github.com/repos/cockroachdb/cockroach/merges: 409 Merge conflict []
you may need to manually resolve merge conflicts with the backport tool.
Backport to branch 21.1.x failed. See errors above.
error creating merge commit from 1d7811d to blathers/backport-release-21.2-78685: POST https://api.github.com/repos/cockroachdb/cockroach/merges: 409 Merge conflict []
you may need to manually resolve merge conflicts with the backport tool.
Backport to branch 21.2.x failed. See errors above.
🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is otan.
78704: sql: propagate limit for top K sort correctly in tests r=yuzefovich a=yuzefovich
In the 22.1 time frame we started propagating the value of K for top K sort in the spec of the processor, and not in the post-processing spec, but we forgot to update some of the tests accordingly.
Informs: #78592.
Release note: None
78949: kvserver: gossip l0sublevels instead of read amp r=kvoli a=kvoli
Previously, read amplification was gossiped among stores to enable future allocation decisions that would avoid candidates with high read amplification. L0 Sublevels represents the number of levels with L0 and is a portion of read amplification. This patch changes read amplification to l0 sublevels, as it is a better indicator of store health.
Release justification: low risk, replace deprecated gossip signal.
Release note: None
78984: sql: deflake unique logic test r=mgartner a=mgartner
PR #78685 changes the query plan of one query in a logic test in a way that makes the test flakey. This commit guarantees that the test cannot be flakey, regardless of the query plan.
Release note: None
Co-authored-by: Yahor Yuzefovich
Co-authored-by: Austen McClernon
Co-authored-by: Marcus Gartner
In cockroachdb#78685, we prevented `GenerateLookupJoins` from incorrectly creating a cross-join on the input of a semi-join, addressing cockroachdb#78681. This commit addresses the same issue with `GenerateInvertedJoins`, which we originally forgot to fix.
Informs cockroachdb#78681
Release note (bug fix): A bug has been fixed which caused the optimizer to generate invalid query plans which could result in incorrect query results. The bug, which has been present since version 21.1.0, can appear if all of the following conditions are true:
1. The query contains a semi-join, such as queries in the form `SELECT * FROM a WHERE EXISTS (SELECT * FROM b WHERE a.a @> b.b)`.
2. The inner table has a multi-column inverted index containing the inverted column in the filter.
3. The index prefix columns are constrained to a set of values via the filter or a `CHECK` constraint, e.g., with an `IN` operator. In the case of a `CHECK` constraint, the column is `NOT NULL`.
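Because the lookup-join and inverted-join rules needed the same fix, one way to keep them consistent is to route the decision through a single predicate. The following is only a minimal sketch: `joinKind`, its constants, and `canCrossJoinInput` are hypothetical names for illustration and are not taken from the CockroachDB source.

```go
package main

import "fmt"

// joinKind is a hypothetical stand-in for the optimizer's join operators.
type joinKind int

const (
	kindInner joinKind = iota
	kindLeft
	kindSemi
	kindAnti
)

// canCrossJoinInput reports whether a join's input may be cross-joined with
// constant index-prefix values. Left, semi, and anti joins are rejected,
// mirroring the join types named in the review thread above. Treating every
// other kind as safe is an assumption of this sketch.
func canCrossJoinInput(k joinKind) bool {
	switch k {
	case kindLeft, kindSemi, kindAnti:
		return false
	default:
		return true
	}
}

func main() {
	// Both a lookup-join rule and an inverted-join rule could consult the
	// same predicate before building the cross-join.
	fmt.Println("inner:", canCrossJoinInput(kindInner))
	fmt.Println("semi:", canCrossJoinInput(kindSemi))
}
```

With a shared check like this, adding a join type to the unsafe list in one place covers every rule that calls it.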
77742: sql: implement SHOW [ALL] CLUSTER SETTINGS FOR TENANT r=rafiss a=knz
All commits but the last 2 from #77740. (Reviewers: only the last 2 commits belong to this PR.)
Informs #77471
Release justification: low risk, high benefit changes to existing functionality
79260: changefeedccl, backupresolver: refactor to hold on to mapping of target to descriptor r=[miretskiy,dt] a=HonoreDB
Changefeed statements need to resolve a bunch of table names at once, but unlike backups and grants they need to know which returned descriptor corresponded to which input because they (now) take target-specific options. We were reconstructing this awkwardly on the calling side. This PR adds an optional parameter to the backupresolver method being used so that it can track which descriptor belongs to which input. I'm probably being overly polite by making this optional, but hey, it is a little extra memory footprint and not my package.
Release note: None
79324: changefeedccl: unify initial_scan option syntax r=sherman-grewal a=sherman-grewal
Resolves #79324
Currently, we have explicit options for each possible behaviour that a user would like to achieve for initial scans on changefeeds. For instance, a user could specify:
- initial_scan
- no_initial_scan
- initial_scan_only
This seems a bit sprawling, and can inadvertently cause contradictions in a changefeed statement. Hence, in this PR we extend the option `initial_scan` to take on three possible values: `'yes|no|only'`. Once this change is made we will remove the explicit options from the docs, but we will keep these options for backwards compatibility.
Release note (enterprise change): Unify the syntax that allows users to define the behaviour they would like for initial scans on changefeeds by extending the `initial_scan` option to take on three possible values: `'yes|no|only'`.
Release justification: Small, safe refactor that will improve the user experience when creating changefeeds.
Jira issue: CRDB-14693
79389: opt: do not generate unnecessary cross-joins on join input r=mgartner a=mgartner
#### opt: do not generate unnecessary cross-joins on lookup join input
This commit fixes a bug that caused unnecessary cross-joins on the input of lookup joins, causing both suboptimal query plans and incorrect query results. The bug only affected lookup joins with lookup expressions.
Fixes #79384
Release note (bug fix): A bug has been fixed that caused the optimizer to generate query plans with logically incorrect lookup joins. The bug can only occur in queries with an inner join, e.g., `t1 JOIN t2`, if all of the following are true:
1. The join contains an equality condition between columns of both tables, e.g., `t1.a = t2.a`.
2. A query filter or `CHECK` constraint constrains a column to a set of specific values, e.g., `t2.b IN (1, 2, 3)`. In the case of a `CHECK` constraint, the column must be `NOT NULL`.
3. A query filter or `CHECK` constraint constrains a column to a range, e.g., `t2.c > 0`. In the case of a `CHECK` constraint, the column must be `NOT NULL`.
4. An index contains a column from each of the criteria above, e.g., `INDEX t2(a, b, c)`.
This bug has been present since version 21.2.0.
#### opt: do not cross-join input of inverted semi-join
In #78685, we prevented `GenerateLookupJoins` from incorrectly creating a cross-join on the input of a semi-join, addressing #78681. This commit addresses the same issue with `GenerateInvertedJoins`, which we originally forgot to fix.
Informs #78681
Release note (bug fix): A bug has been fixed which caused the optimizer to generate invalid query plans which could result in incorrect query results. The bug, which has been present since version 21.1.0, can appear if all of the following conditions are true:
1. The query contains a semi-join, such as queries in the form `SELECT * FROM a WHERE EXISTS (SELECT * FROM b WHERE a.a @> b.b)`.
2. The inner table has a multi-column inverted index containing the inverted column in the filter.
3. The index prefix columns are constrained to a set of values via the filter or a `CHECK` constraint, e.g., with an `IN` operator. In the case of a `CHECK` constraint, the column is `NOT NULL`.
79454: docs: update alter changefeed diagram r=ericharmeling a=kathancox
Release note: None
Co-authored-by: Raphael 'kena' Poss
Co-authored-by: Aaron Zinger
Co-authored-by: Sherman Grewal
Co-authored-by: Marcus Gartner
Co-authored-by: Kathryn Hancox
This commit fixes a logical correctness bug caused when `GenerateLookupJoins` cross-joins the input of a semi-join with a set of constant values to constrain the prefix columns of the lookup index. The cross-join is an invalid transformation because it increases the size of the join's input and can increase the size of the join's output.
We already avoid these cross-joins for left and anti-joins (see #59646). When addressing those cases, the semi-join case was incorrectly assumed to be safe.
Fixes #78681
Release note (bug fix): A bug has been fixed which caused the optimizer to generate invalid query plans which could result in incorrect query results. The bug, which has been present since version 21.1.0, can appear if all of the following conditions are true: 1) the query contains a semi-join, such as queries in the form: `SELECT * FROM t1 WHERE EXISTS (SELECT * FROM t2 WHERE t1.a = t2.a);`, 2) the inner table has an index containing the equality column, like `t2.a` in the example query, 3) the index contains one or more columns that prefix the equality column, and 4) the prefix columns are `NOT NULL` and are constrained to a set of constant values via a `CHECK` constraint or an `IN` condition in the filter.
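To see why the cross-join can change a semi-join's output, the following small simulation may help. It is a sketch under stated assumptions, not the optimizer's implementation: the table contents, the `region` prefix column, and every function name are made up for illustration. It models `t2` with an index on `(region, a)` where `region` is constrained to two constant values.

```go
package main

import "fmt"

// innerRow is a row of the hypothetical inner table t2, indexed on (region, a).
type innerRow struct {
	region string
	a      int
}

// lookup reports whether the index contains the key (region, a).
func lookup(index []innerRow, region string, a int) bool {
	for _, r := range index {
		if r.region == region && r.a == a {
			return true
		}
	}
	return false
}

// semiJoin emits each outer value at most once if any prefix value yields a match.
func semiJoin(outer []int, index []innerRow, regions []string) []int {
	var out []int
	for _, a := range outer {
		for _, region := range regions {
			if lookup(index, region, a) {
				out = append(out, a)
				break
			}
		}
	}
	return out
}

// semiJoinWithCrossJoinedInput models the invalid plan: the input is first
// cross-joined with the constant prefix values, and the semi-join then treats
// every (a, region) pair as a separate input row.
func semiJoinWithCrossJoinedInput(outer []int, index []innerRow, regions []string) []int {
	type pair struct {
		a      int
		region string
	}
	var input []pair
	for _, a := range outer {
		for _, region := range regions {
			input = append(input, pair{a, region})
		}
	}
	var out []int
	for _, p := range input {
		if lookup(index, p.region, p.a) {
			out = append(out, p.a)
		}
	}
	return out
}

func main() {
	outer := []int{1, 2}
	index := []innerRow{{"east", 1}, {"west", 1}}
	regions := []string{"east", "west"}
	fmt.Println(semiJoin(outer, index, regions))                     // [1]
	fmt.Println(semiJoinWithCrossJoinedInput(outer, index, regions)) // [1 1]
}
```

The outer row with `a = 1` matches under both prefix values, so the cross-joined plan returns it twice, while a correct semi-join returns it once.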