Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

opt: fix bug with incorrect results returned by left lookup/inverted join #59646

Merged
merged 1 commit into from
Feb 1, 2021

Conversation

rytaft
Copy link
Collaborator

@rytaft rytaft commented Feb 1, 2021

This commit fixes a bug in which incorrect results could be returned for left
and anti lookup and inverted joins. This could happen when one of the prefix
columns was constrained to more than one constant value, either due to a check
constraint or an IN clause. The current implementation used to support lookup
and inverted joins for this case creates a cross-join between the constant
values and the input, thus artifically increasing the size of the input to the
join. This works for inner and semi joins, but it is not correct for left or
anti joins, because non-matching input rows will be returned multiple times in
the output, which is incorrect.

This commit fixes the bug by avoiding creating lookup or inverted joins in
these cases. However, this is suboptimal for performance, and in a future
commit we should try to find a way to create left/anti inverted/lookup joins
safely for these cases.

Fixes #59615

Release note (bug fix): Fixed a bug in which incorrect results could be
returned for left and anti joins. This could happen when one of the columns on
one side of the join was constrained to multiple constant values, either due to
a check constraint or an IN clause. The bug resulted in non-matching input rows
getting returned multiple times in the output, which is incorrect. This bug
only affected previous alpha releases of 21.1, and has now been fixed.

@rytaft rytaft requested a review from a team as a code owner February 1, 2021 15:09
@cockroach-teamcity
Copy link
Member

This change is Reviewable

…join

This commit fixes a bug in which incorrect results could be returned for left
and anti lookup and inverted joins. This could happen when one of the prefix
columns was constrained to more than one constant value, either due to a check
constraint or an IN clause. The current implementation used to support lookup
and inverted joins for this case creates a cross-join between the constant
values and the input, thus artifically increasing the size of the input to the
join. This works for inner and semi joins, but it is not correct for left or
anti joins, because non-matching input rows will be returned multiple times in
the output, which is incorrect.

This commit fixes the bug by avoiding creating lookup or inverted joins in
these cases. However, this is suboptimal for performance, and in a future
commit we should try to find a way to create left/anti inverted/lookup joins
safely for these cases.

Fixes cockroachdb#59615

Release note (bug fix): Fixed a bug in which incorrect results could be
returned for left and anti joins. This could happen when one of the columns on
one side of the join was constrained to multiple constant values, either due to
a check constraint or an IN clause. The bug resulted in non-matching input rows
getting returned multiple times in the output, which is incorrect. This bug
only affected previous alpha releases of 21.1, and has now been fixed.
Copy link
Collaborator

@mgartner mgartner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for fixing this! ❤️ :lgtm:

Reviewed 5 of 5 files at r1.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @RaduBerinde)

@rytaft
Copy link
Collaborator Author

rytaft commented Feb 1, 2021

TFTR!

bors r+

@craig
Copy link
Contributor

craig bot commented Feb 1, 2021

Build succeeded:

@craig craig bot merged commit 03a06b0 into cockroachdb:master Feb 1, 2021
mgartner added a commit to mgartner/cockroach that referenced this pull request Mar 28, 2022
This commit fixes a logical correctness bug caused when
`GenerateLookupJoins` cross-joins the input of a semi-join with a set of
constant values to constrain the prefix columns of the lookup index. The
cross-join is an invalid transformation because it increases the size of
the join's input and can increase the size of the join's output.

We already avoid these cross-joins for left and anti-joins (see cockroachdb#59646).
When addressing those cases, the semi-join case was incorrectly assumed
to be safe.

Fixes cockroachdb#78681

Release note (bug fix): A bug has been fixed which caused the optimizer
to generate invalid query plans which could result in incorrect query
results. The bug, which has been present since version 21.1.0, can
appear under all of the following conditions: 1) the query contains a
semi-join, such as queries in the form:
`SELECT * FROM t WHERE EXISTS (SELECT * FROM t2 WHERE t.a = t2.a`)`;
2) the inner table had an index containing the equality column, like
`t2.a` in the example query; 3) the index contained one or more
columns that prefix the equality column; and 4) the prefix columns are
`NOT NULL` and are constrained to a set of constant values via a `CHECK`
constraint or an `IN` condition in the filter.
mgartner added a commit to mgartner/cockroach that referenced this pull request Mar 29, 2022
This commit fixes a logical correctness bug caused when
`GenerateLookupJoins` cross-joins the input of a semi-join with a set of
constant values to constrain the prefix columns of the lookup index. The
cross-join is an invalid transformation because it increases the size of
the join's input and can increase the size of the join's output.

We already avoid these cross-joins for left and anti-joins (see cockroachdb#59646).
When addressing those cases, the semi-join case was incorrectly assumed
to be safe.

Fixes cockroachdb#78681

Release note (bug fix): A bug has been fixed which caused the optimizer
to generate invalid query plans which could result in incorrect query
results. The bug, which has been present since version 21.1.0, can
appear under all of the following conditions: 1) the query contains a
semi-join, such as queries in the form:
`SELECT * FROM t1 WHERE EXISTS (SELECT * FROM t2 WHERE t1.a = t2.a`);
2) the inner table has an index containing the equality column, like
`t2.a` in the example query; 3) the index contains one or more
columns that prefix the equality column; and 4) the prefix columns are
`NOT NULL` and are constrained to a set of constant values via a `CHECK`
constraint or an `IN` condition in the filter.
mgartner added a commit to mgartner/cockroach that referenced this pull request Mar 29, 2022
This commit fixes a logical correctness bug caused when
`GenerateLookupJoins` cross-joins the input of a semi-join with a set of
constant values to constrain the prefix columns of the lookup index. The
cross-join is an invalid transformation because it increases the size of
the join's input and can increase the size of the join's output.

We already avoid these cross-joins for left and anti-joins (see cockroachdb#59646).
When addressing those cases, the semi-join case was incorrectly assumed
to be safe.

Fixes cockroachdb#78681

Release note (bug fix): A bug has been fixed which caused the optimizer
to generate invalid query plans which could result in incorrect query
results. The bug, which has been present since version 21.1.0, can
appear if all of the following conditions are true: 1) the query
contains a semi-join, such as queries in the form:
`SELECT * FROM t1 WHERE EXISTS (SELECT * FROM t2 WHERE t1.a = t2.a);`,
2) the inner table has an index containing the equality column, like
`t2.a` in the example query, 3) the index contains one or more
columns that prefix the equality column, and 4) the prefix columns are
`NOT NULL` and are constrained to a set of constant values via a `CHECK`
constraint or an `IN` condition in the filter.
mgartner added a commit to mgartner/cockroach that referenced this pull request Mar 29, 2022
This commit fixes a logical correctness bug caused when
`GenerateLookupJoins` cross-joins the input of a semi-join with a set of
constant values to constrain the prefix columns of the lookup index. The
cross-join is an invalid transformation because it increases the size of
the join's input and can increase the size of the join's output.

We already avoid these cross-joins for left and anti-joins (see cockroachdb#59646).
When addressing those cases, the semi-join case was incorrectly assumed
to be safe.

Fixes cockroachdb#78681

Release note (bug fix): A bug has been fixed which caused the optimizer
to generate invalid query plans which could result in incorrect query
results. The bug, which has been present since version 21.1.0, can
appear if all of the following conditions are true: 1) the query
contains a semi-join, such as queries in the form:
`SELECT * FROM t1 WHERE EXISTS (SELECT * FROM t2 WHERE t1.a = t2.a);`,
2) the inner table has an index containing the equality column, like
`t2.a` in the example query, 3) the index contains one or more
columns that prefix the equality column, and 4) the prefix columns are
`NOT NULL` and are constrained to a set of constant values via a `CHECK`
constraint or an `IN` condition in the filter.
mgartner added a commit to mgartner/cockroach that referenced this pull request Mar 29, 2022
This commit fixes a logical correctness bug caused when
`GenerateLookupJoins` cross-joins the input of a semi-join with a set of
constant values to constrain the prefix columns of the lookup index. The
cross-join is an invalid transformation because it increases the size of
the join's input and can increase the size of the join's output.

We already avoid these cross-joins for left and anti-joins (see cockroachdb#59646).
When addressing those cases, the semi-join case was incorrectly assumed
to be safe.

Fixes cockroachdb#78681

Release note (bug fix): A bug has been fixed which caused the optimizer
to generate invalid query plans which could result in incorrect query
results. The bug, which has been present since version 21.1.0, can
appear if all of the following conditions are true: 1) the query
contains a semi-join, such as queries in the form:
`SELECT * FROM t1 WHERE EXISTS (SELECT * FROM t2 WHERE t1.a = t2.a);`,
2) the inner table has an index containing the equality column, like
`t2.a` in the example query, 3) the index contains one or more
columns that prefix the equality column, and 4) the prefix columns are
`NOT NULL` and are constrained to a set of constant values via a `CHECK`
constraint or an `IN` condition in the filter.
craig bot pushed a commit that referenced this pull request Mar 29, 2022
78418: workload: support fixture import in multi-tenant mode r=nvanbenschoten a=nvanbenschoten

Fixes #75449.

This commit fixes fixture import into tenants, which do not have access to the `crdb_internal.gossip_liveness` table. This fixes the following:
- `[cockroach] workload fixtures import`
- `[cockroach] workload init --data-loader=IMPORT`

Release justification: low-risk change that improves workload functionality.

78626: backupccl: add deprecation warning for SHOW BACKUP without subdir r=adityamaru,benbardin a=msbutler

Informs: #78153

Release note (sql change): The `SHOW BACKUP` cmd  without the `IN` keyword to
specify a subdirectory is deprecated and will be removed from a future release.
Users are recommended to only create collection based backups and view them
with `SHOW BACKUP FROM <backup> IN <collection>`

78653: sql: add logging for grant_role r=RichardJCai a=RichardJCai

GRANT ROLE now logs to the logpb.Channel_USER_ADMIN channel.
Example: GRANT role1, role2 TO role3, role4 logs
"RoleNames":"‹role1›",‹role2›","Members":"‹role3›",‹ role4›"

Release note: None

Fixes #78245

78685: opt: do not cross-join input of semi-join r=mgartner a=mgartner

This commit fixes a logical correctness bug caused when
`GenerateLookupJoins` cross-joins the input of a semi-join with a set of
constant values to constrain the prefix columns of the lookup index. The
cross-join is an invalid transformation because it increases the size of
the join's input and can increase the size of the join's output.

We already avoid these cross-joins for left and anti-joins (see #59646).
When addressing those cases, the semi-join case was incorrectly assumed
to be safe.

Fixes #78681

Release note (bug fix): A bug has been fixed which caused the optimizer
to generate invalid query plans which could result in incorrect query
results. The bug, which has been present since version 21.1.0, can
appear if all of the following conditions are true: 1) the query
contains a semi-join, such as queries in the form:
`SELECT * FROM t1 WHERE EXISTS (SELECT * FROM t2 WHERE t1.a = t2.a);`,
2) the inner table has an index containing the equality column, like
`t2.a` in the example query, 3) the index contains one or more
columns that prefix the equality column, and 4) the prefix columns are
`NOT NULL` and are constrained to a set of constant values via a `CHECK`
constraint or an `IN` condition in the filter.


78940: backupccl: fix nil pointer exception in restore OnFailOrCancel r=dt,stevendanna a=adityamaru

The execCfg field on the restore resumer was not being set correctly
in the OnFailOrCancel hook. This would lead to a nil pointer exception
during cleanup.

Other jobs such as import, backup use the execCfg directly off the
JobExecContext instead of storing it on the resumer. An alternative to
this fix could be to change all `r.ExecCfg` to use the execCfg on the JobExecContext.

Fixes: #76720

Release note (bug fix): Fixes a nil pointer exception during the
cleanup of a failed or cancelled restore job.

78962: kvserver: make the StoreRebalancer interval a cluster setting r=aayushshah15 a=aayushshah15

Release note (ops change): the `kv.allocator.load_based_rebalancing_interval`
cluster setting now lets operators choose the interval at which each store in the
cluster will check for load-based lease or replica rebalancing opportunities.


78971: sql, migration: skip flaky TestPublicSchemaMigration test r=ajwerner a=RichardJCai

Release note: None

Co-authored-by: Nathan VanBenschoten <[email protected]>
Co-authored-by: Michael Butler <[email protected]>
Co-authored-by: richardjcai <[email protected]>
Co-authored-by: Marcus Gartner <[email protected]>
Co-authored-by: Aditya Maru <[email protected]>
Co-authored-by: Aayush Shah <[email protected]>
mgartner added a commit to mgartner/cockroach that referenced this pull request Mar 29, 2022
This commit fixes a logical correctness bug caused when
`GenerateLookupJoins` cross-joins the input of a semi-join with a set of
constant values to constrain the prefix columns of the lookup index. The
cross-join is an invalid transformation because it increases the size of
the join's input and can increase the size of the join's output.

We already avoid these cross-joins for left and anti-joins (see cockroachdb#59646).
When addressing those cases, the semi-join case was incorrectly assumed
to be safe.

Fixes cockroachdb#78681

Release note (bug fix): A bug has been fixed which caused the optimizer
to generate invalid query plans which could result in incorrect query
results. The bug, which has been present since version 21.1.0, can
appear if all of the following conditions are true: 1) the query
contains a semi-join, such as queries in the form:
`SELECT * FROM t1 WHERE EXISTS (SELECT * FROM t2 WHERE t1.a = t2.a);`,
2) the inner table has an index containing the equality column, like
`t2.a` in the example query, 3) the index contains one or more
columns that prefix the equality column, and 4) the prefix columns are
`NOT NULL` and are constrained to a set of constant values via a `CHECK`
constraint or an `IN` condition in the filter.
blathers-crl bot pushed a commit that referenced this pull request Mar 29, 2022
This commit fixes a logical correctness bug caused when
`GenerateLookupJoins` cross-joins the input of a semi-join with a set of
constant values to constrain the prefix columns of the lookup index. The
cross-join is an invalid transformation because it increases the size of
the join's input and can increase the size of the join's output.

We already avoid these cross-joins for left and anti-joins (see #59646).
When addressing those cases, the semi-join case was incorrectly assumed
to be safe.

Fixes #78681

Release note (bug fix): A bug has been fixed which caused the optimizer
to generate invalid query plans which could result in incorrect query
results. The bug, which has been present since version 21.1.0, can
appear if all of the following conditions are true: 1) the query
contains a semi-join, such as queries in the form:
`SELECT * FROM t1 WHERE EXISTS (SELECT * FROM t2 WHERE t1.a = t2.a);`,
2) the inner table has an index containing the equality column, like
`t2.a` in the example query, 3) the index contains one or more
columns that prefix the equality column, and 4) the prefix columns are
`NOT NULL` and are constrained to a set of constant values via a `CHECK`
constraint or an `IN` condition in the filter.
mgartner added a commit that referenced this pull request Mar 29, 2022
This commit fixes a logical correctness bug caused when
`GenerateLookupJoins` cross-joins the input of a semi-join with a set of
constant values to constrain the prefix columns of the lookup index. The
cross-join is an invalid transformation because it increases the size of
the join's input and can increase the size of the join's output.

We already avoid these cross-joins for left and anti-joins (see #59646).
When addressing those cases, the semi-join case was incorrectly assumed
to be safe.

Fixes #78681

Release note (bug fix): A bug has been fixed which caused the optimizer
to generate invalid query plans which could result in incorrect query
results. The bug, which has been present since version 21.1.0, can
appear if all of the following conditions are true: 1) the query
contains a semi-join, such as queries in the form:
`SELECT * FROM t1 WHERE EXISTS (SELECT * FROM t2 WHERE t1.a = t2.a);`,
2) the inner table has an index containing the equality column, like
`t2.a` in the example query, 3) the index contains one or more
columns that prefix the equality column, and 4) the prefix columns are
`NOT NULL` and are constrained to a set of constant values via a `CHECK`
constraint or an `IN` condition in the filter.
fqazi pushed a commit to fqazi/cockroach that referenced this pull request Apr 4, 2022
This commit fixes a logical correctness bug caused when
`GenerateLookupJoins` cross-joins the input of a semi-join with a set of
constant values to constrain the prefix columns of the lookup index. The
cross-join is an invalid transformation because it increases the size of
the join's input and can increase the size of the join's output.

We already avoid these cross-joins for left and anti-joins (see cockroachdb#59646).
When addressing those cases, the semi-join case was incorrectly assumed
to be safe.

Fixes cockroachdb#78681

Release note (bug fix): A bug has been fixed which caused the optimizer
to generate invalid query plans which could result in incorrect query
results. The bug, which has been present since version 21.1.0, can
appear if all of the following conditions are true: 1) the query
contains a semi-join, such as queries in the form:
`SELECT * FROM t1 WHERE EXISTS (SELECT * FROM t2 WHERE t1.a = t2.a);`,
2) the inner table has an index containing the equality column, like
`t2.a` in the example query, 3) the index contains one or more
columns that prefix the equality column, and 4) the prefix columns are
`NOT NULL` and are constrained to a set of constant values via a `CHECK`
constraint or an `IN` condition in the filter.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

opt: left lookup join returns incorrect results due to check constraints
3 participants