opt: do not cross-join input of semi-join
This commit fixes a logical correctness bug caused when
`GenerateLookupJoins` cross-joins the input of a semi-join with a set of
constant values to constrain the prefix columns of the lookup index. The
cross-join is an invalid transformation because it increases the size of
the join's input and can increase the size of the join's output.
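To make the failure mode concrete, here is a minimal sketch in Python that models the semantics only; it is not the optimizer's code. It reuses the rows from the regression test for cockroachdb#78681 (`u(y)` and `t78681(x, y)`), and the helper `semi_join` and the tuple layout are assumptions made for illustration. A semi-join must emit each input row at most once, but once the input has been cross-joined with the constant values of `x`, one input row can match once per constant.

```python
from itertools import product


def semi_join(left, right, on):
    """Hypothetical helper: emit each left row once if any right row matches."""
    return [l for l in left if any(on(l, r) for r in right)]


# Rows taken from the regression test: u(y) and t78681(x, y).
u = [(1,), (2,)]
t = [(1, 1), (3, 1)]

# Valid plan: semi-join directly on u.y = t.y.
correct = semi_join(u, t, lambda l, r: l[0] == r[1])

# Invalid plan: cross-join the input with the values allowed by the CHECK
# constraint on x, then look up on the full index key (x, y).
consts = [(1,), (3,)]
crossed = [l + c for l, c in product(u, consts)]  # rows are (y, x)
matched = semi_join(crossed, t, lambda l, r: l[1] == r[0] and l[0] == r[1])
buggy = [(row[0],) for row in matched]  # project back to u's columns

print(correct)  # [(1,)]
print(buggy)    # [(1,), (1,)]
```

The duplicated `(1,)` row in `buggy` comes from the enlarged input: both `(y=1, x=1)` and `(y=1, x=3)` find a match, so the row for `y = 1` is emitted twice.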

We already avoid these cross-joins for left and anti-joins (see cockroachdb#59646).
When addressing those cases, the semi-join case was incorrectly assumed
to be safe.

Fixes cockroachdb#78681

Release note (bug fix): A bug has been fixed that caused the optimizer
to generate invalid query plans, which could produce incorrect query
results. The bug, which has been present since version 21.1.0, can
appear if all of the following conditions are true: 1) the query
contains a semi-join, such as queries in the form:
`SELECT * FROM t1 WHERE EXISTS (SELECT * FROM t2 WHERE t1.a = t2.a);`,
2) the inner table has an index containing the equality column, like
`t2.a` in the example query, 3) the index contains one or more
columns that prefix the equality column, and 4) the prefix columns are
`NOT NULL` and are constrained to a set of constant values via a `CHECK`
constraint or an `IN` condition in the filter.
mgartner committed Mar 29, 2022
1 parent cf7cd84 commit 2d3ed07
Showing 4 changed files with 401 additions and 264 deletions.
30 changes: 30 additions & 0 deletions pkg/sql/logictest/testdata/logic_test/lookup_join
@@ -575,6 +575,36 @@ SELECT * FROM (VALUES (1), (2)) AS u(y) WHERE NOT EXISTS (
1
2

# Regression test for #78681. Ensure that invalid lookup joins are not created
# for semi joins.
statement ok
CREATE TABLE t78681 (
x INT NOT NULL CHECK (x in (1, 3)),
y INT NOT NULL,
PRIMARY KEY (x, y)
)

# Insert stats so that a lookup semi-join is selected.
statement ok
ALTER TABLE t78681 INJECT STATISTICS '[
{
"columns": ["x"],
"created_at": "2018-05-01 1:00:00.00000+00:00",
"row_count": 10000000,
"distinct_count": 2
}
]'

statement ok
INSERT INTO t78681 VALUES (1, 1), (3, 1)

query I rowsort
SELECT * FROM (VALUES (1), (2)) AS u(y) WHERE EXISTS (
SELECT * FROM t78681 t WHERE u.y = t.y
)
----
1

statement ok
CREATE TABLE lookup_expr (
r STRING NOT NULL CHECK (r IN ('east', 'west')),