Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

opt: left lookup join returns incorrect results due to check constraints #59615

Closed
rytaft opened this issue Jan 30, 2021 · 3 comments · Fixed by #59646
Closed

opt: left lookup join returns incorrect results due to check constraints #59615

rytaft opened this issue Jan 30, 2021 · 3 comments · Fixed by #59646
Labels
A-multiregion Related to multi-region A-sql-optimizer SQL logical planning and optimizations. C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. S-0-visible-logical-error Database stores inconsistent data in some cases, or queries return invalid results silently. T-multiregion

Comments

@rytaft
Copy link
Collaborator

rytaft commented Jan 30, 2021

Consider the following example. First create a table with a check constraint and insert some data:

CREATE TABLE t (
  x INT NOT NULL CHECK (x IN (1, 3)),
  y INT NOT NULL,
  z INT,
  PRIMARY KEY (x, y)
);

INSERT INTO t SELECT 1, generate_series(1, 1000);
INSERT INTO t SELECT 3, generate_series(1, 1000);
ANALYZE t;

Now run the following query:

> SELECT * FROM (VALUES (1), (2000)) AS u (y) LEFT JOIN t ON u.y = t.y;
   y   |  x   |  y   |  z
-------+------+------+-------
     1 |    1 |    1 | NULL
     1 |    3 |    1 | NULL
  2000 | NULL | NULL | NULL
  2000 | NULL | NULL | NULL
(4 rows)

These results are incorrect. There should only be one row with u.y=2000, not two. The reason this happens is because in the GenerateLookupJoins exploration rule, the optimizer creates a cross join with the valid values of the check constraint in order to create a lookup join with the primary index:

> EXPLAIN (OPT) SELECT * FROM (VALUES (1), (2000)) AS u (y) LEFT JOIN t ON u.y = t.y;
             info
-------------------------------
  left-join (lookup t)
   ├── lookup columns are key
   ├── inner-join (cross)
   │    ├── values
   │    │    ├── (1,)
   │    │    └── (2000,)
   │    ├── values
   │    │    ├── (1,)
   │    │    └── (3,)
   │    └── filters (true)
   └── filters (true)
(11 rows)

If we remove the check constraint and try again, we now get correct results due to using a different plan:

> ALTER TABLE t DROP CONSTRAINT check_x;
> SELECT * FROM (VALUES (1), (2000)) AS u (y) LEFT JOIN t ON u.y = t.y;
   y   |  x   |  y   |  z
-------+------+------+-------
     1 |    1 |    1 | NULL
     1 |    3 |    1 | NULL
  2000 | NULL | NULL | NULL
(3 rows)

> EXPLAIN (OPT) SELECT * FROM (VALUES (1), (2000)) AS u (y) LEFT JOIN t ON u.y = t.y;
          info
-------------------------
  right-join (hash)
   ├── scan t
   ├── values
   │    ├── (1,)
   │    └── (2000,)
   └── filters
        └── column1 = y
(7 rows)

Although the cross join plan is correct in the case of an inner join, it is not correct for left joins. As a first step, we should disable the creation of the cross join as input to left lookup joins. Eventually, however, we should still be able to create a left lookup join in this case. I think that requires execution support to allow performing each lookup with multiple spans instead of a single equality condition.

This has implications for multi-region support, because the crdb_region column behaves the same way as column x in this example. Left joins often show up when building UPSERTs, which is how I ran into this issue in the first place. We need to fix this problem to avoid upserting incorrect data, but we should really find a way to support left lookup joins for this case for performance reasons -- to avoid requiring a full table scan on the primary index.

cc @mgartner @RaduBerinde

@rytaft rytaft added C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. A-sql-optimizer SQL logical planning and optimizations. S-0-visible-logical-error Database stores inconsistent data in some cases, or queries return invalid results silently. A-multiregion Related to multi-region T-multiregion labels Jan 30, 2021
@rytaft
Copy link
Collaborator Author

rytaft commented Jan 30, 2021

One stopgap that wouldn't require execution support would be to construct a UNION of several lookup joins.

@mgartner
Copy link
Collaborator

mgartner commented Feb 1, 2021

Wow, great catch. I have to be more careful extending these join rules. The good news is that this shouldn't be present in 20.2.

@rytaft
Copy link
Collaborator Author

rytaft commented Feb 1, 2021

Yea, only affects 21.1 alphas, so no major harm done! My fault too for not catching it as a reviewer.

craig bot pushed a commit that referenced this issue Feb 1, 2021
59646: opt: fix bug with incorrect results returned by left lookup/inverted join r=rytaft a=rytaft

This commit fixes a bug in which incorrect results could be returned for left
and anti lookup and inverted joins. This could happen when one of the prefix
columns was constrained to more than one constant value, either due to a check
constraint or an `IN` clause. The current implementation used to support lookup
and inverted joins for this case creates a cross-join between the constant
values and the input, thus artifically increasing the size of the input to the
join. This works for inner and semi joins, but it is not correct for left or
anti joins, because non-matching input rows will be returned multiple times in
the output, which is incorrect.

This commit fixes the bug by avoiding creating lookup or inverted joins in
these cases. However, this is suboptimal for performance, and in a future
commit we should try to find a way to create left/anti inverted/lookup joins
safely for these cases.

Fixes #59615

Release note (bug fix): Fixed a bug in which incorrect results could be
returned for left and anti joins. This could happen when one of the columns on
one side of the join was constrained to multiple constant values, either due to
a check constraint or an IN clause. The bug resulted in non-matching input rows
getting returned multiple times in the output, which is incorrect. This bug
only affected previous alpha releases of 21.1, and has now been fixed.

Co-authored-by: Rebecca Taft <[email protected]>
@craig craig bot closed this as completed in 11a600b Feb 1, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-multiregion Related to multi-region A-sql-optimizer SQL logical planning and optimizations. C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. S-0-visible-logical-error Database stores inconsistent data in some cases, or queries return invalid results silently. T-multiregion
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants