Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sql: tuples with NULLs don't compare sanely #12022

Closed
knz opened this issue Dec 5, 2016 · 48 comments
Closed

sql: tuples with NULLs don't compare sanely #12022

knz opened this issue Dec 5, 2016 · 48 comments
Labels
A-sql-pgcompat Semantic compatibility with PostgreSQL A-sql-semantics C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. S-3-erroneous-edge-case Database produces or stores erroneous data without visible error/warning, in rare edge cases. T-sql-queries SQL Queries Team
Milestone

Comments

@knz
Copy link
Contributor

knz commented Dec 5, 2016

Found by @eisenstatdavid in #10475 (comment)

The logic for not defining the previous element of (e.g.) -inf to be null is that, in a filter expression like col < -inf, the null values of col don't satisfy the filter (because null < -inf is null (unknown, technically, but we conflate unknown and boolean null) when we're using the user-accessible builtin and not .Compare), and thus should be excluded.

For arrays, I think Next(a) = a + [null] is good enough for the limited purpose of scanning an index. If we unthinkingly rewrite y > [1] to y >= [1, null], then despite the fact that [1, 2] > [1] is true, [1, 2] > [1, null] should be unknown (guessing this is what PostgreSQL does from its handling of tuples; IIRC, we don't do this because the array builtin calls .Compare, but we should).

For tuples also, we don't have a good answer. (2, null) sorts in between (1, max) and (2, min), so we should include (2, null) when there's a > (1, max) filter. This means that defining Next((1, max)) = (2, min) is bad, because then it's not safe to rewrite, e.g., y > (1, max) to y >= (2, min), because the semantics for (2, null) changes, from true to either false (.Compare) or null (as the builtin should return, but doesn't yet). Defining Next((1, max)) = (2, null) has the problem described above.

Jira issue: CRDB-6132

@knz knz added C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. A-sql-semantics labels Dec 5, 2016
@nvanbenschoten
Copy link
Member

I'm not sure that we ever want to return NULL values for a query that has a predicate on those values. The logic for this is that the comparison would be unknown, like you pointed out, so the null value would never pass the filter. This is how our current ordering works (with and without indexes, just like PG), and I believe means that this issue is not needed.

root@:26257> select a, b from t order by b;
+---+-------+
| a |   b   |
+---+-------+
| 3 | NULL  |
| 1 | false |
| 2 | true  |
+---+-------+
(3 rows)
root@:26257> select a, b from t where b > false order by b;
+---+------+
| a |  b   |
+---+------+
| 2 | true |
+---+------+
(1 row)
root@:26257> select a, b from t where b <= false order by b;
+---+-------+
| a |   b   |
+---+-------+
| 1 | false |
+---+-------+
(1 row)

Following this logic, the new issue would be making sure that that (3, null) is not returned from a query with a predicate of y > (1, max).

@eisenstatdavid
Copy link

Tuple comparison is lexicographic, not pointwise, hence PostgreSQL does return true sometimes when comparing tuples with null:

eisen=# SELECT (1, 'inf'::float) < (3, null);
 ?column?
----------
 t
(1 row)

@eisenstatdavid
Copy link

eisenstatdavid commented Dec 6, 2016 via email

@nvanbenschoten
Copy link
Member

It sounds like we'll need to take this lexicographical comparison into account when performing the normalization from y > (1, max) to y >= some tuple. I don't think that means we'll need to change the semantics of Next, but instead maybe decouple these two pieces of logic. Interestingly, normalizing y > (1, max) to y >= (1, null) would work because of what I mentioned earlier.

@eisenstatdavid
Copy link

Interestingly, normalizing y > (1, max) to y >= (1, null) would work because of what I mentioned earlier.

Only if we don't use an index, which puts null first. We could go deeper and rewrite to y' >= (2,), dropping the last component out.

@nvanbenschoten
Copy link
Member

I might be missing something, but it doesn't look like we ever use indexes for inequalities between tuples. @RaduBerinde do you know if this is true, and if so, why we don't currently use them?

root@:26257> EXPLAIN SELECT a, b FROM t WHERE (a, b) = (2, true);
+-------+------+-------------------------+
| Level | Type |       Description       |
+-------+------+-------------------------+
|     0 | scan | tt@tt_a_b_idx /2/1-/2/2 |
+-------+------+-------------------------+
(1 row)
root@:26257> EXPLAIN SELECT a, b FROM t WHERE (a, b) > (2, true);
+-------+------+-----------------+
| Level | Type |   Description   |
+-------+------+-----------------+
|     0 | scan | tt@tt_a_b_idx - |
+-------+------+-----------------+
(1 row)

Regardless, dropping out the last component during index selection like you brought up seems like the correct approach. This is what I was hoping to achieve with the hacky normalization from y > (1, max) to y >= (1, null).

We'll also need to take note that index selection currently adds in an implicit IS NOT NULL constraint for isolated end constraints, resulting in the behavior I described above. If/when we support the use of indexes for inequalities between tuples, similar behavior will be expected.

@RaduBerinde
Copy link
Member

There are various known gaps in the span generation code (we have some issues filed #6390, #6346).

@petermattis petermattis added this to the 1.0 milestone Feb 23, 2017
@cuongdo cuongdo assigned justinj and unassigned nvanbenschoten Apr 20, 2017
@eisenstatdavid
Copy link

Concrete test case:

CREATE DATABASE d;
SET DATABASE = d;

CREATE TABLE t (
  c1 INT,
  c2 INT,
  UNIQUE INDEX i (c1, c2)
);

INSERT INTO t VALUES
  (NULL, NULL), (NULL, 1), (NULL, 2),
  (1, NULL), (1, 1), (1, 2),
  (2, NULL), (2, 1), (2, 2);

SELECT * FROM t WHERE (c1, c2) > (1, 9223372036854775807);

returns

c1	c2
2	1
2	2

but 2 NULL should be included.

@eisenstatdavid
Copy link

eisenstatdavid commented May 2, 2017

The PR referenced above implements Nathan's "hacky normalization" strategy, documenting at length in the contract for Next/Prev why it works for now.

@eisenstatdavid eisenstatdavid modified the milestones: 1.1, 1.0, Later May 3, 2017
@eisenstatdavid eisenstatdavid assigned knz and unassigned eisenstatdavid Jul 27, 2017
@knz knz removed their assignment Nov 1, 2017
@knz knz modified the milestones: Later, 1.2 Nov 1, 2017
@knz
Copy link
Contributor Author

knz commented Nov 1, 2017

@awoods187 I am bumping this issue because it's actually a correctness issue that's likely to bite us earlier than later. I am not sure how to best approach it though wrt a solution. This might need input from multiple people (Nathan, Peter, Ben, Radu come to mind).

@RaduBerinde
Copy link
Member

Hm, that looks like a bug. IS NULL and similar should only return true or false.

@jordanlewis
Copy link
Member

@justinj's issue no longer reproduces, nor does David Eisenstat's original reproduction.

@knz, if you still have context, could you please provide a minimal reproduction of the problem here? I will edit the main body to include that reproduction. If there is no longer a reproduction, let's close this issue and re-open with something more concrete.

@knz
Copy link
Contributor Author

knz commented May 14, 2019

This PR was meant to fix it: #27885

I was not able to complete the PR back then because of limitations in the type system. I think the work can be resumed now that the code has been greatly simplified by Andy.

At any rate the examples/tests in the PR give you an idea of what's problematic. Note that the examples/tests in the PR also happen to be currently incorrect -- they are "reasonable" (I wrote them by extending the original problem scenarios in a way that was consistent and symmetric) but then Radu and I then later discovered that postgres actually diverges from what's reasonable. I even posted on the pg mailing list to ask "wtf" and the answer was "historical reasons".

To summarize:

I'm not paged in on this today but we can discuss when I'm done with this week's other concurrent activities.

@knz knz removed their assignment Aug 27, 2019
@knz knz added the A-sql-pgcompat Semantic compatibility with PostgreSQL label Aug 27, 2019
@jordanlewis
Copy link
Member

Having reviewed this issue again, I'm going to rename it to be specific to tuple comparisons, which seem to still be somewhat messed up in the presence of nulls. @knz, if you disagree with this or have further context I'm missing, I'll politely request that you open a separate issue that's scoped just to the thing that's broken here besides tuple comparison (which I still can't figure out).

Postgres:

jordan=# select (1, null) is null, (1, null) is not null, (1, null) is distinct from null, (1, null) is not distinct from null;
 ?column? | ?column? | ?column? | ?column?
----------+----------+----------+----------
 f        | f        | t        | f
(1 row)

Cockroach:

[email protected]:54289/d> select (1, null) is null, (1, null) is not null, (1, null) is distinct from null, (1, null) is not distinct from null;
  ?column? | ?column? | ?column? | ?column?
+----------+----------+----------+----------+
   false   |   true   |   true   |  false
(1 row)

@jordanlewis jordanlewis changed the title sql: the ordering of NULL is internally inconsistent sql: tuples with NULLs don't compare sanely Sep 19, 2019
@jordanlewis
Copy link
Member

@yuzefovich this is related to recent work you were doing I think. Can you merge into the other one if appropriate?

@yuzefovich
Copy link
Member

I handed over #46675 to the optimizer team. Regarding merging this issue with that one, I'm worried that this issue is a superset of #46675, so I don't want to close this.

@yuzefovich yuzefovich removed their assignment Apr 10, 2020
mgartner added a commit to mgartner/cockroach that referenced this issue May 11, 2020
Previously, we treated all cases of `x IS NULL` as `x IS NOT DISTINCT
FROM NULL`, and all cases of `x IS NOT NULL` as `x IS DISTINCT FROM
NULL`. However, these transformations are not equivalent when `x` is a
tuple.

If all elements of `x` are `NULL`, then `x IS NULL` should evaluate to
true, but `x IS DISTINCT FROM NULL` should evaluate to false. If one
element of `x` is `NULL` and one is not null, then `x IS NOT NULL`
should evaluate to false, but `x IS DISTINCT FROM NULL` should evaluate
to true. Therefore, they are not equivalent.

Below is a table of the correct semantics for tuple expressions.

| Tuple        | IS NOT DISTINCT FROM NULL | IS NULL   | IS DISTINCT FROM NULL | IS NOT NULL |
| ------------ | ------------------------- | --------- | --------------------- | ----------- |
| (1, 1)       | false                     | false     | true                  | true        |
| (1, NULL)    | false                     | **false** | true                  | **false**   |
| (NULL, NULL) | false                     | true      | true                  | false       |

Notice that `IS NOT DISTINCT FROM NULL` is always the inverse of
`IS DISTINCT FROM NULL`. However, `IS NULL` and `IS NOT NULL` are not
inverses given the tuple `(1, NULL)`.

This commit introduces new tree expressions for `IS NULL` and `IS NOT
NULL`. These operators have evaluation logic that is different from `IS
NOT DISTINCT FROM NULL` and `IS DISTINCT FROM NULL`, respectively.

This commit also introduces new optimizer expression types,
`IsTupleNull` and `IsTupleNotNull`. Normalization rules have been added
for folding these expressions into boolean values when possible.

Fixes   cockroachdb#46675
Informs cockroachdb#46908
Informs cockroachdb#12022

Release note (bug fix): Fixes incorrect logic for `IS NULL` and `IS NOT
NULL` operators with tuples, correctly differentiating them from `IS NOT
DISTINCT FROM NULL` and `IS DISTINCT FROM NULL`, respectively.
craig bot pushed a commit that referenced this issue May 15, 2020
48299: sql: fix tuple IS NULL logic r=mgartner a=mgartner

Previously, we treated all cases of `x IS NULL` as `x IS NOT DISTINCT
FROM NULL`, and all cases of `x IS NOT NULL` as `x IS DISTINCT FROM
NULL`. However, these transformations are not equivalent when `x` is a
tuple.

If all elements of `x` are `NULL`, then `x IS NULL` should evaluate to
true, but `x IS DISTINCT FROM NULL` should evaluate to false. If one
element of `x` is `NULL` and one is not null, then `x IS NOT NULL`
should evaluate to false, but `x IS DISTINCT FROM NULL` should evaluate
to true. Therefore, they are not equivalent.

Below is a table of the correct semantics for tuple expressions.

| Tuple        | IS NOT DISTINCT FROM NULL | IS NULL   | IS DISTINCT FROM NULL | IS NOT NULL |
| ------------ | ------------------------- | --------- | --------------------- | ----------- |
| (1, 1)       | false                     | false     | true                  | true        |
| (1, NULL)    | false                     | **false** | true                  | **false**   |
| (NULL, NULL) | false                     | true      | true                  | false       |

Notice that `IS NOT DISTINCT FROM NULL` is always the inverse of
`IS DISTINCT FROM NULL`. However, `IS NULL` and `IS NOT NULL` are not
inverses given the tuple `(1, NULL)`.

This commit introduces new tree expressions for `IS NULL` and `IS NOT
NULL`. These operators have evaluation logic that is different from `IS
NOT DISTINCT FROM NULL` and `IS DISTINCT FROM NULL`, respectively. While
an expression such as `x IS NOT DISTINCT FROM NULL` is parsed as a
`tree.ComparisonExpr` with a `tree.IsNotDisinctFrom` operator,
execbuiler will output the simpler `tree.IsNullExpr` when the two
expressions are equivalent - when x is not a tuple.

This commit also introduces new optimizer expression types,
`IsTupleNull` and `IsTupleNotNull`. Normalization rules have been added
for folding these expressions into boolean values when possible.

Fixes   #46675
Informs #46908
Informs #12022

Release note (bug fix): Fixes incorrect logic for `IS NULL` and `IS NOT
NULL` operators with tuples, correctly differentiating them from `IS NOT
DISTINCT FROM NULL` and `IS DISTINCT FROM NULL`, respectively.


Co-authored-by: Marcus Gartner <[email protected]>
@jlinder jlinder added the T-sql-queries SQL Queries Team label Jun 16, 2021
@jordanlewis
Copy link
Member

None of the examples in this issue repro anymore, so I'm going to close this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-sql-pgcompat Semantic compatibility with PostgreSQL A-sql-semantics C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. S-3-erroneous-edge-case Database produces or stores erroneous data without visible error/warning, in rare edge cases. T-sql-queries SQL Queries Team
Projects
None yet
Development

Successfully merging a pull request may close this issue.