sql: tuples with NULLs don't compare sanely #12022
Comments
I'm not sure that we ever want to return NULL values for a query that has a predicate on those values. The logic for this is that the comparison would be unknown, like you pointed out, so the NULL value would never pass the filter. This is how our current ordering works (with and without indexes, just like PG), which I believe means that this issue is not needed.
Following this logic, the new issue would be making sure that that |
Tuple comparison is lexicographic, not pointwise, hence PostgreSQL does return true sometimes when comparing tuples with null:
|
That's the reason we can't rewrite y > (1, max) to y >= (2, null).
On Tue, Dec 6, 2016 at 2:58 PM, Nathan VanBenschoten wrote:
Does that change anything with respect to the initial question?
nathan=# SELECT (2, 'inf'::float) < (2, null);
?column?
----------
(1 row)
|
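The lexicographic rule discussed above can be sketched in a few lines of Python. This is only an illustration of the comparison semantics PostgreSQL documents for row constructors (stop at the first unequal or NULL pair), not CockroachDB's implementation. The helper names `row_cmp`, `row_lt`, and `row_ge` are hypothetical, `None` stands in for SQL NULL, and both tuples are assumed to have the same length.

```python
from typing import Optional, Sequence


def row_cmp(left: Sequence, right: Sequence) -> Optional[int]:
    """Compare two rows left to right, stopping at the first pair
    that is unequal or contains a NULL (None).

    Returns -1, 0, or 1, or None when the result is unknown (NULL).
    Hypothetical helper; assumes both rows have the same length.
    """
    for a, b in zip(left, right):
        if a is None or b is None:
            return None  # a NULL in the deciding pair makes the result unknown
        if a != b:
            return -1 if a < b else 1  # first unequal pair decides
    return 0  # every pair was equal


def row_lt(left: Sequence, right: Sequence) -> Optional[bool]:
    c = row_cmp(left, right)
    return None if c is None else c < 0


def row_ge(left: Sequence, right: Sequence) -> Optional[bool]:
    c = row_cmp(left, right)
    return None if c is None else c >= 0


# The first column already decides, so the trailing NULLs are never inspected.
print(row_lt((1, None), (2, None)))
# The first column ties, so the NULL in the second column is reached.
print(row_lt((2, float("inf")), (2, None)))
# A row that y > (1, max) would accept, tested against y >= (2, NULL).
print(row_ge((2, 5), (2, None)))
```

Under these assumptions, the last call shows why the rewrite to `y >= (2, null)` is not safe when evaluated as a plain comparison: a row such as `(2, 5)` ties on the first column, reaches the NULL, and yields unknown instead of true.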
It sounds like we'll need to take this lexicographical comparison into account when performing the normalization from |
Only if we don't use an index, which puts null first. We could go deeper and rewrite to y' >= (2,), dropping the last component. |
I might be missing something, but it doesn't look like we ever use indexes for inequalities between tuples. @RaduBerinde do you know if this is true, and if so, why we don't currently use them?
Regardless, dropping out the last component during index selection like you brought up seems like the correct approach. This is what I was hoping to achieve with the hacky normalization from y > (1, max) to y >= (1, null). We'll also need to take note that index selection currently adds in an implicit IS NOT NULL constraint for isolated end constraints, resulting in the behavior I described above. If/when we support the use of indexes for inequalities between tuples, similar behavior will be expected. |
Concrete test case:
returns
but |
The PR referenced above implements Nathan's "hacky normalization" strategy, documenting at length in the contract for Next/Prev why it works for now. |
@awoods187 I am bumping this issue because it's actually a correctness issue that's likely to bite us sooner rather than later. I am not sure how best to approach a solution, though. This might need input from multiple people (Nathan, Peter, Ben, Radu come to mind). |
Hm, that looks like a bug. |
@justinj's issue no longer reproduces, nor does David Eisenstat's original reproduction. @knz, if you still have context, could you please provide a minimal reproduction of the problem here? I will edit the main body to include that reproduction. If there is no longer a reproduction, let's close this issue and re-open with something more concrete. |
This PR was meant to fix it: #27885. I was not able to complete the PR back then because of limitations in the type system. I think the work can be resumed now that the code has been greatly simplified by Andy. At any rate, the examples/tests in the PR give you an idea of what's problematic. Note that the examples/tests in the PR also happen to be currently incorrect -- they are "reasonable" (I wrote them by extending the original problem scenarios in a way that was consistent and symmetric), but Radu and I later discovered that postgres actually diverges from what's reasonable. I even posted on the pg mailing list to ask "wtf" and the answer was "historical reasons". To summarize:
I'm not paged in on this today but we can discuss when I'm done with this week's other concurrent activities. |
Having reviewed this issue again, I'm going to rename it to be specific to tuple comparisons, which seem to still be somewhat messed up in the presence of nulls. @knz, if you disagree with this or have further context I'm missing, I'll politely request that you open a separate issue that's scoped just to the thing that's broken here besides tuple comparison (which I still can't figure out). Postgres:
Cockroach:
|
@yuzefovich this is related to recent work you were doing I think. Can you merge into the other one if appropriate? |
Previously, we treated all cases of `x IS NULL` as `x IS NOT DISTINCT FROM NULL`, and all cases of `x IS NOT NULL` as `x IS DISTINCT FROM NULL`. However, these transformations are not equivalent when `x` is a tuple.

If all elements of `x` are `NULL`, then `x IS NULL` should evaluate to true, but `x IS NOT DISTINCT FROM NULL` should evaluate to false. If one element of `x` is `NULL` and one is not null, then `x IS NOT NULL` should evaluate to false, but `x IS DISTINCT FROM NULL` should evaluate to true. Therefore, they are not equivalent.

Below is a table of the correct semantics for tuple expressions.

| Tuple        | IS NOT DISTINCT FROM NULL | IS NULL   | IS DISTINCT FROM NULL | IS NOT NULL |
| ------------ | ------------------------- | --------- | --------------------- | ----------- |
| (1, 1)       | false                     | false     | true                  | true        |
| (1, NULL)    | false                     | **false** | true                  | **false**   |
| (NULL, NULL) | false                     | true      | true                  | false       |

Notice that `IS NOT DISTINCT FROM NULL` is always the inverse of `IS DISTINCT FROM NULL`. However, `IS NULL` and `IS NOT NULL` are not inverses given the tuple `(1, NULL)`.

This commit introduces new tree expressions for `IS NULL` and `IS NOT NULL`. These operators have evaluation logic that is different from `IS NOT DISTINCT FROM NULL` and `IS DISTINCT FROM NULL`, respectively.

This commit also introduces new optimizer expression types, `IsTupleNull` and `IsTupleNotNull`. Normalization rules have been added for folding these expressions into boolean values when possible.

Fixes cockroachdb#46675
Informs cockroachdb#46908
Informs cockroachdb#12022

Release note (bug fix): Fixes incorrect logic for `IS NULL` and `IS NOT NULL` operators with tuples, correctly differentiating them from `IS NOT DISTINCT FROM NULL` and `IS DISTINCT FROM NULL`, respectively.
48299: sql: fix tuple IS NULL logic r=mgartner a=mgartner

Previously, we treated all cases of `x IS NULL` as `x IS NOT DISTINCT FROM NULL`, and all cases of `x IS NOT NULL` as `x IS DISTINCT FROM NULL`. However, these transformations are not equivalent when `x` is a tuple.

If all elements of `x` are `NULL`, then `x IS NULL` should evaluate to true, but `x IS NOT DISTINCT FROM NULL` should evaluate to false. If one element of `x` is `NULL` and one is not null, then `x IS NOT NULL` should evaluate to false, but `x IS DISTINCT FROM NULL` should evaluate to true. Therefore, they are not equivalent.

Below is a table of the correct semantics for tuple expressions.

| Tuple        | IS NOT DISTINCT FROM NULL | IS NULL   | IS DISTINCT FROM NULL | IS NOT NULL |
| ------------ | ------------------------- | --------- | --------------------- | ----------- |
| (1, 1)       | false                     | false     | true                  | true        |
| (1, NULL)    | false                     | **false** | true                  | **false**   |
| (NULL, NULL) | false                     | true      | true                  | false       |

Notice that `IS NOT DISTINCT FROM NULL` is always the inverse of `IS DISTINCT FROM NULL`. However, `IS NULL` and `IS NOT NULL` are not inverses given the tuple `(1, NULL)`.

This commit introduces new tree expressions for `IS NULL` and `IS NOT NULL`. These operators have evaluation logic that is different from `IS NOT DISTINCT FROM NULL` and `IS DISTINCT FROM NULL`, respectively. While an expression such as `x IS NOT DISTINCT FROM NULL` is parsed as a `tree.ComparisonExpr` with a `tree.IsNotDistinctFrom` operator, execbuilder will output the simpler `tree.IsNullExpr` when the two expressions are equivalent, that is, when `x` is not a tuple.

This commit also introduces new optimizer expression types, `IsTupleNull` and `IsTupleNotNull`. Normalization rules have been added for folding these expressions into boolean values when possible.

Fixes #46675
Informs #46908
Informs #12022

Release note (bug fix): Fixes incorrect logic for `IS NULL` and `IS NOT NULL` operators with tuples, correctly differentiating them from `IS NOT DISTINCT FROM NULL` and `IS DISTINCT FROM NULL`, respectively.

Co-authored-by: Marcus Gartner
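The semantics table in the commit message above can be restated as a small Python sketch. It models only the four predicates as the table describes them and is not the evaluation code from the PR. The function names are hypothetical, `None` stands in for SQL NULL, a Python `tuple` stands in for a SQL tuple, and the check is deliberately non-recursive (nested tuples are treated as non-NULL elements). Empty tuples are not considered.

```python
def is_null(x) -> bool:
    """x IS NULL: true for NULL itself, or for a tuple whose elements are all NULL."""
    if x is None:
        return True
    if isinstance(x, tuple):
        return all(e is None for e in x)
    return False


def is_not_null(x) -> bool:
    """x IS NOT NULL: true for a non-NULL scalar, or a tuple with no NULL elements."""
    if x is None:
        return False
    if isinstance(x, tuple):
        return all(e is not None for e in x)
    return True


def is_distinct_from_null(x) -> bool:
    """x IS DISTINCT FROM NULL: only looks at whether x as a whole is NULL."""
    return x is not None


def is_not_distinct_from_null(x) -> bool:
    """x IS NOT DISTINCT FROM NULL: always the inverse of is_distinct_from_null."""
    return x is None


# Print one row per tuple in the same column order as the table.
for row in [(1, 1), (1, None), (None, None)]:
    print(
        row,
        is_not_distinct_from_null(row),
        is_null(row),
        is_distinct_from_null(row),
        is_not_null(row),
    )
```

The `(1, None)` case is the one the commit message highlights: `is_null` and `is_not_null` both return false for it, so the two predicates cannot be implemented as negations of each other once tuples are involved.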
None of the examples in this issue repro anymore, so I'm going to close this. |
Found by @eisenstatdavid in #10475 (comment)
Jira issue: CRDB-6132