-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
sql: wrong implementation of NULL predicate for row value expressions #46675
Comments
Thanks @lukaseder for the report - definitely a bug here. |
As it turns out, this seems not that easy to fix. The root of the problem, AFAIU, is that
However, in our comparison resolution we transform |
By the way, we have an old issue (#12022) with the last known reproduction being exactly this bug. I have a feeling that that issue describes a bigger set of bugs, so I'm not closing that one as duplicate, but it'd be great to see whether that issue is, in fact, a superset of this issue. |
Previously, we treated all cases of `x IS NULL` as `x IS NOT DISTINCT FROM NULL`, and all cases of `x IS NOT NULL` as `x IS DISTINCT FROM NULL`. However, these transformations are not equivalent when `x` is a tuple. If all elements of `x` are `NULL`, then `x IS NULL` should evaluate to true, but `x IS DISTINCT FROM NULL` should evaluate to false. If one element of `x` is `NULL` and one is not null, then `x IS NOT NULL` should evaluate to false, but `x IS DISTINCT FROM NULL` should evaluate to true. Therefore, they are not equivalent. Below is a table of the correct semantics for tuple expressions. | Tuple | IS NOT DISTINCT FROM NULL | IS NULL | IS DISTINCT FROM NULL | IS NOT NULL | | ------------ | ------------------------- | --------- | --------------------- | ----------- | | (1, 1) | false | false | true | true | | (1, NULL) | false | **false** | true | **false** | | (NULL, NULL) | false | true | true | false | Notice that `IS NOT DISTINCT FROM NULL` is always the inverse of `IS DISTINCT FROM NULL`. However, `IS NULL` and `IS NOT NULL` are not inverses given the tuple `(1, NULL)`. This commit introduces new tree expressions for `IS NULL` and `IS NOT NULL`. These operators have evaluation logic that is different from `IS NOT DISTINCT FROM NULL` and `IS DISTINCT FROM NULL`, respectively. This commit also introduces new optimizer expression types, `IsTupleNull` and `IsTupleNotNull`. Normalization rules have been added for folding these expressions into boolean values when possible. Fixes cockroachdb#46675 Informs cockroachdb#46908 Informs cockroachdb#12022 Release note (bug fix): Fixes incorrect logic for `IS NULL` and `IS NOT NULL` operators with tuples, correctly differentiating them from `IS NOT DISTINCT FROM NULL` and `IS DISTINCT FROM NULL`, respectively.
Prior to this commit, when hoisting Any expressions like `<left> = ANY (SELECT <right> ...)`, we constructed `(IsNot <left|right> Null)` expressions which are equivalent to `<left|right> IS DISTINCT FROM NULL`. As discovered in cockroachdb#46675, these expressions have different behavior than `<left> IS NOT NULL` when `<left>` is a tuple. As a result, the hoisting transformations could construct invalid plans that cause incorrect results. This commit fixes the issue by using `IsTupleNotNull` expressions when `<left>` and `<right> are tupleq. Fixes cockroachdb#98691 Release note (bug fix): A bug has been fixes that caused incorrect results of ANY comparisons of tuples. For example, an expression like `(x, y) = ANY (SELECT a, b FROM t WHERE ...)` could return `true` instead of the correct result of `NULL` when `x` and `y` were `NULL`, or `a` and `b` were `NULL`. This could only occur if the subquery was correlated, i.e., it references columns from the outer part of the query. This bug was present since the cost-based optimizer was introduced in version 2.1.
98517: multitenant: add AdminMerge capability r=knz a=ecwall Fixes #95138 AdminMerge is currently only called by the system tenant even though it is named similarly to other Admin* functions so it does not need its own capability for now. This changes its required capability from noCapCheckNeeded to onlySystemTenant to prevent secondary tenants from calling it. Release note: None 98580: rpc: bump threshold for latency jump reporting r=erikgrinaker a=tbg For months I've seen this misfire in nearly every single log line I've looked at, and I've had to grep it out in many L2 incidents. Maybe it works better when we suppress it for latencies <=50ms. Touches #96262. Fixes #98066. Epic: none Release note: None 98688: colexecbase: fix a recently introduced bug with identity cast r=yuzefovich a=yuzefovich This commit fixes a recently introduced bug that can occur when we're randomizing `coldata.BatchSize()` (which we do in tests). In particular, we capped a global singleton at one batch size value, but later we can change it to a higher value, which could lead to index out of bounds. This is now fixed by always using the max batch size. Fixes: #98660. Release note: None 98700: opt: fix hoist of ANY comparison with tuples r=mgartner a=mgartner #### opt: fix hoist of ANY comparison with tuples Prior to this commit, when hoisting Any expressions like `<left> = ANY (SELECT <right> ...)`, we constructed `(IsNot <left|right> Null)` expressions which are equivalent to `<left|right> IS DISTINCT FROM NULL`. As discovered in #46675, these expressions have different behavior than `<left> IS NOT NULL` when `<left>` is a tuple. As a result, the hoisting transformations could construct invalid plans that cause incorrect results. This commit fixes the issue by using `IsTupleNotNull` expressions when `<left>` and `<right> are tupleq. Fixes #98691 Release note (bug fix): A bug has been fixes that caused incorrect results of ANY comparisons of tuples. For example, an expression like `(x, y) = ANY (SELECT a, b FROM t WHERE ...)` could return `true` instead of the correct result of `NULL` when `x` and `y` were `NULL`, or `a` and `b` were `NULL`. This could only occur if the subquery was correlated, i.e., it references columns from the outer part of the query. This bug was present since the cost-based optimizer was introduced in version 2.1. Co-authored-by: Evan Wall <[email protected]> Co-authored-by: Tobias Grieger <[email protected]> Co-authored-by: Yahor Yuzefovich <[email protected]> Co-authored-by: Marcus Gartner <[email protected]>
Prior to this commit, when hoisting Any expressions like `<left> = ANY (SELECT <right> ...)`, we constructed `(IsNot <left|right> Null)` expressions which are equivalent to `<left|right> IS DISTINCT FROM NULL`. As discovered in #46675, these expressions have different behavior than `<left> IS NOT NULL` when `<left>` is a tuple. As a result, the hoisting transformations could construct invalid plans that cause incorrect results. This commit fixes the issue by using `IsTupleNotNull` expressions when `<left>` and `<right> are tupleq. Fixes #98691 Release note (bug fix): A bug has been fixes that caused incorrect results of ANY comparisons of tuples. For example, an expression like `(x, y) = ANY (SELECT a, b FROM t WHERE ...)` could return `true` instead of the correct result of `NULL` when `x` and `y` were `NULL`, or `a` and `b` were `NULL`. This could only occur if the subquery was correlated, i.e., it references columns from the outer part of the query. This bug was present since the cost-based optimizer was introduced in version 2.1.
Prior to this commit, when hoisting Any expressions like `<left> = ANY (SELECT <right> ...)`, we constructed `(IsNot <left|right> Null)` expressions which are equivalent to `<left|right> IS DISTINCT FROM NULL`. As discovered in cockroachdb#46675, these expressions have different behavior than `<left> IS NOT NULL` when `<left>` is a tuple. As a result, the hoisting transformations could construct invalid plans that cause incorrect results. This commit fixes the issue by using `IsTupleNotNull` expressions when `<left>` and `<right> are tupleq. Fixes cockroachdb#98691 Release note (bug fix): A bug has been fixes that caused incorrect results of ANY comparisons of tuples. For example, an expression like `(x, y) = ANY (SELECT a, b FROM t WHERE ...)` could return `true` instead of the correct result of `NULL` when `x` and `y` were `NULL`, or `a` and `b` were `NULL`. This could only occur if the subquery was correlated, i.e., it references columns from the outer part of the query. This bug was present since the cost-based optimizer was introduced in version 2.1. Release note (<category, see below>): <what> <show> <why>
Prior to this commit, when hoisting Any expressions like `<left> = ANY (SELECT <right> ...)`, we constructed `(IsNot <left|right> Null)` expressions which are equivalent to `<left|right> IS DISTINCT FROM NULL`. As discovered in #46675, these expressions have different behavior than `<left> IS NOT NULL` when `<left>` is a tuple. As a result, the hoisting transformations could construct invalid plans that cause incorrect results. This commit fixes the issue by using `IsTupleNotNull` expressions when `<left>` and `<right> are tupleq. Fixes #98691 Release note (bug fix): A bug has been fixes that caused incorrect results of ANY comparisons of tuples. For example, an expression like `(x, y) = ANY (SELECT a, b FROM t WHERE ...)` could return `true` instead of the correct result of `NULL` when `x` and `y` were `NULL`, or `a` and `b` were `NULL`. This could only occur if the subquery was correlated, i.e., it references columns from the outer part of the query. This bug was present since the cost-based optimizer was introduced in version 2.1.
Describe the problem
It appears that the NULL predicate is not implemented correctly (or not at all?) for row value expressions.
To Reproduce
Run this query:
And get the following, wrong result:
Expected behavior
It should either work as in the SQL standard (ISO/IEC 9075-2:2016(E)
8.8 <null predicate>) / PostgreSQL:
Or raise an error
Environment:
The text was updated successfully, but these errors were encountered: