Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sql: vectorized engine incorrectly evaluates NOT (x IN ()) #88141

Closed
mgartner opened this issue Sep 19, 2022 · 3 comments · Fixed by #88195
Closed

sql: vectorized engine incorrectly evaluates NOT (x IN ()) #88141

mgartner opened this issue Sep 19, 2022 · 3 comments · Fixed by #88195
Assignees
Labels
C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. T-sql-queries SQL Queries Team

Comments

@mgartner
Copy link
Collaborator

mgartner commented Sep 19, 2022

The unoptimized-query-oracle found a bug which appears to be in the vectorized engine.

The issue is that the vectorized engine does not correctly evaluate the expression NOT (x IN ()). Under normal operation, such an expression is normalized to true by the optimizer by FoldInEmpty and FoldNotInEmpty, so the vectorized engine never evaluates this expression.

CREATE TABLE t (
  i INTERVAL
);

INSERT INTO t (i) VALUES (NULL);

SET testing_optimizer_random_seed = 6320964980407535657;

SET testing_optimizer_disable_rule_probability = 0.500000;

SET vectorize = off;

-- Returns 1 row.
SELECT i
FROM t
WHERE NOT (i IN (
    SELECT '1 day'::INTERVAL
    FROM t t1 JOIN t t2 ON true
    WHERE false
));

RESET vectorize;

-- Returns 0 rows.
SELECT i
FROM t
WHERE NOT (i IN (
    SELECT '1 day'::INTERVAL
    FROM t t1 JOIN t t2 ON true
    WHERE false
));

The query plans for both queries are the same:

  project
   ├── columns: i:1
   ├── stats: [rows=333.3333]
   ├── cost: 1158.60333
   ├── distribution: us-east1
   ├── prune: (1)
   └── select
        ├── columns: t.i:1 t.rowid:2 t.crdb_internal_mvcc_timestamp:3 t.tableoid:4
        ├── stats: [rows=333.3333]
        ├── cost: 1155.25
        ├── key: (2)
        ├── fd: (2)-->(1,3,4)
        ├── distribution: us-east1
        ├── scan t
        │    ├── columns: t.i:1 t.rowid:2 t.crdb_internal_mvcc_timestamp:3 t.tableoid:4
        │    ├── stats: [rows=1000, distinct(2)=1000, null(2)=0]
        │    ├── cost: 1145.22
        │    ├── key: (2)
        │    ├── fd: (2)-->(1,3,4)
        │    └── distribution: us-east1
        └── filters
             └── NOT (t.i:1 IN ()) [outer=(1)]

Jira issue: CRDB-19708

@mgartner mgartner added C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. release-blocker Indicates a release-blocker. Use with branch-release-2x.x label to denote which branch is blocked. branch-release-22.2 Used to mark GA and release blockers, technical advisories, and bugs for 22.2 labels Sep 19, 2022
@blathers-crl blathers-crl bot added the T-sql-queries SQL Queries Team label Sep 19, 2022
@mgartner
Copy link
Collaborator Author

I am confused why the SET testing_optimizer_random_seed = 6320964980407535657; and SET testing_optimizer_disable_rule_probability = 0.500000; are necessary to reproduce this issue when the resulting query plans are the same.

@mgartner
Copy link
Collaborator Author

Ahh, it looks like the failure is only possible when some normalization rules do not fire. I'll narrow down which ones. This is most likely not a release blocker.

@mgartner mgartner changed the title roachtest: unoptimized-query-oracle IN result from vectorized engine sql: vectorized engine incorrectly evaluates NOT (x INT ()) Sep 19, 2022
@mgartner mgartner changed the title sql: vectorized engine incorrectly evaluates NOT (x INT ()) sql: vectorized engine incorrectly evaluates NOT (x IN ()) Sep 19, 2022
@mgartner
Copy link
Collaborator Author

I've confirmed that when FoldInEmpty and FoldNotInEmpty are disabled on v22.1.7, the issue is present. We should fix this, but I don't think it's a release blocker. I'll remove the label. I'll also mark these two rules as required during costfuzz and unoptimized-query-oracle so that we don't continue to rediscover this bug until the issue is fixed.

@mgartner mgartner removed release-blocker Indicates a release-blocker. Use with branch-release-2x.x label to denote which branch is blocked. branch-release-22.2 Used to mark GA and release blockers, technical advisories, and bugs for 22.2 blocks-22.2.0-beta.1 labels Sep 19, 2022
mgartner added a commit to mgartner/cockroach that referenced this issue Sep 19, 2022
These normalization rules are essential for costfuzz and
unoptimized-query-oracle tests until cockroachdb#88141 is fixed.

Release note: None
craig bot pushed a commit that referenced this issue Sep 19, 2022
88075: codeowners, roachtest, team: rename bulk-io to disaster-recovery r=celiala a=celiala

Reflecting `bulk-io` to `disaster-recovery` team rename in:
- github CODEOWNERS
- pkg/cmd/roachtest owners
- pkg/internal/team team
- TEAMS.yaml

Release note: None

Partially fixes: DEVINFHD-652

88152: opt: mark FoldInEmpty and FoldNotInEmpty as essential r=mgartner a=mgartner

These normalization rules are essential for costfuzz and unoptimized-query-oracle tests until #88141 is fixed.

Release note: None

Co-authored-by: Celia La <[email protected]>
Co-authored-by: Marcus Gartner <[email protected]>
mgartner added a commit to mgartner/cockroach that referenced this issue Sep 19, 2022
These normalization rules are essential for costfuzz and
unoptimized-query-oracle tests until cockroachdb#88141 is fixed.

Release note: None
@yuzefovich yuzefovich self-assigned this Sep 19, 2022
@craig craig bot closed this as completed in #88195 Sep 20, 2022
@craig craig bot closed this as completed in f363dba Sep 20, 2022
@mgartner mgartner moved this to Done in SQL Queries Jul 24, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. T-sql-queries SQL Queries Team
Projects
Archived in project
Development

Successfully merging a pull request may close this issue.

3 participants