Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sql: unoptimized, vectorized query incorrectly returns a row when it should return none #87434

Closed
mgartner opened this issue Sep 6, 2022 · 3 comments · Fixed by #87451
Closed
Assignees
Labels
C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. T-sql-queries SQL Queries Team

Comments

@mgartner
Copy link
Collaborator

mgartner commented Sep 6, 2022

I've reduce this failure to:

CREATE TABLE t (
  k STRING PRIMARY KEY,
  a STRING,
  UNIQUE (lower(k::STRING))
);

SET testing_optimizer_random_seed = 4774601380782706250;

SET testing_optimizer_disable_rule_probability = 0.500000;

SET vectorize = off;

SELECT 'a'
FROM t AS t1
JOIN t AS t2 ON t1.k = t2.a
WHERE false
GROUP BY t2.tableoid;

RESET vectorize;

SELECT 'a'
FROM t AS t1
JOIN t AS t2 ON t1.k = t2.a
WHERE false
GROUP BY t2.tableoid;

The second SELECT returns a single row, while the first returns no rows. Both should return no rows.

Both queries have the same query plan:

  project
   ├── columns: "?column?":11
   ├── cardinality: [0 - 0]
   ├── immutable
   ├── stats: [rows=0]
   ├── cost: 0.09
   ├── key: ()
   ├── fd: ()-->(11)
   ├── distribution: us-east1
   ├── group-by (streaming)
   │    ├── cardinality: [0 - 0]
   │    ├── immutable
   │    ├── stats: [rows=0]
   │    ├── cost: 0.08
   │    ├── key: ()
   │    ├── distribution: us-east1
   │    └── project
   │         ├── columns: tableoid:10
   │         ├── cardinality: [0 - 0]
   │         ├── immutable
   │         ├── stats: [rows=0]
   │         ├── cost: 0.06
   │         ├── key: ()
   │         ├── fd: ()-->(10)
   │         ├── distribution: us-east1
   │         └── inner-join (cross)
   │              ├── columns: k:1 a:2 crdb_internal_idx_expr:3 crdb_internal_mvcc_timestamp:4 tableoid:5 k:6 a:7 crdb_internal_idx_expr:8 crdb_internal_mvcc_timestamp:9 tableoid:10
   │              ├── cardinality: [0 - 0]
   │              ├── multiplicity: left-rows(zero-or-one), right-rows(zero-or-one)
   │              ├── immutable
   │              ├── stats: [rows=0]
   │              ├── cost: 0.05
   │              ├── key: ()
   │              ├── fd: ()-->(1-10)
   │              ├── distribution: us-east1
   │              ├── project
   │              │    ├── columns: crdb_internal_idx_expr:3 k:1 a:2 crdb_internal_mvcc_timestamp:4 tableoid:5
   │              │    ├── cardinality: [0 - 0]
   │              │    ├── immutable
   │              │    ├── stats: [rows=0]
   │              │    ├── cost: 0.02
   │              │    ├── key: ()
   │              │    ├── fd: ()-->(1-5)
   │              │    ├── distribution: us-east1
   │              │    ├── values
   │              │    │    ├── columns: k:1 a:2 crdb_internal_mvcc_timestamp:4 tableoid:5
   │              │    │    ├── cardinality: [0 - 0]
   │              │    │    ├── stats: [rows=0]
   │              │    │    ├── cost: 0.01
   │              │    │    ├── key: ()
   │              │    │    ├── fd: ()-->(1,2,4,5)
   │              │    │    └── distribution: us-east1
   │              │    └── projections
   │              │         └── lower(k:1) [as=crdb_internal_idx_expr:3, outer=(1), immutable]
   │              ├── project
   │              │    ├── columns: crdb_internal_idx_expr:8 k:6 a:7 crdb_internal_mvcc_timestamp:9 tableoid:10
   │              │    ├── cardinality: [0 - 0]
   │              │    ├── immutable
   │              │    ├── stats: [rows=0]
   │              │    ├── cost: 0.02
   │              │    ├── key: ()
   │              │    ├── fd: ()-->(6-10)
   │              │    ├── distribution: us-east1
   │              │    ├── values
   │              │    │    ├── columns: k:6 a:7 crdb_internal_mvcc_timestamp:9 tableoid:10
   │              │    │    ├── cardinality: [0 - 0]
   │              │    │    ├── stats: [rows=0]
   │              │    │    ├── cost: 0.01
   │              │    │    ├── key: ()
   │              │    │    ├── fd: ()-->(6,7,9,10)
   │              │    │    └── distribution: us-east1
   │              │    └── projections
   │              │         └── lower(k:6) [as=crdb_internal_idx_expr:8, outer=(6), immutable]
   │              └── filters (true)
   └── projections
        └── 'a' [as="?column?":11]

So, my guess it that there is a bug in the vectorized execution engine.

Jira issue: CRDB-19346

@mgartner mgartner added the C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. label Sep 6, 2022
@blathers-crl blathers-crl bot added the T-sql-queries SQL Queries Team label Sep 6, 2022
@mgartner
Copy link
Collaborator Author

mgartner commented Sep 6, 2022

The testing_optimizer_random_seed and testing_optimizer_disable_rule_probability settings are not supported in 22.1 or before, so it's not easy to tell if this is a regression or not. I've added the release blocker label for now until we can prove that it is present on older versions or not.

@mgartner mgartner added release-blocker Indicates a release-blocker. Use with branch-release-2x.x label to denote which branch is blocked. branch-release-22.2 Used to mark GA and release blockers, technical advisories, and bugs for 22.2 labels Sep 6, 2022
@michae2
Copy link
Collaborator

michae2 commented Sep 6, 2022

We might be able to determine if this is a regression without backporting those settings, once we know the cause.

@yuzefovich
Copy link
Member

This is a pre-existing bug, so it's not a release blocker, but I do have a fix.

@yuzefovich yuzefovich removed release-blocker Indicates a release-blocker. Use with branch-release-2x.x label to denote which branch is blocked. branch-release-22.2 Used to mark GA and release blockers, technical advisories, and bugs for 22.2 labels Sep 6, 2022
@craig craig bot closed this as completed in c29bfab Sep 6, 2022
@mgartner mgartner moved this to Done in SQL Queries Jul 24, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. T-sql-queries SQL Queries Team
Projects
Archived in project
Development

Successfully merging a pull request may close this issue.

3 participants