Skip to content

Commit

Permalink
Browse files Browse the repository at this point in the history
97151: opt: fix incorrect results from zigzag join with different index directions r=rytaft a=rytaft

Prior to this commit, the optimizer could plan a zigzag join between two indexes where the matching column(s) had ascending values in one index and descending values in the other. This could cause incorrect results, because it broke an assumption of the `zigzagJoiner`.

For example, consider this table and query:
```
CREATE TABLE t (
  c INT NOT NULL,
  l INT NOT NULL,
  r INT NOT NULL,
  INDEX (l ASC, c DESC),
  INDEX (r ASC, c ASC)
);
INSERT INTO t VALUES (1, 1, -1), (2, 1, -2);
SELECT * FROM t@{FORCE_ZIGZAG} WHERE l = 1 AND r = -1;
```
This query should produce 1 row, but prior to this commit it produced 0 rows. This was due to the fact that the direction of `c` was different between the two indexes.

This commit fixes the issue by preventing the optimizer from planning a zigzag join in such cases.

Fixes cockroachdb#97090

Release note (bug fix): Fixed a bug in the query engine that could cause incorrect results in some cases when a zigzag join was planned. The bug could occur when the two indexes used for the zigzag join had a suffix of matching columns but with different directions. For example, planning a zigzag join with `INDEX(a ASC, b ASC)` and `INDEX(c ASC, b DESC)` could cause incorrect results. This bug has existed since at least v19.1. It is now fixed, because the optimizer will no longer plan a zigzag join in such cases.

97435: tree: fix DatumPrev for collated string r=yuzefovich a=yuzefovich

This commit removes the support of collated strings from `DatumPrev`. As it turns out, the comparison of collated strings is done by comparing their collation keys, and the current method of generating the previous string (simply subtracting 1 from the last non-zero byte) doesn't work. Similar bug in `DatumNext` was fixed in
1902d3d (which added the collated strings into randomly generated types), but I didn't stress the test to catch this bug. The impact of the bug is minor since it's only used in `debug statement-bundle recreate` command.

Epic: None

Release note: None

Co-authored-by: Rebecca Taft <[email protected]>
Co-authored-by: Yahor Yuzefovich <[email protected]>
  • Loading branch information
3 people committed Feb 22, 2023
3 parents fb1b2ec + f54f357 + 7af77d4 commit ba65327
Show file tree
Hide file tree
Showing 5 changed files with 69 additions and 12 deletions.
18 changes: 18 additions & 0 deletions pkg/sql/logictest/testdata/logic_test/zigzag_join
Original file line number Diff line number Diff line change
Expand Up @@ -340,3 +340,21 @@ CREATE TABLE t71271(a INT, b INT, c INT, d INT, INDEX (c), INDEX (d))

statement ok
SELECT d FROM t71271 WHERE c = 3 AND d = 4

# Regression test for #97090. We should not plan a zigzag join when the
# direction of the equality columns doesn't match.
statement ok
CREATE TABLE t97090 (
c INT NOT NULL,
l INT NOT NULL,
r INT NOT NULL,
INDEX (l ASC, c DESC),
INDEX (r ASC, c ASC)
);
INSERT INTO t97090 VALUES (1, 1, -1), (2, 1, -2)

# This query should return one row.
query III
SELECT * FROM t97090 WHERE l = 1 AND r = -1
----
1 1 -1
14 changes: 14 additions & 0 deletions pkg/sql/opt/exec/execbuilder/testdata/zigzag_join
Original file line number Diff line number Diff line change
Expand Up @@ -74,3 +74,17 @@ vectorized: true
right table: t71271@t71271_d_idx
right columns: (d, rowid)
right fixed values: 1 column

# Regression test for #97090. We should not plan a zigzag join when the
# direction of the equality columns doesn't match.
statement ok
CREATE TABLE t97090 (
c INT NOT NULL,
l INT NOT NULL,
r INT NOT NULL,
INDEX (l ASC, c DESC),
INDEX (r ASC, c ASC)
)

statement error could not produce a query plan conforming to the FORCE_ZIGZAG hint
EXPLAIN SELECT * FROM t97090@{FORCE_ZIGZAG} WHERE l = 1 AND r = -1
6 changes: 6 additions & 0 deletions pkg/sql/opt/xform/select_funcs.go
Original file line number Diff line number Diff line change
Expand Up @@ -1345,6 +1345,12 @@ func eqColsForZigzag(
}

for i < leftCnt && j < rightCnt {
// The zigzag joiner cannot handle equality columns that are not in the
// same direction. See #97090.
if leftIndex.Column(i).Descending != rightIndex.Column(j).Descending {
break
}

leftColID := tabID.IndexColumnID(leftIndex, i)
rightColID := tabID.IndexColumnID(rightIndex, j)
i++
Expand Down
29 changes: 29 additions & 0 deletions pkg/sql/opt/xform/testdata/rules/select
Original file line number Diff line number Diff line change
Expand Up @@ -6971,6 +6971,35 @@ inner-join (zigzag pqr@q pqr@s)
├── r:3 = 1 [outer=(3), constraints=(/3: [/1 - /1]; tight), fd=()-->(3)]
└── s:4 = 'foo' [outer=(4), constraints=(/4: [/'foo' - /'foo']; tight), fd=()-->(4)]

# Regression test for #97090. We should not plan a zigzag join when the
# direction of the equality columns doesn't match.
exec-ddl
CREATE TABLE t97090 (
c INT NOT NULL,
l INT NOT NULL,
r INT NOT NULL,
INDEX (l ASC, c DESC),
INDEX (r ASC, c ASC)
)
----

opt expect-not=GenerateZigzagJoins
SELECT * FROM t97090 WHERE l = 1 AND r = -1
----
select
├── columns: c:1!null l:2!null r:3!null
├── fd: ()-->(2,3)
├── index-join t97090
│ ├── columns: c:1!null l:2!null r:3!null
│ ├── fd: ()-->(2)
│ └── scan t97090@t97090_l_c_idx
│ ├── columns: c:1!null l:2!null rowid:4!null
│ ├── constraint: /2/-1/4: [/1 - /1]
│ ├── key: (4)
│ └── fd: ()-->(2), (4)-->(1)
└── filters
└── r:3 = -1 [outer=(3), constraints=(/3: [/-1 - /-1]; tight), fd=()-->(3)]

# Zigzag join should not be produced when there is a NO_ZIGZAG_JOIN hint.
opt expect-not=GenerateZigzagJoins
SELECT q,r FROM pqr@{NO_ZIGZAG_JOIN} WHERE q = 1 AND r = 2
Expand Down
14 changes: 2 additions & 12 deletions pkg/sql/sem/tree/datum.go
Original file line number Diff line number Diff line change
Expand Up @@ -6144,16 +6144,6 @@ func DatumPrev(
return nil, false
}
return NewDString(prev), true
case *DCollatedString:
prev, ok := prevString(d.Contents)
if !ok {
return nil, false
}
c, err := NewDCollatedString(prev, d.Locale, collationEnv)
if err != nil {
return nil, false
}
return c, true
case *DBytes:
prev, ok := prevString(string(*d))
if !ok {
Expand All @@ -6166,8 +6156,8 @@ func DatumPrev(
return NewDInterval(prev, types.DefaultIntervalTypeMetadata), true
default:
// TODO(yuzefovich): consider adding support for other datums that don't
// have Datum.Prev implementation (DBitArray, DGeography, DGeometry,
// DBox2D, DJSON, DArray).
// have Datum.Prev implementation (DCollatedString, DBitArray,
// DGeography, DGeometry, DBox2D, DJSON, DArray).
return datum.Prev(cmpCtx)
}
}
Expand Down

0 comments on commit ba65327

Please sign in to comment.