opt: optimizer ignores NO_INDEX_JOIN in some cases #85841

rytaft · 2022-08-09T19:33:11Z

Describe the problem

In some cases, the optimizer may ignore a NO_INDEX_JOIN hint and create a query plan with an index join.

To Reproduce

This can be reproduced with ./cockroach demo:

> CREATE TABLE defg (
  d INT,
  e INT,
  f INT,
  g INT,
  INDEX dd (d),
  INDEX dfg (d, f, g),
  INDEX df (d, f)
);

> explain SELECT d, f, e FROM defg@{NO_INDEX_JOIN} ORDER BY d, f, e LIMIT 10;
                       info
---------------------------------------------------
  distribution: local
  vectorized: true

  • top-k
  │ order: +d,+f,+e
  │ k: 10
  │
  └── • index join
      │ table: defg@defg_pkey
      │
      └── • scan
            missing stats
            table: defg@df
            spans: FULL SCAN (SOFT LIMIT)

  index recommendations: 1
  1. type: index creation
     SQL command: CREATE INDEX ON defg (d, f, e);
(18 rows)

Expected behavior

If NO_INDEX_JOIN is specified, the optimizer should choose a plan that does not include an index join.

Environment:

CockroachDB 22.1
Server OS: MacOS
Client app cockroach demo

Jira issue: CRDB-18465

The text was updated successfully, but these errors were encountered:

rytaft · 2022-08-09T19:35:30Z

This is also a problem on 21.2 for other queries.

84487: sql,csv: distinguish empty columns from quoted empty strings r=otan,dt a=rafiss fixes #19743 The first commit is meant to backport. The second one maybe should not be backported. Release note (bug fix): Previously, an empty column in the input to COPY ... FROM CSV would be treated as an empty string. Now, this is treated as NULL. The quoted empty string can still be used to input an empty string, Similarly, if a different NULL token is specified in the command options, it can be quoted in order to be treated as the equivalent string value. Release note (backward-incompatible change): If no `nullif` option is specified while using IMPORT CSV, then a zero-length string in the input is now treated as NULL. The quoted empty string in the input is treated as an empty string. Similarly, if `nullif` is specified, then an unquoted value is treated as NULL, and a quoted value is treated as that string. These changes were made to make IMPORT CSV behave more similarly to COPY CSV. If the previous behavior (i.e. treating either quoted or unquoted values that match the `nullif` setting as NULL) is desired, then use the new `allow_quoted_null` option in the IMPORT statement. 85773: sql/schemachanger/scgraph, scplan: fixed a bug when drawing dep graph r=Xiang-Gu a=Xiang-Gu Previously, we define all stauses an element can be in in the declarative schema changer in the scpb package. We removed one status (TXN_DROPPED) previously from that list and leave its enum number as a reserved number. However, some logic in scgraph incorrectly made the assumption that all enum numbers are active and we can just iterate from 0 to len(enum_list)-1 in order to iterate over all possible status, part of the logic to draw the dep graph. This is problematic because as we continue to add more status in that enum list, such way of iteration will be incorrect to draw the dep graph. This PR fixes that. This PR also spotted and fixed an panic recover bug where we forget to correctly update the return error, causing a situation where if a panic happens and the recover catches it, we will return with a nil error. Release note (bug fix): Fixed a bug internal to drawing dependency graph of a DDL statement under the declarative schema changer. 85781: sql/schemachanger/scexec: fixed a bug in executing validation operations r=Xiang-Gu a=Xiang-Gu Previously, when we have a stage of validation opearations in the declarative schema changer, we incorrectly only perform the first validation operation and skip the rest. This is problematic because it's quite possible for a stage to have >1 validation operations. This PR fixes it. In a future PR, if the number of validation operation starts to increase significantly, we should employ the same 'visitor' pattern as we did for the mutation operations. Currently, we simply have a 'switch' statement for the two validation operations we support (validateUniqueIndex and validateCheckConstraint). Release note (bug fix): Fixed a bug where we incorrectly only handle the first validation operation and skip the rest in a stage of validation operations in the declarative schema changer. 85843: opt: respect NO_INDEX_JOIN flag r=rytaft a=rytaft Prior to this commit, it was possible that the optimizer could produce a plan with an index join even if the user hinted that index joins should be avoided by using the `NO_INDEX_JOIN` hint. This commit fixes that oversight, and we no longer plan an index join in this case. This commit also adds assertions that an index join is not planned if `NO_INDEX_JOIN` is used to prevent this bug from recurring. Fixes #85841 Release note (bug fix): Fixed an issue where the `NO_INDEX_JOIN` hint could be ignored by the optimizer in some cases, causing it to create a query plan with an index join. Co-authored-by: Rafi Shamim <[email protected]> Co-authored-by: Xiang Gu <[email protected]> Co-authored-by: Rebecca Taft <[email protected]>

Prior to this commit, it was possible that the optimizer could produce a plan with an index join even if the user hinted that index joins should be avoided by using the NO_INDEX_JOIN hint. This commit fixes that oversight, and we no longer plan an index join in this case. This commit also adds assertions that an index join is not planned if NO_INDEX_JOIN is used to prevent this bug from recurring. Fixes #85841 Release note (bug fix): Fixed an issue where the NO_INDEX_JOIN hint could be ignored by the optimizer in some cases, causing it to create a query plan with an index join.

Prior to this commit, it was possible that the optimizer could produce a plan with an index join even if the user hinted that index joins should be avoided by using the NO_INDEX_JOIN hint. This commit fixes that oversight, and we no longer plan an index join in this case. This commit also adds assertions that an index join is not planned if NO_INDEX_JOIN is used to prevent this bug from recurring. Fixes cockroachdb#85841 Release note (bug fix): Fixed an issue where the NO_INDEX_JOIN hint could be ignored by the optimizer in some cases, causing it to create a query plan with an index join.

rytaft added the C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. label Aug 9, 2022

blathers-crl bot added the T-sql-queries SQL Queries Team label Aug 9, 2022

rytaft self-assigned this Aug 9, 2022

rytaft mentioned this issue Aug 9, 2022

opt: respect NO_INDEX_JOIN flag #85843

Merged

craig bot closed this as completed in c3da919 Aug 10, 2022

blathers-crl bot mentioned this issue Aug 10, 2022

release-22.1: opt: respect NO_INDEX_JOIN flag #85917

Merged

rytaft mentioned this issue Aug 13, 2022

release-21.2: opt: respect NO_INDEX_JOIN flag #86089

Merged

mgartner added this to SQL Queries Jul 24, 2023

mgartner moved this to Done in SQL Queries Jul 24, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

opt: optimizer ignores NO_INDEX_JOIN in some cases #85841

opt: optimizer ignores NO_INDEX_JOIN in some cases #85841

rytaft commented Aug 9, 2022 •

edited

Loading

rytaft commented Aug 9, 2022

opt: optimizer ignores NO_INDEX_JOIN in some cases #85841

opt: optimizer ignores NO_INDEX_JOIN in some cases #85841

Comments

rytaft commented Aug 9, 2022 • edited Loading

rytaft commented Aug 9, 2022

rytaft commented Aug 9, 2022 •

edited

Loading