Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

opt: optimizer ignores NO_INDEX_JOIN in some cases #85841

Closed
rytaft opened this issue Aug 9, 2022 · 1 comment · Fixed by #85843
Closed

opt: optimizer ignores NO_INDEX_JOIN in some cases #85841

rytaft opened this issue Aug 9, 2022 · 1 comment · Fixed by #85843
Assignees
Labels
C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. T-sql-queries SQL Queries Team

Comments

@rytaft
Copy link
Collaborator

rytaft commented Aug 9, 2022

Describe the problem

In some cases, the optimizer may ignore a NO_INDEX_JOIN hint and create a query plan with an index join.

To Reproduce

This can be reproduced with ./cockroach demo:

> CREATE TABLE defg (
  d INT,
  e INT,
  f INT,
  g INT,
  INDEX dd (d),
  INDEX dfg (d, f, g),
  INDEX df (d, f)
);

> explain SELECT d, f, e FROM defg@{NO_INDEX_JOIN} ORDER BY d, f, e LIMIT 10;
                       info
---------------------------------------------------
  distribution: local
  vectorized: true

  • top-k
  │ order: +d,+f,+e
  │ k: 10
  │
  └── • index join
      │ table: defg@defg_pkey
      │
      └── • scan
            missing stats
            table: defg@df
            spans: FULL SCAN (SOFT LIMIT)

  index recommendations: 1
  1. type: index creation
     SQL command: CREATE INDEX ON defg (d, f, e);
(18 rows)

Expected behavior

If NO_INDEX_JOIN is specified, the optimizer should choose a plan that does not include an index join.

Environment:

  • CockroachDB 22.1
  • Server OS: MacOS
  • Client app cockroach demo

Jira issue: CRDB-18465

@rytaft rytaft added the C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. label Aug 9, 2022
@blathers-crl blathers-crl bot added the T-sql-queries SQL Queries Team label Aug 9, 2022
@rytaft rytaft self-assigned this Aug 9, 2022
@rytaft
Copy link
Collaborator Author

rytaft commented Aug 9, 2022

This is also a problem on 21.2 for other queries.

craig bot pushed a commit that referenced this issue Aug 10, 2022
84487: sql,csv: distinguish empty columns from quoted empty strings r=otan,dt a=rafiss

fixes #19743

The first commit is meant to backport. The second one maybe should not be backported.

Release note (bug fix): Previously, an empty column in the input to
COPY ... FROM CSV would be treated as an empty string. Now, this is
treated as NULL. The quoted empty string can still be used to input an
empty string, Similarly, if a different NULL token is specified in
the command options, it can be quoted in order to be treated as the
equivalent string value.

Release note (backward-incompatible change): If no `nullif` option is specified
while using IMPORT CSV, then a zero-length string in the input is now treated as
NULL. The quoted empty string in the input is treated as an empty string. Similarly,
if `nullif` is specified, then an unquoted value is treated as NULL, and a
quoted value is treated as that string. These changes were made to make IMPORT CSV
behave more similarly to COPY CSV.

If the previous behavior (i.e. treating either quoted or unquoted values
that match the `nullif` setting as NULL) is desired, then use the new
`allow_quoted_null` option in the IMPORT statement.

85773: sql/schemachanger/scgraph, scplan: fixed a bug when drawing dep graph r=Xiang-Gu a=Xiang-Gu

Previously, we define all stauses an element can be in in the
declarative schema changer in the scpb package. We removed one status
(TXN_DROPPED) previously from that list and leave its enum number as a
reserved number. However, some logic in scgraph incorrectly made the
assumption that all enum numbers are active and we can just iterate
from 0 to len(enum_list)-1 in order to iterate over all possible status,
part of the logic to draw the dep graph. This is problematic because as
we continue to add more status in that enum list, such way of iteration
will be incorrect to draw the dep graph. This PR fixes that.

This PR also spotted and fixed an panic recover bug where we forget to
correctly update the return error, causing a situation where if a panic
happens and the recover catches it, we will return with a nil error.

Release note (bug fix): Fixed a bug internal to drawing dependency
graph of a DDL statement under the declarative schema changer.

85781: sql/schemachanger/scexec: fixed a bug in executing validation operations r=Xiang-Gu a=Xiang-Gu

Previously, when we have a stage of validation opearations in the
declarative schema changer, we incorrectly only perform the first
validation operation and skip the rest. This is problematic because it's
quite possible for a stage to have >1 validation operations. This PR
fixes it.

In a future PR, if the number of validation operation starts to increase
significantly, we should employ the same 'visitor' pattern as we did for
the mutation operations. Currently, we simply have a 'switch' statement
for the two validation operations we support (validateUniqueIndex and
validateCheckConstraint).

Release note (bug fix): Fixed a bug where we incorrectly only handle
the first validation operation and skip the rest in a stage of
validation operations in the declarative schema changer.

85843: opt: respect NO_INDEX_JOIN flag r=rytaft a=rytaft

Prior to this commit, it was possible that the optimizer could
produce a plan with an index join even if the user hinted that
index joins should be avoided by using the `NO_INDEX_JOIN` hint. This
commit fixes that oversight, and we no longer plan an index join
in this case. This commit also adds assertions that an index join
is not planned if `NO_INDEX_JOIN` is used to prevent this bug from
recurring.

Fixes #85841

Release note (bug fix): Fixed an issue where the `NO_INDEX_JOIN`
hint could be ignored by the optimizer in some cases, causing it
to create a query plan with an index join.

Co-authored-by: Rafi Shamim <[email protected]>
Co-authored-by: Xiang Gu <[email protected]>
Co-authored-by: Rebecca Taft <[email protected]>
@craig craig bot closed this as completed in c3da919 Aug 10, 2022
blathers-crl bot pushed a commit that referenced this issue Aug 10, 2022
Prior to this commit, it was possible that the optimizer could
produce a plan with an index join even if the user hinted that
index joins should be avoided by using the NO_INDEX_JOIN hint. This
commit fixes that oversight, and we no longer plan an index join
in this case. This commit also adds assertions that an index join
is not planned if NO_INDEX_JOIN is used to prevent this bug from
recurring.

Fixes #85841

Release note (bug fix): Fixed an issue where the NO_INDEX_JOIN
hint could be ignored by the optimizer in some cases, causing it
to create a query plan with an index join.
rytaft added a commit to rytaft/cockroach that referenced this issue Aug 13, 2022
Prior to this commit, it was possible that the optimizer could
produce a plan with an index join even if the user hinted that
index joins should be avoided by using the NO_INDEX_JOIN hint. This
commit fixes that oversight, and we no longer plan an index join
in this case. This commit also adds assertions that an index join
is not planned if NO_INDEX_JOIN is used to prevent this bug from
recurring.

Fixes cockroachdb#85841

Release note (bug fix): Fixed an issue where the NO_INDEX_JOIN
hint could be ignored by the optimizer in some cases, causing it
to create a query plan with an index join.
rytaft added a commit to rytaft/cockroach that referenced this issue Aug 15, 2022
Prior to this commit, it was possible that the optimizer could
produce a plan with an index join even if the user hinted that
index joins should be avoided by using the NO_INDEX_JOIN hint. This
commit fixes that oversight, and we no longer plan an index join
in this case. This commit also adds assertions that an index join
is not planned if NO_INDEX_JOIN is used to prevent this bug from
recurring.

Fixes cockroachdb#85841

Release note (bug fix): Fixed an issue where the NO_INDEX_JOIN
hint could be ignored by the optimizer in some cases, causing it
to create a query plan with an index join.
@mgartner mgartner moved this to Done in SQL Queries Jul 24, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. T-sql-queries SQL Queries Team
Projects
Archived in project
Development

Successfully merging a pull request may close this issue.

1 participant