Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sql: internal error: lookup join with no lookup columns #87306

Closed
yuzefovich opened this issue Sep 1, 2022 · 5 comments · Fixed by #88491
Closed

sql: internal error: lookup join with no lookup columns #87306

yuzefovich opened this issue Sep 1, 2022 · 5 comments · Fixed by #88491
Assignees
Labels
branch-master Failures and bugs on the master branch. C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. T-sql-queries SQL Queries Team

Comments

@yuzefovich
Copy link
Member

yuzefovich commented Sep 1, 2022

When running SQLite logic tests with --crdb_test flag (which is currently not added in CI due to #58089) with

./dev testlogic sqlite --ignore-cache --config=local --files=slt_good_22 -- --test_env=COCKROACH_RANDOM_SEED='-6676986672122001987'
        /private/var/tmp/_bazel_yahor/e311b1c2454b7c301a6b73bddff450e7/sandbox/darwin-sandbox/2421/execroot/com_github_cockroachdb_cockroach/bazel-out/darwin-fastbuild/bin/pkg/sql/sqlitelogictest/tests/local/local_test_/local_test.runfiles/com_github_cockroachdb_cockroach/external/com_github_cockroachdb_sqllogictest/test/index/commute/10/slt_good_22.test:14596: SELECT pk FROM tab3 WHERE (col3 < 12 OR col0 BETWEEN 5 AND 30 AND (col1 >= 76.11) OR col3 = 95 OR col4 < 73.16 AND col1 IN (SELECT col4 FROM tab3 WHERE (col1 = 29.20 OR (((col0 > 3 AND ((col4 IS NULL AND (col3 IN (36,23,85,48,2,81)))))) OR col3 <= 2) AND col3 > 91 AND (col3 <= 54))) OR (col0 = 56) OR ((col4 <= 59.52)) OR col4 IN (29.33,64.35))
        expected success, but found
        (XX000) internal error: lookup join with no lookup columns
        check_expr.go:225: in CheckExpr()
        DETAIL: stack trace:
        github.com/cockroachdb/cockroach/pkg/sql/opt/memo/check_expr.go:225: CheckExpr()
        github.com/cockroachdb/cockroach/bazel-out/darwin-fastbuild/bin/pkg/sql/opt/memo/expr.og.go:24495: AddLookupJoinToGroup()
        github.com/cockroachdb/cockroach/pkg/sql/opt/xform/join_funcs.go:667: func1()
        github.com/cockroachdb/cockroach/pkg/sql/opt/xform/scan_index_iter.go:306: ForEachStartingAfter()
        github.com/cockroachdb/cockroach/pkg/sql/opt/xform/scan_index_iter.go:209: ForEach()
        github.com/cockroachdb/cockroach/pkg/sql/opt/xform/join_funcs.go:392: generateLookupJoinsImpl()
        github.com/cockroachdb/cockroach/pkg/sql/opt/xform/join_funcs.go:260: GenerateLookupJoins()
        github.com/cockroachdb/cockroach/bazel-out/darwin-fastbuild/bin/pkg/sql/opt/xform/explorer.og.go:1967: exploreSemiJoin()
        github.com/cockroachdb/cockroach/bazel-out/darwin-fastbuild/bin/pkg/sql/opt/xform/explorer.og.go:34: exploreGroupMember()
        github.com/cockroachdb/cockroach/pkg/sql/opt/xform/explorer.go:182: exploreGroup()
        github.com/cockroachdb/cockroach/pkg/sql/opt/xform/optimizer.go:529: optimizeGroup()
        github.com/cockroachdb/cockroach/pkg/sql/opt/xform/optimizer.go:291: optimizeExpr()
        github.com/cockroachdb/cockroach/pkg/sql/opt/xform/optimizer.go:571: optimizeGroupMember()
        github.com/cockroachdb/cockroach/pkg/sql/opt/xform/optimizer.go:516: optimizeGroup()
        github.com/cockroachdb/cockroach/pkg/sql/opt/xform/optimizer.go:291: optimizeExpr()
        github.com/cockroachdb/cockroach/pkg/sql/opt/xform/optimizer.go:571: optimizeGroupMember()
        github.com/cockroachdb/cockroach/pkg/sql/opt/xform/optimizer.go:516: optimizeGroup()
        github.com/cockroachdb/cockroach/pkg/sql/opt/xform/optimizer.go:291: optimizeExpr()
        github.com/cockroachdb/cockroach/pkg/sql/opt/xform/optimizer.go:571: optimizeGroupMember()
        github.com/cockroachdb/cockroach/pkg/sql/opt/xform/optimizer.go:516: optimizeGroup()
        github.com/cockroachdb/cockroach/pkg/sql/opt/xform/optimizer.go:291: optimizeExpr()
        github.com/cockroachdb/cockroach/pkg/sql/opt/xform/optimizer.go:571: optimizeGroupMember()
        github.com/cockroachdb/cockroach/pkg/sql/opt/xform/optimizer.go:516: optimizeGroup()
        github.com/cockroachdb/cockroach/pkg/sql/opt/xform/optimizer.go:291: optimizeExpr()
        github.com/cockroachdb/cockroach/pkg/sql/opt/xform/optimizer.go:571: optimizeGroupMember()
        github.com/cockroachdb/cockroach/pkg/sql/opt/xform/optimizer.go:516: optimizeGroup()
        github.com/cockroachdb/cockroach/pkg/sql/opt/xform/optimizer.go:291: optimizeExpr()
        github.com/cockroachdb/cockroach/pkg/sql/opt/xform/optimizer.go:571: optimizeGroupMember()
        github.com/cockroachdb/cockroach/pkg/sql/opt/xform/optimizer.go:516: optimizeGroup()
        github.com/cockroachdb/cockroach/pkg/sql/opt/xform/optimizer.go:291: optimizeExpr()
        github.com/cockroachdb/cockroach/pkg/sql/opt/xform/optimizer.go:571: optimizeGroupMember()
        github.com/cockroachdb/cockroach/pkg/sql/opt/xform/optimizer.go:516: optimizeGroup()

seems likely it's related to recent work by @mgartner, so I'm adding a release blocker label.

Jira issue: CRDB-19263

@yuzefovich yuzefovich added C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. branch-master Failures and bugs on the master branch. release-blocker Indicates a release-blocker. Use with branch-release-2x.x label to denote which branch is blocked. labels Sep 1, 2022
@blathers-crl blathers-crl bot added the T-sql-queries SQL Queries Team label Sep 1, 2022
@mgartner
Copy link
Collaborator

mgartner commented Sep 6, 2022

Reduced reproduction:

statement ok
CREATE TABLE t (
  a INT,
  b INT,
  c INT,
  INDEX (a, c)
);

statement ok
SELECT NULL
FROM t
WHERE a IN (
  SELECT c FROM t
  WHERE a = 0 OR b IN (0) AND b > 0
)

@mgartner
Copy link
Collaborator

mgartner commented Sep 7, 2022

This occurs on v22.1.6, so this is not a release blocker.

@mgartner mgartner removed the release-blocker Indicates a release-blocker. Use with branch-release-2x.x label to denote which branch is blocked. label Sep 7, 2022
@mgartner
Copy link
Collaborator

The problem is that we incorrectly determine that an index used for a lookup join is non-covering when a column in the ON filter is not in the index and conditions on that column form a contradiction that was not eliminated by normalization rules in the canonical expression. This causes us to unnecessarily plan a paired joiner to fetch the column from the primary index. As far as I can tell, there is no correctness issue here, but it does create inefficient plans with paired joiners that could be planned with just a single lookup join.

As an example, consider the reproduction above. The secondary index, INDEX (a, c), cannot provide column b so a paired joiner is planned to fetch b from the primary index after performing the lookup join with the secondary index. However, we don't actually need bb IN (0) AND b > 0 is a contradiction. The lookup join logic assumes that we do need b because this contradiction hasn't been eliminated before the lookup join is generated (our contradiction detection relies on constraints which have difficulty fully describing disjunctions; in other words, our rules can't see the contradiction inside the OR to eliminate it). The contradiction is eliminated once the a = 0 filter is removed from the ON condition (this part of the filter is replaced with a projection of a 0 value column and lookup join key columns). We're left with a paired joiner that fetches no columns - hence the check_expr.go assertion failure.

@yuzefovich
Copy link
Member Author

Is there some easy way to adjust check_expr.go to not fire until this issue is fixed so that #58089 could be resolved?

@mgartner
Copy link
Collaborator

This is easier to fix than I originally thought. Will have a PR up momentarily.

mgartner added a commit to mgartner/cockroach that referenced this issue Sep 22, 2022
This commit fixes an issue where the optimizer would plan a paired semi
or anti lookup join in cases when a single lookup join would suffice.
This only occurred in rare cases when the join filter contained a
tautology or contradiction that could not be normalized to true or false
in the canonical query plan, but could be eliminated from the filters
when building a lookup join. If the tautology or contradiction
referenced a column not covered by the lookup index, the optimizer
mistakenly assumed that the index was not covering and planned a paired
join. Now the optimizer can recognize that the index is actually
covering, because the referenced column is not needed to evaluate the
filters, and a single lookup join is planned.

Fixes cockroachdb#87306

Release note (performance improvement): The optimizer now explores a
plans with a single lookup join expressions in rare cases where it
previously planned two lookup join expressions.
craig bot pushed a commit that referenced this issue Sep 27, 2022
88076: admission: Split stats for WorkQueueMetrics r=irfansharif a=andrewbaptist

This commit splits the WorkQueueMetric stats into priority.
Because there are a lot of new stats that might be generated
as a result of this splitting, only the "important" priorities
are collected and considered for each queue.

Release note: None

88289: cdc: avoid deadlock on error in pubsub sink r=HonoreDB a=HonoreDB

#88130 introduced a deadlock when an attempt to create a
topic fails -- the goroutine tries to acquire a lock in order to record the error, but it already has it in order to write to the map. This PR releases the lock while creating the topic, which should also help with performance a bit on startup.

Fixes #85374

Release note (bug fix): Fixed a bug preventing pubsub changefeeds from retrying.

88491: opt: do not plan unnecessary paired semi- and anti- lookup joins r=mgartner a=mgartner

This commit fixes an issue where the optimizer would plan a paired semi
or anti lookup join in cases when a single lookup join would suffice.
This only occurred in rare cases when the join filter contained a
tautology or contradiction that could not be normalized to true or false
in the canonical query plan, but could be eliminated from the filters
when building a lookup join. If the tautology or contradiction
referenced a column not covered by the lookup index, the optimizer
mistakenly assumed that the index was not covering and planned a paired
join. Now the optimizer can recognize that the index is actually
covering, because the referenced column is not needed to evaluate the
filters, and a single lookup join is planned.

Fixes #87306

Release note (performance improvement): The optimizer now explores plans
with a single lookup join expressions in rare cases where it previously
planned two lookup join expressions.


88714: ttl: test multiple TTL storage param changes r=rafiss,ajwerner a=ecwall

fixes #88560

Add tests to cover multiple changes to TTL storage params.

Release note: None

88756: sql: only record index reads on non-internal sessions r=THardy98 a=THardy98

Resolves: #77598

Previously, explicit `CREATE INDEX` queries were causing unintended index reads in our `index_usage_statistics` subsystem. The unintended index reads were caused by an index validation query from an internal executor that did not use an internal planner. However, the internal executor does set its session data as internal. This change adds a check to only record index reads when the planner is not internal AND the session is not internal.

Release note (bug fix): Fixed unintended recordings of index reads caused by internal executor/queries.

88773: sql: support `SHOW GRANTS` for functions r=chengxiong-ruan a=chengxiong-ruan

Backport resolves #88495

Release note (sql change): Previously `SHOW GRANTS` only supports
db, schema, table and types. This commit add supports for UDFs,
so that `SHOW GRANTS` returns UDFs privileges infos, and statements
like `SHOW GRANTS ON FUNCTION <udf name/signatures>` are now supported
Full function signature must be provided if the function name is
not unique.
Release justification: low risk GA blocker.

88831: bench: add a benchmark for INSERT RETURNING r=yuzefovich a=yuzefovich

Informs #88525.

Release note: None

Co-authored-by: Andrew Baptist <[email protected]>
Co-authored-by: Aaron Zinger <[email protected]>
Co-authored-by: Marcus Gartner <[email protected]>
Co-authored-by: Evan Wall <[email protected]>
Co-authored-by: Thomas Hardy <[email protected]>
Co-authored-by: Chengxiong Ruan <[email protected]>
Co-authored-by: Yahor Yuzefovich <[email protected]>
@craig craig bot closed this as completed in 7af0002 Sep 27, 2022
blathers-crl bot pushed a commit that referenced this issue Sep 27, 2022
This commit fixes an issue where the optimizer would plan a paired semi
or anti lookup join in cases when a single lookup join would suffice.
This only occurred in rare cases when the join filter contained a
tautology or contradiction that could not be normalized to true or false
in the canonical query plan, but could be eliminated from the filters
when building a lookup join. If the tautology or contradiction
referenced a column not covered by the lookup index, the optimizer
mistakenly assumed that the index was not covering and planned a paired
join. Now the optimizer can recognize that the index is actually
covering, because the referenced column is not needed to evaluate the
filters, and a single lookup join is planned.

Fixes #87306

Release note (performance improvement): The optimizer now explores plans
with a single lookup join expressions in rare cases where it previously
planned two lookup join expressions.
@mgartner mgartner moved this to Done in SQL Queries Jul 24, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
branch-master Failures and bugs on the master branch. C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. T-sql-queries SQL Queries Team
Projects
Archived in project
Development

Successfully merging a pull request may close this issue.

2 participants