Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

opt: improve cardinality estimate for lookup join that follows an inverted join #55448

Open
Tracked by #59331
sumeerbhola opened this issue Oct 12, 2020 · 1 comment
Open
Tracked by #59331
Labels
C-performance Perf of queries or internals. Solution not expected to change functional behavior. T-sql-queries SQL Queries Team

Comments

@sumeerbhola
Copy link
Collaborator

sumeerbhola commented Oct 12, 2020

For an expression with an OR, the cardinality estimate for the lookup join does not realize that the inverted join is producing rows with the PK for the lookup, so the number of rows cannot increase. The following (for a table without stats) shows the rows increase from 10,000 to 333,333. If one changes the expression to ST_Intersects(rtable.geom, ltable.geom1) AND ST_DWithin(ltable.geom1, rtable.geom, 2), the rows do not increase. In this case, one needs to force usage of the inverted index otherwise the optimizer prefers a cross join.

EXPLAIN (OPT, VERBOSE)
SELECT lk, rk FROM ltable LEFT JOIN rtable@geom_index
ON ST_Intersects(rtable.geom, ltable.geom1) OR ST_DWithin(ltable.geom1, rtable.geom, 2) ORDER BY (lk, rk)
----
sort (segmented)
 ├── columns: lk:1 rk:6
 ├── immutable
 ├── stats: [rows=333333.333, distinct(1)=1000, null(1)=0]
 ├── cost: 168642.882
 ├── key: (1,6)
 ├── ordering: +1,+6
 ├── prune: (1,6)
 ├── interesting orderings: (+1) (+6)
 └── project
      ├── columns: lk:1 rk:6
      ├── immutable
      ├── stats: [rows=333333.333, distinct(1)=1000, null(1)=0]
      ├── cost: 106037.393
      ├── key: (1,6)
      ├── ordering: +1
      ├── prune: (1,6)
      ├── interesting orderings: (+1) (+6)
      └── left-join (lookup rtable)
           ├── columns: lk:1 geom1:2 rk:6 geom:7
           ├── key columns: [6] = [6]
           ├── lookup columns are key
           ├── immutable
           ├── stats: [rows=333333.333, distinct(1)=1000, null(1)=0, distinct(7)=100, null(7)=3333.33333]
           ├── cost: 102704.05
           ├── key: (1,6)
           ├── fd: (1)-->(2), (6)-->(7)
           ├── ordering: +1
           ├── interesting orderings: (+1) (+6)
           ├── left-join (inverted-lookup rtable@geom_index)
           │    ├── columns: lk:1 geom1:2 rk:6 continuation:12
           │    ├── inverted-expr
           │    │    └── st_intersects(geom1:2, geom:7) OR st_dwithin(geom1:2, geom:7, 2.0)
           │    ├── stats: [rows=10000, distinct(1)=1000, null(1)=0]
           │    ├── cost: 42004.03
           │    ├── key: (1,6)
           │    ├── fd: (1)-->(2), (6)-->(12)
           │    ├── ordering: +1
           │    ├── scan ltable
           │    │    ├── columns: lk:1 geom1:2
           │    │    ├── stats: [rows=1000, distinct(1)=1000, null(1)=0, distinct(2)=100, null(2)=10]
           │    │    ├── cost: 1104.02
           │    │    ├── key: (1)
           │    │    ├── fd: (1)-->(2)
           │    │    ├── ordering: +1
           │    │    ├── prune: (1,2)
           │    │    ├── interesting orderings: (+1)
           │    │    └── unfiltered-cols: (1-5)
           │    └── filters (true)
           └── filters
                └── st_intersects(geom:7, geom1:2) OR st_dwithin(geom1:2, geom:7, 2.0) [outer=(2,7), immutable, constraints=(/2: (/NULL - ]; /7: (/NULL - ])]

Epic CRDB-16930

Jira issue: CRDB-3663

Jira issue: CRDB-13902

@sumeerbhola sumeerbhola added the C-performance Perf of queries or internals. Solution not expected to change functional behavior. label Oct 12, 2020
@jlinder jlinder added the T-sql-queries SQL Queries Team label Jun 16, 2021
Copy link

We have marked this issue as stale because it has been inactive for
18 months. If this issue is still relevant, removing the stale label
or adding a comment will keep it active. Otherwise, we'll close it in
10 days to keep the issue queue tidy. Thank you for your contribution
to CockroachDB!

@rytaft rytaft removed their assignment Dec 18, 2023
@github-project-automation github-project-automation bot moved this to Triage in SQL Queries Dec 18, 2023
@rytaft rytaft moved this from Triage to New Backlog in SQL Queries Dec 18, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-performance Perf of queries or internals. Solution not expected to change functional behavior. T-sql-queries SQL Queries Team
Projects
Status: Backlog
Development

No branches or pull requests

3 participants