opt: choosing suboptimal plan on TPCH query 6 after stats collected on `lineitem` #46677

yuzefovich · 2020-03-27T16:06:56Z

I've been digging into a possible performance regression with vectorized engine here and noticed that the query runtimes significantly increase after the collection of statistics on lineitem table. I confirmed that it is both present on master and 19.2.

On a 3 node cluster on my laptop:
before stats are collected:

Yahors-MacBook-Pro:cockroach yuzefovich$ ./bin/workload run tpch --concurrency=1 --db=tpch --max-ops=5 --queries=6 --vectorize=on
I200327 15:57:48.821707 67 workload/tpch/tpch.go:465  [q6] returned 1 rows after 2.35 seconds
I200327 15:57:50.937833 67 workload/tpch/tpch.go:465  [q6] returned 1 rows after 2.12 seconds
I200327 15:57:53.020603 67 workload/tpch/tpch.go:465  [q6] returned 1 rows after 2.08 seconds
I200327 15:57:55.188724 67 workload/tpch/tpch.go:465  [q6] returned 1 rows after 2.17 seconds
I200327 15:57:57.314714 67 workload/tpch/tpch.go:465  [q6] returned 1 rows after 2.13 seconds

after stats are collected:

Yahors-MacBook-Pro:cockroach yuzefovich$ ./bin/workload run tpch --concurrency=1 --db=tpch --max-ops=5 --queries=6 --vectorize=on
I200327 15:58:59.760006 45 workload/tpch/tpch.go:465  [q6] returned 1 rows after 11.99 seconds
I200327 15:59:11.756195 45 workload/tpch/tpch.go:465  [q6] returned 1 rows after 12.00 seconds
I200327 15:59:23.947457 45 workload/tpch/tpch.go:465  [q6] returned 1 rows after 12.19 seconds
I200327 15:59:36.167843 45 workload/tpch/tpch.go:465  [q6] returned 1 rows after 12.22 seconds
I200327 15:59:48.192814 45 workload/tpch/tpch.go:465  [q6] returned 1 rows after 12.02 seconds

Plan before:

  group          |             |                                                                                                                                               
   │             | aggregate 0 | sum(column17)                                                                                                                                 
   │             | scalar      |                                                                                                                                               
   └── render    |             |                                                                                                                                               
        └── scan |             |                                                                                                                                               
                 | table       | lineitem@primary                                                                                                                              
                 | spans       | ALL                                                                                                                                           
                 | filter      | (((l_discount >= 0.05) AND (l_discount <= 0.07)) AND ((l_shipdate >= '1994-01-01') AND (l_shipdate < '1995-01-01'))) AND (l_quantity < 24.0)  
(10 rows)

After:

  group                     |             |                                                                          
   │                        | aggregate 0 | sum(column17)                                                            
   │                        | scalar      |                                                                          
   └── render               |             |                                                                          
        └── filter          |             |                                                                          
             │              | filter      | ((l_discount >= 0.05) AND (l_discount <= 0.07)) AND (l_quantity < 24.0)  
             └── index-join |             |                                                                          
                  │         | table       | lineitem@primary                                                         
                  └── scan  |             |                                                                          
                            | table       | lineitem@l_sd                                                            
                            | spans       | /8766-/9131

This issue can be reproduced on a single node local cluster by restoring TPCH dataset for Scale factor of 1. Please let me know if any more info is needed.

Jira issue: CRDB-5069

The text was updated successfully, but these errors were encountered:

RaduBerinde · 2020-03-28T02:45:07Z

If you still have the setup, could you run the same experiment with vectorized off? Trying to ascertain whether inaccurate costing with vectorized plays a part.

yuzefovich · 2020-03-28T02:56:31Z

Sure. I reran the query on 3 node cluster on my laptop.

Without stats

Yahors-MacBook-Pro:cockroach yuzefovich$ ./bin/workload run tpch --concurrency=1 --db=tpch --max-ops=3 --queries=6 --vectorize=off
I200328 02:54:03.882375 44 workload/tpch/tpch.go:465  [q6] returned 1 rows after 6.24 seconds
I200328 02:54:10.124262 44 workload/tpch/tpch.go:465  [q6] returned 1 rows after 6.24 seconds
I200328 02:54:16.354978 44 workload/tpch/tpch.go:465  [q6] returned 1 rows after 6.23 seconds

With stats

Yahors-MacBook-Pro:cockroach yuzefovich$ ./bin/workload run tpch --concurrency=1 --db=tpch --max-ops=3 --queries=6 --vectorize=off
I200328 02:52:47.072605 42 workload/tpch/tpch.go:465  [q6] returned 1 rows after 12.17 seconds
I200328 02:52:58.714868 42 workload/tpch/tpch.go:465  [q6] returned 1 rows after 11.64 seconds
I200328 02:53:10.525273 42 workload/tpch/tpch.go:465  [q6] returned 1 rows after 11.81 seconds

RaduBerinde · 2020-03-28T03:17:58Z

Thanks, so it looks like vectorized is responsible for a 3x improvement in the former case, and the optimizer doesn't know about that. But we're still 2x slower with stats though which is unfortunate. From our stats_quality tests, we seem to do a very good job at estimating the number of joined rows. It might be a problem with costing scans, where we think the primary scan is too expensive because we have a lot of columns (#46560).

Indeed, I added the query with a @primary hint to xform/external/tpch. The current (index join) plan:

scalar-group-by
 ├── cost: 4954408.53
 ├── project
 │    ├── cost: 4954061.04
 │    ├── select
 │    │    ├── cost: 4953366.12
 │    │    ├── index-join lineitem
 │    │    │    ├── cost: 4943984.73
 │    │    │    └── scan lineitem@l_sd
 │    │    │         ├── constraint: /11/1/4: [/'1994-01-01' - /'1994-12-31']
 │    │    │         └── cost: 994425.777
 │    │    └── filters
 │    │         ├── (l_discount >= 0.05) AND (l_discount <= 0.07)
 │    │         └── l_quantity < 24.0
 │    └── projections
 │         └── l_extendedprice * l_discount
 └── aggregations
      └── sum
           └── column17

No index join:

scalar-group-by
 ├── cost: 7262512.6
 ├── project
 │    ├── cost: 7262165.11
 │    ├── select
 │    │    ├── cost: 7261470.18
 │    │    ├── scan lineitem
 │    │    │    ├── flags: force-index=primary
 │    │    │    └── cost: 7201458.02
 │    │    └── filters
 │    │         ├── (l_discount >= 0.05) AND (l_discount <= 0.07)
 │    │         ├── (l_shipdate >= '1994-01-01') AND (l_shipdate < '1995-01-01')
 │    │         └── l_quantity < 24.0
 │    └── projections
 │         └── l_extendedprice * l_discount
 └── aggregations
      └── sum
           └── column17

I think the cost for a full primary scan is exaggerated here. I think the best course of action is to work on #46560 and then revisit and check if it fixes the problem.

rytaft · 2020-08-07T12:29:04Z

@yuzefovich Is this still an issue? I know you tested TPC-H recently and got good results thanks to join ordering and execution improvements.

yuzefovich · 2020-08-10T19:32:07Z

@rytaft this issue still appears to be present:

no stats

Yahors-MacBook-Pro:cockroach yuzefovich$ ./bin/workload run tpch --concurrency=1 --db=tpch --max-ops=3 --queries=6
I200810 19:27:24.389685 1 workload/cli/run.go:356  creating load generator...
I200810 19:27:24.390168 1 workload/cli/run.go:387  creating load generator... done (took 487µs)
I200810 19:27:30.340320 67 workload/tpch/tpch.go:479  [q6] returned 1 rows after 5.95 seconds
I200810 19:27:36.261581 67 workload/tpch/tpch.go:479  [q6] returned 1 rows after 5.92 seconds
I200810 19:27:42.276520 67 workload/tpch/tpch.go:479  [q6] returned 1 rows after 6.01 seconds

with stats

Yahors-MacBook-Pro:cockroach yuzefovich$ ./bin/workload run tpch --concurrency=1 --db=tpch --max-ops=3 --queries=6
I200810 19:30:16.209656 1 workload/cli/run.go:356  creating load generator...
I200810 19:30:16.210142 1 workload/cli/run.go:387  creating load generator... done (took 489µs)
I200810 19:30:25.707702 36 workload/tpch/tpch.go:479  [q6] returned 1 rows after 9.50 seconds
I200810 19:30:35.166743 36 workload/tpch/tpch.go:479  [q6] returned 1 rows after 9.46 seconds
I200810 19:30:44.701479 36 workload/tpch/tpch.go:479  [q6] returned 1 rows after 9.53 seconds

rytaft · 2020-09-21T20:57:44Z

I did some more investigation here. It seems like our scan costing is actually not too bad. For the secondary index scan, we estimate a cost of 981416, while for the primary index scan, we estimate a cost of 7201462 (~7x larger). Based on the EXPLAIN ANALYZE output, the secondary index scan takes 532 ms, while the primary index scan takes 6.52 s (~12x larger). If anything, the cost ratio favors the primary index scan more than it should.

The real problem, however, is the costing of the index join. The total cost of the index join is 4879288, of which 981416 comes from the scan, leaving 3897872 for the join operation (about half the cost of the primary index scan). But the EXPLAIN ANALYZE linked above shows that it actually takes 21.87 seconds, which is over 3x longer than the primary index scan.

Part of the problem is that we're not costing index joins the same way we cost lookup joins. If I change the index join costing code to be closer to what we do for lookup joins, then the cost is larger: 6721750 instead of 4879288. Still, it's not quite enough to cause the optimizer to prefer the primary index scan.

Prior to this commit, the cost of an index join was much less than the cost of an equivalent lookup join. Since index joins have the same implementation as lookup joins when the lookup columns form a key in the lookup table, the two operators should have the same cost. This commit updates the coster so that this is the case. Informs cockroachdb#46677 Release note (performance improvement): Updated the cost model in the optimizer to make index joins more expensive and better reflect the reality of their cost. As a result, the optimizer will choose index joins less frequently, generally resulting in more efficient query plans.

54768: opt: update cost of index join and statistics of array contains operation r=rytaft a=rytaft **opt: update statistics of array contains operation to match json contains** Prior to this commit, the selectivity of `j @> '{"a": "b"}'::json` was calculated as `1/9`, while the selectivity of `a @> array[1]` was calculated as `1/3`. Both of these predicates essentially represent one "path" that can be queried in an inverted index, so they should have the same selectivity. Furthermore, multiple paths should be more selective, so in the same way that the predicate `j @> '{"a": "b", "c": "d"}'::json` has selectivity `(1/9 * 1/9)`, `a @> array[1,2]` should similarly have selectivity `(1/9 * 1/9)`. This commit updates the logic in the statistics builder so that this is the case. Release note (performance improvement): Improved the selectivity estimate for array contains predicates (e.g., `arr @> ARRAY[1]`) in the optimizer. This improves the optimizer's cardinality estimation for queries containing these predicates, and may result in better query plans in some cases. **opt: update cost of index join to match cost of lookup join** Prior to this commit, the cost of an index join was much less than the cost of an equivalent lookup join. Since index joins have the same implementation as lookup joins when the lookup columns form a key in the lookup table, the two operators should have the same cost. This commit updates the coster so that this is the case. Informs #46677 Release note (performance improvement): Updated the cost model in the optimizer to make index joins more expensive and better reflect the reality of their cost. As a result, the optimizer will choose index joins less frequently, generally resulting in more efficient query plans. 54772: workload: bump the tpcc fixture version; create statistics when making fixtures r=RaduBerinde a=RaduBerinde #### workload: bump the tpcc fixture version The TPCC fixtures need refreshing: they contain FK indexes that we no longer need, and they are missing stats for some of the tables. This commit bumps the fixture version and incorporates the `deprecated-fk-indexes` flag into the fixture name. Release note: None #### workload: create statistics when making fixtures `workload make` imports a workload and then creates the backup (that can be later used by `workload load`). This change adds functionality to call CREATE STATISTICS on all tables before creating the backup. This can be turned off using a flag. Release note: None I am in the process of regenerating the fixtures with the new version. Informs #54702. Informs #50911. 54784: deps: bump cockroachdb/redact r=andreimatei a=knz To pick up the pointer formatting fix from cockroachdb/redact#9 Release note: None 54804: sql: skip TestSchemaChangeEvalContext r=yuzefovich a=yuzefovich Refs: #54775 Reason: flaky test Generated by bin/skip-test. Release justification: non-production code changes Release note: None Co-authored-by: Rebecca Taft <[email protected]> Co-authored-by: Radu Berinde <[email protected]> Co-authored-by: Raphael 'kena' Poss <[email protected]> Co-authored-by: Yahor Yuzefovich <[email protected]>

Prior to this commit, the cost of an index join was much less than the cost of an equivalent lookup join. Since index joins have the same implementation as lookup joins when the lookup columns form a key in the lookup table, the two operators should have the same cost. This commit updates the coster so that this is the case. Informs cockroachdb#46677 Release note (performance improvement): Updated the cost model in the optimizer to make index joins more expensive and better reflect the reality of their cost. As a result, the optimizer will choose index joins less frequently, generally resulting in more efficient query plans.

wenyihu6 · 2022-08-17T20:20:41Z

This issue is still present. The optimizer is favouring index join over primary index scan, but index join ended up being more expensive than primary index scan.

Here are the statement bundles: debug-primaryindexscan.zip and debug-indexjoin.zip. The Row estimation is not bad. It could be because the assigned cost for primary index scan is too high or the cost for index join is still not high enough.

github-actions · 2024-02-15T11:05:41Z

We have marked this issue as stale because it has been inactive for
18 months. If this issue is still relevant, removing the stale label
or adding a comment will keep it active. Otherwise, we'll close it in
10 days to keep the issue queue tidy. Thank you for your contribution
to CockroachDB!

yuzefovich mentioned this issue Mar 28, 2020

opt: choosing worse plans with stats present on TPC-H queries than without stats #46701

Open

6 tasks

RaduBerinde assigned rytaft Mar 28, 2020

rytaft mentioned this issue Sep 24, 2020

opt: update cost of index join and statistics of array contains operation #54768

Merged

rytaft mentioned this issue Nov 12, 2020

opt, sql: improve automatic statistics for small tables #56615

Open

jlinder added the T-sql-queries SQL Queries Team label Jun 16, 2021

rytaft mentioned this issue Jul 13, 2021

release-20.2: opt: update cost of index join and statistics of array contains operation #67530

Merged

rytaft removed their assignment Aug 23, 2022

mgartner added this to SQL Queries Jul 24, 2023

mgartner moved this to Backlog (DO NOT ADD NEW ISSUES) in SQL Queries Jul 24, 2023

github-actions bot added the no-issue-activity label Feb 15, 2024

yuzefovich added C-performance Perf of queries or internals. Solution not expected to change functional behavior. and removed no-issue-activity labels Feb 15, 2024

mgartner moved this from Backlog (DO NOT ADD NEW ISSUES) to New Backlog in SQL Queries Apr 23, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

opt: choosing suboptimal plan on TPCH query 6 after stats collected on `lineitem` #46677

opt: choosing suboptimal plan on TPCH query 6 after stats collected on `lineitem` #46677

yuzefovich commented Mar 27, 2020 •

edited by cockroach-jira-scripts

Loading

RaduBerinde commented Mar 28, 2020

yuzefovich commented Mar 28, 2020 •

edited

Loading

RaduBerinde commented Mar 28, 2020

rytaft commented Aug 7, 2020

yuzefovich commented Aug 10, 2020

rytaft commented Sep 21, 2020

wenyihu6 commented Aug 17, 2022

github-actions bot commented Feb 15, 2024

opt: choosing suboptimal plan on TPCH query 6 after stats collected on lineitem #46677

opt: choosing suboptimal plan on TPCH query 6 after stats collected on lineitem #46677

Comments

yuzefovich commented Mar 27, 2020 • edited by cockroach-jira-scripts Loading

RaduBerinde commented Mar 28, 2020

yuzefovich commented Mar 28, 2020 • edited Loading

RaduBerinde commented Mar 28, 2020

rytaft commented Aug 7, 2020

yuzefovich commented Aug 10, 2020

rytaft commented Sep 21, 2020

wenyihu6 commented Aug 17, 2022

github-actions bot commented Feb 15, 2024

opt: choosing suboptimal plan on TPCH query 6 after stats collected on `lineitem` #46677

opt: choosing suboptimal plan on TPCH query 6 after stats collected on `lineitem` #46677

yuzefovich commented Mar 27, 2020 •

edited by cockroach-jira-scripts

Loading

yuzefovich commented Mar 28, 2020 •

edited

Loading