unsupported TPC-DS queries tracking issue #37464
46279: sql, workload, compose: miscellaneous cleanups r=yuzefovich a=yuzefovich

**sql: add stddev_samp alias for stddev aggregate builtin**

Release justification: low-risk new functionality.

Addresses: #37464.

Release note (sql change): CockroachDB now supports `stddev_samp` aggregate builtin function which is the same as `stddev` (actually, the latter is the historical alias of the former, according to Postgres documentation).

**workload, compose: miscellaneous cleanups**

Release justification: non-production code changes.

This commit cleans up a few things:
1. we recently renamed `experimental_on` vectorize setting to `on`, but a couple of places were missed.
2. compose-compare test (of randomized land) was skipping some queries due to bugs which have been fixed.
3. `tpcds` workload has been slightly enhanced (added `vectorize` option and refactored the way statement timeout is set).

Release note: None

Co-authored-by: Yahor Yuzefovich
Updating the issue since
Recent runs have shown that we have regressed between 9ba1a81 and b32bbb5 on the following queries:
For Q6, disabling the vectorized hash aggregator causes a significant speedup. Nearly all the time for the vectorized hash aggregator is spent in
A possible solution might be to have the external aggregator perform a pass with the in-memory aggregator for each batch before partitioning, and then only partition the tuples that don't fit into the existing buckets.
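To make the idea concrete, here is a minimal sketch in Python. It is not CockroachDB's aggregator: the function name, the `max_groups` memory budget, and the sum aggregate are all assumptions chosen for illustration. Each tuple is first offered to the in-memory buckets, and only tuples whose key has no bucket and no room for a new one are written to a partition.

```python
from collections import defaultdict


def preaggregate_then_partition(batches, max_groups, num_partitions):
    """Sum values per key; spill only tuples that have no in-memory bucket.

    Hypothetical helper: `max_groups` stands in for the memory budget of the
    in-memory aggregator, and summing stands in for any aggregate function.
    """
    buckets = {}                     # key -> running sum for groups held in memory
    partitions = defaultdict(list)   # partition index -> spilled (key, value) tuples
    for batch in batches:
        for key, value in batch:
            if key in buckets:
                # Existing bucket: aggregate in place, nothing is partitioned.
                buckets[key] += value
            elif len(buckets) < max_groups:
                # Still room in memory: open a new bucket.
                buckets[key] = value
            else:
                # No bucket and no room: this tuple goes to a partition.
                partitions[hash(key) % num_partitions].append((key, value))
    return buckets, dict(partitions)


batches = [[("a", 1), ("b", 2), ("a", 3)], [("c", 4), ("a", 5), ("c", 6)]]
buckets, spilled = preaggregate_then_partition(batches, max_groups=2, num_partitions=2)
```

With a budget of two groups, `a` and `b` are aggregated entirely in memory and only the two `c` tuples are partitioned. When a few hot keys dominate the input, most tuples would never reach the partitioning step, which is the case this proposal targets.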
With regard to Q64, changing the join reorder limit to 10 joins gets the runtime down to a manageable ~17 seconds on my machine. However, changing it to 20 brings it back up to ~1 minute, maybe because of overestimating predicate selectivity? I'm not sure it's a good idea to increase the default join reorder limit, but it might be worth exploring left-deep trees for some number of joins beyond the limit, maybe proportional to the ratio between the number of bushy and the number of left-deep trees.
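For a sense of scale of that ratio, the sketch below uses the textbook counts for join trees over n relations when cross products are allowed and left and right inputs are distinguished: n! left-deep trees and (2(n-1))!/(n-1)! bushy trees. These are upper bounds and an assumption on my part about what "number of trees" means here; they are not what the optimizer actually enumerates. The function names are made up for the example.

```python
from math import comb, factorial


def left_deep_trees(n):
    """Left-deep join trees over n relations: one per ordering of the relations."""
    return factorial(n)


def bushy_trees(n):
    """All ordered binary join trees over n relations, cross products allowed."""
    return factorial(2 * (n - 1)) // factorial(n - 1)


def bushy_to_left_deep_ratio(n):
    """bushy_trees(n) / left_deep_trees(n), which is the Catalan number C(n-1)."""
    return comb(2 * (n - 1), n - 1) // n


for n in (4, 10, 20):
    print(n, left_deep_trees(n), bushy_trees(n), bushy_to_left_deep_ratio(n))
```

Under these assumptions the ratio is 5 for 4 relations and 4862 for 10 relations, so a budget expressed in terms of this ratio would allow left-deep exploration over considerably more joins than the bushy limit.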
Updated the numbers using 23.1.0-alpha.7. Notable (at least 33%) speedups since the last update:
Notable (at least 33%) slowdowns:
This issue tracks features that are unimplemented but needed for TPC-DS queries. It also tracks some of the queries that didn't finish within 5 minutes at scale factor 1 (1 GB of data) on a 3-node roachprod cluster.
unknown function: rollup()
sql: add support for GROUPING SETS, CUBE, and ROLLUP #46280
pq: unknown function: stddev_samp()
The queries were run one at a time, and here are the runtimes of all queries that completed within 5 minutes:
Features that are missing "native" support in the vectorized engine:
Jira issue: CRDB-4428