release-21.1: opt: add cost penalty for scans with large cardinality #67388

rytaft · 2021-07-08T20:27:39Z

Backport 2/2 commits from #66979.

/cc @cockroachdb/release

opt: ensure we prefer a reverse scan to sorting a forward scan

This commit fixes an issue where in some edge cases the optimizer would
prefer sorting the output of a forward scan over performing a reverse scan
(when there is no need to sort the output of the reverse scan).

Release note (performance improvement): The optimizer now prefers
performing a reverse scan over a forward scan + sort if the reverse
scan eliminates the need for a sort and the plans are otherwise
equivalent. This was already the behavior in most cases, but some
edge cases with a small number of rows have been fixed.

opt: add cost penalty for scans with large cardinality

This commit adds a new cost function, largeCardinalityRowCountPenalty,
which calculates a penalty that should be added to the row count of scans.
It is non-zero for expressions with unbounded maximum cardinality or with
maximum cardinality exceeding the row count estimate. Adding a few rows'
worth of cost helps prevent surprising plans for very small tables or
when stats are stale.

Fixes #64570

Release note (performance improvement): When choosing between index
scans that are estimated to have the same number of rows, the optimizer
now prefers indexes for which it has higher certainty about the maximum
number of rows over indexes for which there is more uncertainty in the
estimated row count. This helps to avoid choosing suboptimal plans for
small tables or if the statistics are stale.


cockroach-teamcity · 2021-07-08T20:27:47Z


rytaft added 2 commits July 8, 2021 15:02

rytaft requested review from mgartner, cucaroach and a team July 8, 2021 20:27

RaduBerinde approved these changes Jul 9, 2021


rytaft merged commit e0986e3 into cockroachdb:release-21.1 Jul 9, 2021

rytaft deleted the backport21.1-66979 branch July 9, 2021 13:53

rytaft mentioned this pull request Aug 16, 2021

release-21.1: opt: adjust cost of scan with unbounded cardinality to avoid bad plans #68991

Merged
