colexec: adds support for partial ordering in topk sorter
Previously, topKSorter had to process all input rows before returning the top K rows according to its specified ordering. Even if the input was already ordered on a prefix of the ordering columns, topKSorter would still iterate over the entire input.

However, if the input is partially ordered, topKSorter can potentially stop iterating early: once it has found K candidates and has finished the current group of rows that are equal on the already-ordered columns, it is guaranteed not to find any better candidates.

For example, take the following query and table with an index on a:

```
  a | b
----+----
  1 | 5
  2 | 3
  2 | 1
  3 | 3
  5 | 3

SELECT * FROM t ORDER BY a, b LIMIT 2
```

Given an index scan on a to provide `a`'s ordering, topk only needs to process 3 rows in order to guarantee that it has found the top K rows. Once it finishes processing the third row `[2, 1]`, all subsequent rows have higher values of `a` than the top 2 rows found so far, and therefore cannot be in the top 2 rows.

This change modifies the vectorized engine's TopKSorter signature to include a partial ordering. The TopKSorter chunks the input according to the sorted columns and processes each chunk with its existing heap algorithm. At the end of each chunk, if K rows are in the heap, TopKSorter emits the rows and stops execution.

This change also includes a new microbenchmark, BenchmarkSortTopK, that evaluates TopKSorter with a varying number of partially ordered columns and varying chunk sizes.

Release justification:
Release note (<category, see below>): <what> <show> <why>
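The chunk-and-stop idea described above can be illustrated with a small row-at-a-time sketch in Go. This is not the code from this commit: the real TopKSorter operates on columnar batches, and the names `row`, `maxHeap`, and `topKPartiallyOrdered` are hypothetical, introduced only for this example. The sketch assumes a two-column input that is already sorted on `a` and an ordering of `a, b`.

```go
package main

import (
	"container/heap"
	"fmt"
	"sort"
)

// row is a hypothetical two-column tuple (a, b). The real operator works on
// columnar batches rather than row structs.
type row struct{ a, b int }

// less reports whether x sorts before y under ORDER BY a, b.
func less(x, y row) bool {
	if x.a != y.a {
		return x.a < y.a
	}
	return x.b < y.b
}

// maxHeap keeps the worst current candidate at the root so that it can be
// replaced when a better row arrives.
type maxHeap []row

func (h maxHeap) Len() int            { return len(h) }
func (h maxHeap) Less(i, j int) bool  { return less(h[j], h[i]) }
func (h maxHeap) Swap(i, j int)       { h[i], h[j] = h[j], h[i] }
func (h *maxHeap) Push(x interface{}) { *h = append(*h, x.(row)) }
func (h *maxHeap) Pop() interface{} {
	old := *h
	n := len(old)
	r := old[n-1]
	*h = old[:n-1]
	return r
}

// topKPartiallyOrdered returns the first k rows under ORDER BY a, b, assuming
// the input is already sorted on a. It also returns the number of rows it
// pushed through the heap logic before stopping.
func topKPartiallyOrdered(input []row, k int) ([]row, int) {
	if k <= 0 {
		return nil, 0
	}
	h := &maxHeap{}
	processed := 0
	for i, r := range input {
		// A change in a marks the end of a chunk. If the heap already holds
		// k rows, every remaining row has a larger a and cannot qualify.
		if i > 0 && r.a != input[i-1].a && h.Len() == k {
			break
		}
		processed++
		if h.Len() < k {
			heap.Push(h, r)
		} else if less(r, (*h)[0]) {
			// r beats the worst candidate: replace the root and restore
			// the heap invariant.
			(*h)[0] = r
			heap.Fix(h, 0)
		}
	}
	// Copy the candidates out of the heap and emit them in sorted order.
	out := append([]row(nil), *h...)
	sort.Slice(out, func(i, j int) bool { return less(out[i], out[j]) })
	return out, processed
}

func main() {
	input := []row{{1, 5}, {2, 3}, {2, 1}, {3, 3}, {5, 3}}
	top, processed := topKPartiallyOrdered(input, 2)
	fmt.Println(top, processed) // prints "[{1 5} {2 1}] 3"
}
```

On the example table, the sketch pushes three rows through the heap and stops when it sees `a` change from 2 to 3 with two candidates already in hand, which matches the walkthrough above. Detecting the boundary does require looking at the first row of the next chunk, although that row is never compared against the heap.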
1 parent 7c36a9d, commit a58f00e
Showing 6 changed files with 199 additions and 51 deletions.