-
Notifications
You must be signed in to change notification settings - Fork 3.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
GH-44084: [C++] Improve merge step in chunked sorting #44217
Conversation
Thanks for opening a pull request! If this is not a minor PR. Could you open an issue for this pull request on GitHub? https://github.com/apache/arrow/issues/new/choose Opening GitHub issues ahead of time contributes to the Openness of the Apache Arrow project. Then could you also rename the pull request title in the following format?
or
In the case of PARQUET issues on JIRA the title also supports:
See also: |
@ursabot please benchmark lang=C++ |
Benchmark runs are scheduled for commit a24e70a. Watch https://buildkite.com/apache-arrow and https://conbench.ursa.dev for updates. A comment will be posted here when the runs are complete. |
a24e70a
to
45566ce
Compare
@ursabot please benchmark lang=C++ |
Benchmark runs are scheduled for commit 45566ce. Watch https://buildkite.com/apache-arrow and https://conbench.ursa.dev for updates. A comment will be posted here when the runs are complete. |
Thanks for your patience. Conbench analyzed the 0 benchmarking runs that have been run so far on PR commit a24e70a. None of the specified runs were found on the Conbench server. The full Conbench report has more details. |
Thanks for your patience. Conbench analyzed the 3 benchmarking runs that have been run so far on PR commit 45566ce. There were 21 benchmark results indicating a performance regression:
The full Conbench report has more details. |
@ursabot benchmark help |
Supported benchmark command examples:
To run all benchmarks: To filter benchmarks by language: To filter Python and R benchmarks by name: To filter C++ benchmarks by archery --suite-filter and --benchmark-filter: For other |
674407b
to
562d1f9
Compare
@ursabot please benchmark command=cpp-micro --suite-filter=vector-sort |
Benchmark runs are scheduled for commit df0f691. Watch https://buildkite.com/apache-arrow and https://conbench.ursa.dev for updates. A comment will be posted here when the runs are complete. |
@ursabot please benchmark lang=C++ |
Commit df0f691 already has scheduled benchmark runs. |
df0f691
to
275871b
Compare
@ursabot please benchmark lang=C++ |
Benchmark runs are scheduled for commit 275871b. Watch https://buildkite.com/apache-arrow and https://conbench.ursa.dev for updates. A comment will be posted here when the runs are complete. |
275871b
to
24979f3
Compare
|
Set back to draft because some things can be further improved. |
24979f3
to
f69b3b8
Compare
@ursabot please benchmark lang=C++ |
Benchmark runs are scheduled for commit f69b3b8. Watch https://buildkite.com/apache-arrow and https://conbench.ursa.dev for updates. A comment will be posted here when the runs are complete. |
f69b3b8
to
bdfe17b
Compare
@felipecrv Would you like to give this another look (assuming CI passes, which it should :-))? |
Sorry I will be fully occupied until the end of this week. I'll help review next week. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just some questions and nits.
ee712a6
to
13e5cc2
Compare
@github-actions crossbow submit -g cpp |
Revision: 4f2fff4 Submitted crossbow builds: ursacomputing/crossbow @ actions-fa64807be3 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
Thanks for the improvement!
After merging your PR, Conbench analyzed the 3 benchmarking runs that have been run so far on merge-commit d5cda4a. There were 132 benchmark results with an error:
There were no benchmark performance regressions. 🎉 The full Conbench report has more details. It also includes information about 5 possible false positives for unstable benchmarks that are known to sometimes produce them. |
It seems like this change causes issues on gcc 8 https://github.com/ursacomputing/crossbow/actions/runs/12081500439/job/33690725970#step:7:2058 probably the change from |
@assignUser Thanks for the heads up, I'll take a look. |
@assignUser See #44898 and #44899 |
Sorry. I've been away. This looks great. Really nice improvements. |
Rationale for this change
When merge-sorting the chunks of a chunked array or table, we would currently repeatedly resolve the chunk indices for each individual value lookup. This requires
O(n*log k)
chunk resolutions withn
being the chunked array or table length, andk
the number of chunks.Instead, this PR translates the logical indices to physical all at once, without even requiring expensive chunk resolution as the logical indices are initially chunk-partitioned.
This change yields significant speedups on chunked array and table sorting:
However, performance also regresses when the input is all-nulls (which is probably rare):
Are these changes tested?
Yes, by existing tests.
Are there any user-facing changes?
No.