Add heapsort fallback in select_nth_unstable
#106997
Conversation
(rustbot has picked a reviewer for you, use r? to override)

Hey! It looks like you've submitted a new PR for the library teams! If this PR contains changes to any […]
```
@@ -831,6 +831,15 @@ fn partition_at_index_loop<'a, T, F>(
) where
    F: FnMut(&T, &T) -> bool,
{
    // Limit the amount of iterations and fall back to heapsort, similarly to `slice::sort_unstable`.
    // This lowers the worst case running time from O(n^2) to O(n log n).
    // FIXME: Investigate whether it would be better to use something like Median of Medians
```
Wikipedia says that median-of-medians is somewhat slow and not used normally, so I don't know if we'd want to start there, but maybe we could use it as a fallback instead of heapsort. But that's a discussion for another time.
Yes, what I meant was to use median of medians as a fallback instead of heapsort to keep the fast average case of quickselect and still ensure O(n) worst case.
Yeah, I think I agree that falling back to median-of-medians would make sense. And it sounds like it is used in practice in some contexts, see: https://en.wikipedia.org/wiki/Introselect
```
    // This lowers the worst case running time from O(n^2) to O(n log n).
    // FIXME: Investigate whether it would be better to use something like Median of Medians
    // or Fast Deterministic Selection to guarantee O(n) worst case.
    let mut limit = usize::BITS - v.len().leading_zeros();
```
suggestion: if you're doing `BITS - ctlz`, I'd generally suggest using `ilog2` instead.
I just copied the limit calculation from the sorting code, but I can change it if that's preferred. In that case, I'll also change it for sort I guess.
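For reference, here is a small standalone check (not part of the PR) that the two formulations agree for any non-empty slice, assuming the now-stable `usize::ilog2`:

```rust
// `usize::BITS - leading_zeros(len)` is what sort.rs uses for its recursion
// budget; `len.ilog2() + 1` computes the same value (the bit width of `len`)
// for every non-zero `len`. Note that `ilog2` panics on zero, so the caller
// must ensure the slice is non-empty.
fn limit_from_leading_zeros(len: usize) -> u32 {
    usize::BITS - len.leading_zeros()
}

fn limit_from_ilog2(len: usize) -> u32 {
    len.ilog2() + 1
}

fn main() {
    for len in 1..10_000usize {
        assert_eq!(limit_from_leading_zeros(len), limit_from_ilog2(len));
    }
    println!("both formulations agree for 1..10_000");
}
```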
```
    // If the last partitioning was imbalanced, try breaking patterns in the slice by shuffling
    // some elements around. Hopefully we'll choose a better pivot this time.
    if !was_balanced {
        break_patterns(v);
```
unsure: This looks like it makes it randomized? Does it need both randomization and the heapsort fallback? Could we just use one or the other?
To be honest, I just copied this from the sort implementation. But the fact that sort does need both the pattern breaking and the heapsort fallback makes it seem to me like they're both needed. Alternatively, we could just remove this check, decrement `limit` unconditionally and initialize it to `2 * ilog2(len)`; this is what the "baseline" implementation of introsort/introselect does (or at least the pseudocode on Wikipedia does it this way).
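For illustration only, here is a minimal self-contained sketch of that "baseline" shape: the budget starts at `2 * ilog2(len)` and is decremented on every iteration, with a fallback once it reaches zero. The pivot choice and partition below are deliberately naive stand-ins (last element as pivot, Lomuto partition) and the fallback just sorts the remaining range, so this is not the library's code:

```rust
fn introselect_baseline(v: &mut [i32], index: usize) {
    assert!(index < v.len());
    // Budget from the textbook introselect: 2 * floor(log2(len)).
    let mut limit = 2 * v.len().ilog2();
    // Current sub-range [lo, hi) that still contains `index`.
    let (mut lo, mut hi) = (0, v.len());
    loop {
        if hi - lo <= 1 {
            return;
        }
        if limit == 0 {
            // Stand-in for the heapsort fallback: any O(n log n) sort of the
            // remaining range puts the element at `index` in its final place.
            v[lo..hi].sort_unstable();
            return;
        }
        limit -= 1;

        // Naive Lomuto partition around the last element of the range.
        let pivot = hi - 1;
        let mut store = lo;
        for i in lo..pivot {
            if v[i] < v[pivot] {
                v.swap(i, store);
                store += 1;
            }
        }
        v.swap(store, pivot);

        // Continue (iteratively) in the side that contains `index`.
        if index == store {
            return;
        } else if index < store {
            hi = store;
        } else {
            lo = store + 1;
        }
    }
}

fn main() {
    let mut data: Vec<i32> = (0..100).rev().collect();
    introselect_baseline(&mut data, 42);
    assert_eq!(data[42], 42);
}
```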
Oh, I see, this is exactly what sort is doing:

rust/library/core/src/slice/sort.rs, lines 746 to 758 in 273c6c3:
```rust
// If too many bad pivot choices were made, simply fall back to heapsort in order to
// guarantee `O(n * log(n))` worst-case.
if limit == 0 {
    heapsort(v, is_less);
    return;
}

// If the last partitioning was imbalanced, try breaking patterns in the slice by shuffling
// some elements around. Hopefully we'll choose a better pivot this time.
if !was_balanced {
    break_patterns(v);
    limit -= 1;
}
```
Makes sense to do the same thing, I guess.
Hmm, I think we can actually take this exactly as is -- exactly following what `sort_unstable` does makes good sense to me.

I'll r+ once CI passes.
@scottmcm CI passed :)
Add heapsort fallback in `select_nth_unstable`

Addresses rust-lang#102451 and rust-lang#106933.

`slice::select_nth_unstable` uses a quickselect implementation based on the same pattern-defeating quicksort algorithm that `slice::sort_unstable` uses. `slice::sort_unstable` uses a recursion limit and falls back to heapsort if there were too many bad pivot choices, to ensure O(n log n) worst case running time (known as introsort). However, `slice::select_nth_unstable` does not have such a fallback strategy, which leads to it having a worst case running time of O(n²) instead. rust-lang#102451 links to a playground which generates pathological inputs that show this quadratic behavior. On my machine, a randomly generated slice of length `1 << 19` takes ~200µs to calculate its median, whereas a pathological input of the same length takes over 2.5s.

This PR adds an iteration limit to `select_nth_unstable`, falling back to heapsort, which ensures an O(n log n) worst case running time (introselect). With this change, there was no noticeable slowdown for the random input, but the same pathological input now takes only ~1.2ms.

In the future it might be worth implementing something like Median of Medians or Fast Deterministic Selection instead, which guarantee O(n) running time for all possible inputs. I've left this as a `FIXME` for now and only implemented the heapsort fallback to minimize the needed code changes.

I still think we should clarify in the `select_nth_unstable` docs that the worst case running time isn't currently O(n) (the original reason that rust-lang#102451 was opened), but I think it's a lot better to be able to guarantee O(n log n) instead of O(n²) for the worst case.
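For context, a short usage sketch of the API whose worst case this PR improves; the postconditions shown are the documented behaviour of `select_nth_unstable` and are unaffected by the heapsort fallback:

```rust
fn main() {
    let mut v = vec![5, 1, 4, 2, 3, 9, 0, 8, 7, 6];
    let mid = v.len() / 2;
    // After the call, the element that would sit at index `mid` in sorted order
    // is at `mid`, everything before it is less than or equal to it, and
    // everything after it is greater than or equal to it. The fallback only
    // changes the worst-case running time, not these postconditions.
    let (lesser, median, greater) = v.select_nth_unstable(mid);
    assert_eq!(*median, 5);
    assert!(lesser.iter().all(|&x| x <= 5));
    assert!(greater.iter().all(|&x| x >= 5));
}
```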
…iaskrgr Rollup of 7 pull requests

Successful merges:
- rust-lang#106244 (Improve Markdown styling in README)
- rust-lang#106747 (Add 'static lifetime suggestion when GAT implied 'static requirement from HRTB)
- rust-lang#106873 (dont randomly use `_` to print out const generic arguments)
- rust-lang#106992 (Remove unused `#![feature(box_syntax)]` in `alloc`)
- rust-lang#106995 (bump failing assembly & codegen tests from LLVM 14 to LLVM 15)
- rust-lang#106996 (rustdoc: instead of `.setting-name { width: 100% }`, use default div CSS)
- rust-lang#106997 (Add heapsort fallback in `select_nth_unstable`)

Failed merges:

r? `@ghost`
`@rustbot` modify labels: rollup
Add Median of Medians fallback to introselect

Fixes rust-lang#102451. This PR is a follow-up to rust-lang#106997. It adds a Fast Deterministic Selection implementation as a fallback to the introselect algorithm used by `select_nth_unstable`. This allows it to guarantee O(n) worst case running time, while maintaining good performance in all cases.

This would fix rust-lang#102451, which was opened because the `select_nth_unstable` docs falsely claimed that it had O(n) worst case performance, even though it was actually quadratic in the worst case. rust-lang#106997 improved the worst case complexity to O(n log n) by using heapsort as a fallback, and this PR further improves it to O(n) (this would also make rust-lang#106933 unnecessary). It also improves the actual runtime if the fallback gets called: using a pathological input of size `1 << 19` (see the playground link in rust-lang#102451), calculating the median is roughly 3x faster using fast deterministic selection as a fallback than it is using heapsort.

The downside to this is less code reuse between the sorting and selection algorithms, but I don't think it's that bad. The additional algorithms are ~250 LOC with no `unsafe` blocks (I tried using unsafe to avoid bounds checks but it didn't noticeably improve the performance). I also let it fuzz for a while against the current `select_nth_unstable` implementation to ensure correctness, and it seems to still fulfill all the necessary postconditions.

cc `@scottmcm` who reviewed rust-lang#106997
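For illustration only (this is not the code from that PR), a rough sketch of the classic median-of-medians idea behind such an O(n) fallback: group the slice into fives, take each group's median, and use the median of those medians as the pivot, which guarantees a constant fraction of elements on each side:

```rust
// Simplified median-of-medians pivot selection (groups of five). A real
// implementation recurses into its own selection routine instead of calling
// `select_nth_unstable` on the medians, and partitions in place rather than
// allocating temporary buffers.
fn median_of_medians(v: &[i32]) -> i32 {
    assert!(!v.is_empty());
    if v.len() <= 5 {
        let mut buf = v.to_vec();
        buf.sort_unstable();
        return buf[buf.len() / 2];
    }
    // Median of every group of five (the last group may be shorter).
    let mut medians: Vec<i32> = v
        .chunks(5)
        .map(|chunk| {
            let mut buf = chunk.to_vec();
            buf.sort_unstable();
            buf[buf.len() / 2]
        })
        .collect();
    // The pivot is the median of the group medians.
    let mid = medians.len() / 2;
    *medians.select_nth_unstable(mid).1
}

fn main() {
    let data: Vec<i32> = (0..1000).rev().collect();
    let pivot = median_of_medians(&data);
    // The useful property: roughly 30% of the elements are <= the pivot and
    // roughly 30% are >= the pivot, so quickselect around it stays linear.
    let smaller = data.iter().filter(|&&x| x <= pivot).count();
    let larger = data.iter().filter(|&&x| x >= pivot).count();
    assert!(smaller >= data.len() * 3 / 10 && larger >= data.len() * 3 / 10);
    println!("pivot = {pivot}, <=: {smaller}, >=: {larger}");
}
```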
Update current implementation comments for `select_nth_unstable`

This more accurately reflects the actual implementation, as it hasn't been a simple quickselect since rust-lang#106997. While it does say that the current implementation always runs in O(n), I don't think it should require an FCP as it doesn't guarantee linearity in general and only points out that the current implementation is in fact linear.

r? `@Amanieu`