Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Performance of BinaryHeap::into_sorted_vec vs. Vec::sort_unstable #115357

Open
ghost opened this issue Aug 29, 2023 · 2 comments
Open

Performance of BinaryHeap::into_sorted_vec vs. Vec::sort_unstable #115357

ghost opened this issue Aug 29, 2023 · 2 comments
Labels
C-enhancement Category: An issue proposing an enhancement or a PR with one. C-optimization Category: An issue highlighting optimization opportunities or PRs implementing such I-slow Issue: Problems and improvements with respect to performance of generated code. T-libs Relevant to the library team, which will review and decide on the PR/issue.

Comments

@ghost
Copy link

ghost commented Aug 29, 2023

I think the implementation of BinaryHeap::into_sorted_vec should just use Vec::sort_unstable instead of sorting by repeated Heap operations. For very small N (N < 1000), the current implementation is maybe 5% faster some times, but for bigger N Vec::sort_unstable will be more than 10 to 30 times faster.

Also, I assume sorting by Heap operations requires 2*N*Log2(N) key comparisons in average and worst case when using the default Heapsort like in the standard library, while Vec::sort_unstable probably requires less than 1.5*N*Log2(N) key comparisons in average case and also a lower number of moves in average case.

I don't have good statistics, maybe others could add some.

Here is an example, to show the performance difference:

/*
[dependencies]
fastrand = "2.0.0"
*/

use std::{collections::BinaryHeap, time::Instant};

fn main() {
    let N: usize = 10_000_000;
    let mut a_elements_in_random_order: Vec<usize> = Vec::from_iter(0..N);
    fastrand::shuffle(&mut a_elements_in_random_order);
    let mut heap: BinaryHeap<usize> = BinaryHeap::with_capacity(N);
    for i in 0..N {
        heap.push(a_elements_in_random_order[i]);
    }
    // both Vecs are now identical
    let mut a_sorted_by_sort_unstable = heap.clone().into_vec();
    let now = Instant::now();
    a_sorted_by_sort_unstable.sort_unstable();
    let runtime_sort_unstable = now.elapsed();
    let now = Instant::now();
    let a_sorted_by_heap = heap.into_sorted_vec();
    let runtime_sorted_by_heap = now.elapsed();
    assert!(a_sorted_by_sort_unstable == a_sorted_by_heap);
    println!("runtime_sort_unstable={:?}", runtime_sort_unstable);
    println!("runtime_sorted_by_heap={:?}", runtime_sorted_by_heap);
}
N=10_000_000
runtime_sort_unstable=403.8169ms
runtime_sorted_by_heap=4.6621243s
N=100_000_000
runtime_sort_unstable=3.6869678s
runtime_sorted_by_heap=95.9239879s
@rustbot rustbot added the needs-triage This issue may need triage. Remove it if it has been sufficiently triaged. label Aug 29, 2023
@scottmcm
Copy link
Member

cc #76234 (comment), which had a previous conversation about how into_sorted_vec is kinda weird as an API.

@Noratrieb Noratrieb added I-slow Issue: Problems and improvements with respect to performance of generated code. C-enhancement Category: An issue proposing an enhancement or a PR with one. T-libs Relevant to the library team, which will review and decide on the PR/issue. and removed needs-triage This issue may need triage. Remove it if it has been sufficiently triaged. labels Sep 4, 2023
@finnbear
Copy link
Contributor

finnbear commented Sep 16, 2023

For my WebAssembly client use-case, BinaryHeap::from(vec).into_sorted_vec() generates ~10X smaller code than vec.sort_unstable() while having good asymptotic complexity (unlike the similarly small custom insertion-sort).

Note: slice::heapsort from feature(sort_internals) is an option but that's an internal rustc feature, requires nightly, and has extraneous bounds checks.

@workingjubilee workingjubilee added the C-optimization Category: An issue highlighting optimization opportunities or PRs implementing such label Oct 8, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-enhancement Category: An issue proposing an enhancement or a PR with one. C-optimization Category: An issue highlighting optimization opportunities or PRs implementing such I-slow Issue: Problems and improvements with respect to performance of generated code. T-libs Relevant to the library team, which will review and decide on the PR/issue.
Projects
None yet
Development

No branches or pull requests

5 participants