Weirdly different benchmark results for code that should be fairly identical #31

Open
vlovich opened this issue Nov 17, 2023 · 3 comments

@vlovich

vlovich commented Nov 17, 2023

I have some benchmarks that look like this:

use std::mem::MaybeUninit;

fn main() {
    let _ = memcache::CRATE_USED;
    divan::main();
}

fn weird_results_impl(b: divan::Bencher, size: usize) {
    const NUM_ITEMS: usize = 100_000;
    const CAPACITY: usize = NUM_ITEMS;
    // Vec<Box<[MaybeUninit<u8>]>>, inferred from the assignment in the closure below.
    let cache = vec![Default::default(); CAPACITY];
    let values = (0..NUM_ITEMS)
        .map(|_| vec![MaybeUninit::<u8>::uninit(); size].into_boxed_slice())
        .collect::<Vec<_>>();
    b.counter(divan::counter::ItemsCount::new(NUM_ITEMS))
        .with_inputs(|| {
            (
                cache.clone(),
                values
                    .iter()
                    .enumerate()
                    .map(|(idx, v)| (idx % CAPACITY, v.clone()))
                    .collect::<Vec<_>>(),
            )
        })
        .bench_local_refs(|(cache, refs)| {
            // Move each boxed slice into its cache slot; take only swaps pointers.
            for (entry, mem) in refs {
                cache[*entry] = std::mem::take(mem);
            }
        });
}

#[divan::bench]
fn weird_results_4kib(b: divan::Bencher) {
    weird_results_impl(b, 4 * 1024);
}

#[divan::bench]
fn weird_results_10b(b: divan::Bencher) {
    weird_results_impl(b, 10);
}

There's a fairly large discrepancy between the two:

my-crate               fastest       │ slowest       │ median        │ mean          │ samples │ iters
├─ weird_results_4kib  165.4 µs      │ 211.1 µs      │ 173.5 µs      │ 174.8 µs      │ 100     │ 100
│                      604.2 Mitem/s │ 473.5 Mitem/s │ 576.2 Mitem/s │ 571.8 Mitem/s │         │
╰─ weird_results_10b   80.53 µs      │ 110.5 µs      │ 83.22 µs      │ 84.07 µs      │ 100     │ 100
                       1.241 Gitem/s │ 904.2 Mitem/s │ 1.201 Gitem/s │ 1.189 Gitem/s │         │

This was run with mimalloc set as the allocator. AFAICT I'm not dropping any memory within the benchmark loop, and the loop body shouldn't be doing anything more than shuffling some pointers around (i.e. it should be the same amount of shuffling in both runs). Is there something wrong with my benchmark, or is this a bug in divan?
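
As far as I understand, std::mem::take on a Box<[MaybeUninit<u8>]> only swaps the slot with an empty boxed slice and hands back the old box without touching the payload bytes, so the per-item cost should not depend on size. A minimal sketch of that claim (the 4 KiB size here is just illustrative):

use std::mem::MaybeUninit;

fn main() {
    // take replaces the slot with its Default value (an empty boxed slice)
    // and returns the old box; none of the payload bytes are copied.
    let mut slot: Box<[MaybeUninit<u8>]> = vec![MaybeUninit::uninit(); 4 * 1024].into_boxed_slice();
    let taken = std::mem::take(&mut slot);
    assert_eq!(taken.len(), 4 * 1024);
    assert_eq!(slot.len(), 0);
}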

@nvzqz
Owner

nvzqz commented Nov 17, 2023

I'm not able to reproduce your results when I don't add memcache or mimalloc. I can try again later with those added.

[Screenshot: benchmark results on nvzqz's machine, 2023-11-17 07:55]

Also, you can use NUM_ITEMS directly since usize implements IntoCounter<Counter = ItemsCount>:

- b.counter(divan::counter::ItemsCount::new(NUM_ITEMS))
+ b.counter(NUM_ITEMS)
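
For a standalone illustration (counter_from_usize is a hypothetical name, not from this issue), the item count can be passed as a plain usize wherever a counter is expected:

fn main() {
    divan::main();
}

#[divan::bench]
fn counter_from_usize(b: divan::Bencher) {
    const NUM_ITEMS: usize = 100_000;
    // Equivalent to b.counter(divan::counter::ItemsCount::new(NUM_ITEMS)).
    b.counter(NUM_ITEMS)
        .bench_local(|| std::hint::black_box((0..NUM_ITEMS).sum::<usize>()));
}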

@vlovich
Author

vlovich commented Nov 17, 2023

memcache is the name of my own crate and can be ignored. Strange that you're not seeing it; mimalloc might be needed to make the difference more obvious. You must have a faster machine for this benchmark, since my 13900 doesn't get that fast with the standard allocator.

@OliverKillane
Contributor

My suspicion is that:

The benchmarks vary only in the size parameter, which in turn determines the size of the values slices and makes the clone inside .with_inputs(..) more expensive for the larger size.

The with_inputs time appears to be counted in the main benchmark time, hence the large difference between the benchmarks.

See #55
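
One way to check this (a rough sketch with hypothetical names like clone_inputs_impl, reusing the same setup as the original benchmark) is to time only the input generation, i.e. the same clones that with_inputs performs; if that alone shows a similar gap between the 10 B and 4 KiB cases, the with_inputs cost would account for the discrepancy:

use std::mem::MaybeUninit;

fn main() {
    divan::main();
}

// Time only the clones that with_inputs performs in the original benchmark.
fn clone_inputs_impl(b: divan::Bencher, size: usize) {
    const NUM_ITEMS: usize = 100_000;
    const CAPACITY: usize = NUM_ITEMS;
    let cache: Vec<Box<[MaybeUninit<u8>]>> = vec![Default::default(); CAPACITY];
    let values = (0..NUM_ITEMS)
        .map(|_| vec![MaybeUninit::<u8>::uninit(); size].into_boxed_slice())
        .collect::<Vec<_>>();
    b.counter(NUM_ITEMS).bench_local(|| {
        std::hint::black_box((
            cache.clone(),
            values
                .iter()
                .enumerate()
                .map(|(idx, v)| (idx % CAPACITY, v.clone()))
                .collect::<Vec<_>>(),
        ))
    });
}

#[divan::bench]
fn clone_inputs_4kib(b: divan::Bencher) {
    clone_inputs_impl(b, 4 * 1024);
}

#[divan::bench]
fn clone_inputs_10b(b: divan::Bencher) {
    clone_inputs_impl(b, 10);
}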
