Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Wrong benchmarks in library/core/benches/char/methods.rs and possibly other places #107590

Closed
safinaskar opened this issue Feb 2, 2023 · 6 comments · Fixed by #107598
Closed
Assignees
Labels
C-bug Category: This is a bug.

Comments

@safinaskar
Copy link
Contributor

safinaskar commented Feb 2, 2023

As I learned from https://gendignoux.com/blog/2022/01/31/rust-benchmarks.html#application-to-real-benchmarks file library/core/benches/char/methods.rs and possibly other files in this repo contain wrong benchmarks.

As author of the article points, call c.to_digit(2) optimizes to None and thus doesn't get benchmarked.

So, please, find all such places and fix them. The faulty example still present in latest master

@safinaskar safinaskar added the C-bug Category: This is a bug. label Feb 2, 2023
@chenyukang
Copy link
Member

@rustbot claim

@gendx
Copy link

gendx commented Feb 2, 2023

Author of https://gendignoux.com/blog/2022/01/31/rust-benchmarks.html here.

Apologies for not having reported a bug here as I didn't have the bandwidth to follow-up - but I'm glad that someone else picked it up! Thanks @safinaskar and @chenyukang 🙏


Beyond this specific example, I think one broader problem (as I mentioned in my conclusion) is that the "box" name misleadingly implies that it "contains" a computation put into it, i.e. that in the following example (close to the Bencher API) the function f will not be optimized (as it's "inside the box").

loop {
    black_box(f());
}

In reality, the "box" is rather a barrier/fence, and only "contains" a value (the result of evaluating f), so the compiler can optimize the computation as follows:

loop {
    let x = f();
    black_box(x);
}

or even:

let x = f();
loop {
    black_box(x);
}

In fact, if the compiler is perfect, we want it to optimize the computation of f outside of the loop (assuming f is a pure function with no side effects) - otherwise there is room to improve the compiler's optimizations.

Back to the Bencher interface, it'd IMO make more sense to think of benchmarking a function f as the transition between f's input(s) and output(s), rather than black_box-ing the result of an input-less function. The API could look like this:

impl Bencher {
    /// Callback for benchmark functions to run in their body.
    pub fn iter<I, O, F>(&mut self, input: I, mut inner: F)
    where
        I: Copy,
        F: FnMut(I) -> O,
    {
        // Omitting the measurements, statistics, etc.
        loop {
            black_box(f(black_box(input)))
        }
    }

On the other hand, the current Bencher interface puts an automatic black_box on the output, but nothing on the input, so any benchmark that doesn't put a manual black_box on the inputs is prone to measure the wrong thing (or hope the compiler won't optimize the function because it exceeded some threshold for inlining optimization).


I don't really have the bandwidth to propose an RFC at the moment, but leaving my thoughts here in case someone wants to pick up this idea.

matthiaskrgr added a commit to matthiaskrgr/rust that referenced this issue Feb 3, 2023
…r=thomcc

Fix benchmarks in library/core with black_box

Fixes rust-lang#107590
@bors bors closed this as completed in fe84cec Feb 3, 2023
@safinaskar
Copy link
Contributor Author

@chenyukang , @gendx . The fix is completely wrong. Please, reopen the bug.

First, the patch doesn't fix all bugs. For example, the article also mentions file library/core/benches/num/dec2flt/mod.rs , it is not fixed. So, please review all benches in whole repo.

Second, patched files are patched wrong. You need to pass inputs through black_box and also you need to pass outputs to black_box or to Bencher::iter. I. e. both inputs and outputs should be prevented from optimization. Now let's consider library/core/benches/num/flt2dec/strategy/dragon.rs after patch:

format_shortest(black_box(&decoded), &mut buf);
. As you can see, ; prevents result of computation from passing to Bencher::iter. So ; should be removed:

     b.iter(|| {
-        format_shortest(black_box(&decoded), &mut buf);
+        format_shortest(black_box(&decoded), &mut buf)
     });

Also, I think some lint should be added to clippy, which will prevent FnMut() -> () from passing to Bencher::iter

@chenyukang
Copy link
Member

@chenyukang , @gendx . The fix is completely wrong. Please, reopen the bug.

First, the patch doesn't fix all bugs. For example, the article also mentions file library/core/benches/num/dec2flt/mod.rs , it is not fixed. So, please review all benches in whole repo.

Second, patched files are patched wrong. You need to pass inputs through black_box and also you need to pass outputs to black_box or to Bencher::iter. I. e. both inputs and outputs should be prevented from optimization. Now let's consider library/core/benches/num/flt2dec/strategy/dragon.rs after patch:

format_shortest(black_box(&decoded), &mut buf);

. As you can see, ; prevents result of computation from passing to Bencher::iter. So ; should be removed:

     b.iter(|| {
-        format_shortest(black_box(&decoded), &mut buf);
+        format_shortest(black_box(&decoded), &mut buf)
     });

Also, I think some lint should be added to clippy, which will prevent FnMut() -> () from passing to Bencher::iter

Thanks for correcting!
I will submit a follow-up patch for it.

@chenyukang
Copy link
Member

Not all bench mark test should return a value in the block, the change:

     b.iter(|| {
-        format_shortest(black_box(&decoded), &mut buf);
+        format_shortest(black_box(&decoded), &mut buf)
     });

will be rejected by compiler:

  --> library/core/benches/num/flt2dec/strategy/grisu.rs:18:9
   |
16 |     let mut buf = [MaybeUninit::new(0); MAX_SIG_DIGITS];
   |         ------- variable defined here
17 |     b.iter(|| {
   |             - inferred to be a `FnMut` closure
18 |         format_shortest(black_box(&decoded), &mut buf)
   |         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^---^
   |         |                                         |
   |         |                                         variable captured here
   |         returns a reference to a captured variable which escapes the closure body
   |
   = note: `FnMut` closures only have access to their captured variables while they are executing...
   = note: ...therefore, they cannot allow references to captured variables to escape

And this test will write the result to &mut buf, it should not be optimized without writing? 🤔

Another common scenario is we loop in the block of iter, so we will return ():

  b.iter(|| {
       for i in 1..Nums {
           ...blah
       }
  });

Anyway, benchmark testing do need some careful tweaks considering black_box and optimizations in compiler.

@safinaskar
Copy link
Contributor Author

@chenyukang , I propose wrap result in black_box in both cases

matthiaskrgr added a commit to matthiaskrgr/rust that referenced this issue May 15, 2023
…r=workingjubilee

Fix more benchmark test with black_box

Follow up fix for rust-lang#107590
RalfJung pushed a commit to RalfJung/miri that referenced this issue May 16, 2023
…jubilee

Fix more benchmark test with black_box

Follow up fix for rust-lang/rust#107590
thomcc pushed a commit to tcdi/postgrestd that referenced this issue Jul 18, 2023
…jubilee

Fix more benchmark test with black_box

Follow up fix for rust-lang/rust#107590
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-bug Category: This is a bug.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants