
Allow summarize to aggregate multiple benchmarks into one score #180

Open
abrown opened this issue Jun 15, 2022 · 2 comments

Comments

abrown (Collaborator) commented Jun 15, 2022

When measuring more than one benchmark, it would be nice to be able to aggregate the results into a single score. One common way to do this is to take the geometric mean of the set of results. This issue proposes adding an --aggregate-benchmarks flag to do exactly this. When enabled, sightglass-cli summarize --aggregate-benchmarks would emit a single "aggregated by geomean" result for each of the phases, using, e.g., <all benchmarks> in the benchmark column.
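The aggregation described above can be sketched in a few lines. This is a minimal illustration of the geometric mean over per-benchmark measurements, not the actual sightglass implementation; the benchmark names and cycle counts below are hypothetical, and a real implementation would compute one geomean per phase (compilation, instantiation, execution). Note that the geometric mean weights every benchmark equally, which is a deliberate simplification.

```python
import math

def geometric_mean(values):
    """Geometric mean of a list of positive measurements.

    Computed in log space to avoid overflow when multiplying
    many large cycle counts together.
    """
    return math.exp(sum(math.log(v) for v in values) / len(values))

# Hypothetical per-benchmark results (cycles) for a single phase.
results = {
    "pulldown-cmark": 12_000_000,
    "bz2": 48_000_000,
    "spidermonkey": 310_000_000,
}

aggregated = geometric_mean(list(results.values()))
```

A useful property for comparing two engines is that the ratio of their geomeans equals the geomean of their per-benchmark ratios, so the aggregate is insensitive to which benchmark's units dominate.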

fitzgen (Member) commented Jun 15, 2022

This was intentionally left out of the original RFC because it seemed like its main use would be to say "wasm engine X scores 95, and so it is better than wasm engine Y which scores 89" which was not the intended use of this benchmark suite.

It also treats all benchmark programs in the corpus as equals, when that isn't really true (I doubt we would accept a 2% regression on spidermonkey.wasm in order to get 3% speed-ups in a few of the shootout programs).

How are you intending on using this single number score?

jlb6740 (Collaborator) commented Jun 15, 2022

Hi @fitzgen .. We should discuss offline (I can show the runner results), but this will be critical for reporting a summary of the performance impact of a patch. As far as I can see, we currently have a way to summarize the comparison of one benchmark at a time, but not a way to summarize a run across multiple benchmarks. From what I read, what @abrown describes is just an extension of comparing one benchmark at a time: enabling a summary when running against benchmark-next//.wasm.

Labels: none yet · Projects: none yet · Development: no branches or pull requests · 3 participants