
Allow summarize to aggregate multiple benchmarks into one score #180

Open
abrown opened this issue Jun 15, 2022 · 2 comments

Comments

abrown (Collaborator) commented Jun 15, 2022

When measuring more than one benchmark, it would be nice to be able to aggregate the results into a single score. One common way to do this is to take the geometric mean of the set of results. This issue proposes adding an --aggregate-benchmarks flag to do exactly this. When enabled, sightglass-cli summarize --aggregate-benchmarks would emit a single "aggregated by geomean" result for each of the phases, using, e.g., <all benchmarks> in the benchmark column.
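The aggregation described above can be sketched in a few lines. This is a minimal illustration of the geometric mean over per-benchmark measurements, not the actual sightglass implementation; the benchmark names and cycle counts below are hypothetical, and a real implementation would compute one geomean per phase (compilation, instantiation, execution). Note that the geometric mean weights every benchmark equally, which is a deliberate simplification.

```python
import math

def geometric_mean(values):
    """Geometric mean of a list of positive measurements.

    Computed in log space to avoid overflow when multiplying
    many large cycle counts together.
    """
    return math.exp(sum(math.log(v) for v in values) / len(values))

# Hypothetical per-benchmark results (cycles) for a single phase.
results = {
    "pulldown-cmark": 12_000_000,
    "bz2": 48_000_000,
    "spidermonkey": 310_000_000,
}

aggregated = geometric_mean(list(results.values()))
```

A useful property for comparing two engines is that the ratio of their geomeans equals the geomean of their per-benchmark ratios, so the aggregate is insensitive to which benchmark's units dominate.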

fitzgen (Member) commented Jun 15, 2022

This was intentionally left out of the original RFC because it seemed like its main use would be to say "wasm engine X scores 95, and so it is better than wasm engine Y which scores 89" which was not the intended use of this benchmark suite.

It also treats all benchmark programs in the corpus as equals, when that isn't really true (I doubt we would accept a 2% regression on spidermonkey.wasm in order to get 3% speed-ups in a few of the shootout programs).

How are you intending on using this single number score?

jlb6740 (Collaborator) commented Jun 15, 2022

Hi @fitzgen .. We should discuss offline (I can show the runner results), but this will be critical for reporting a summary of the performance impact of a patch. As far as I can see, we currently have a way to summarize the comparison of one benchmark at a time, but not a way to summarize a run across multiple benchmarks. From what I read, what @abrown describes is just an extension of comparing one benchmark at a time: enabling a summary when running against benchmark-next//.wasm.

Labels: none yet · Projects: none yet · Development: no branches or pull requests · 3 participants