Results Comparer #165
Conversation
@adamsitnik, will take a look. Also, I suspect that @billwert will want to review as well, but he's OOF today.
This doesn't handle the files I generated recently...
Unhandled Exception: Newtonsoft.Json.JsonSerializationException: Error converting value {null} to type 'System.Int32'. Path 'HostEnvironmentInfo.PhysicalProcessorCount', line 8, position 35. ---> System.InvalidCastException: Null object cannot be converted to a value type.
at System.Convert.ChangeType(Object value, Type conversionType, IFormatProvider provider)
at Newtonsoft.Json.Serialization.JsonSerializerInternalReader.EnsureType(JsonReader reader, Object value, CultureInfo culture, JsonContract contract, Type targetType) in /_/Src/Newtonsoft.Json/Serialization/JsonSerializerInternalReader.cs:line 982
Not sure if this is because I was using preliminary arm64 bits or something else, but it would be nice to catch this sort of error and indicate which file is problematic.
Looks like my arm64 files have some null entries:

{
"Title":"BenchmarksGame.BinaryTrees_2",
"HostEnvironmentInfo":{
"BenchmarkDotNetCaption":"BenchmarkDotNet",
"BenchmarkDotNetVersion":"0.11.3.886-nightly",
"OsVersion":"ubuntu 16.04",
"ProcessorName":"Unknown processor",
"PhysicalProcessorCount":null,
"PhysicalCoreCount":null,
"LogicalCoreCount":null,
"RuntimeVersion":".NET Core 3.0.0-preview-27122-01 (CoreCLR 4.6.27121.03, CoreFX 4.7.18.57103)",
"Architecture":"64bit",
"HasAttachedDebugger":false, |
@AndyAyersMS I have updated the code, could you try it one more time?
public class CommandLineOptions
{
    [Option("base", HelpText = "Path to the folder/file with base results.")]
    public string BasePath { get; set; }
You should perform input validation here. For example,
private string _base;

public string BasePath
{
    get => _base;
    set
    {
        if (string.IsNullOrWhiteSpace(value))
            throw new ArgumentException("some message");
        if (!Directory.Exists(value))
            throw new DirectoryNotFoundException("some message");
        // maybe check that the directory has the right files?
        _base = value;
    }
}
[Option("diff", HelpText = "Path to the folder/file with diff results.")] | ||
public string DiffPath { get; set; } | ||
|
||
[Option("merged", HelpText = "Path to the folder/file with results merged for multiple jobs in the same file.")] |
Is this the output directory? If so, it should be stated in the help string.
I was not sure what to call it (naming..)
So when we run the Benchmarks with --runtimes netcoreapp2.1 netcoreapp2.2,
BDN is going to create one json file with the results for both 2.1 and 2.2 inside.
This option allows comparing the perf for such files (results for a few jobs are merged into one file).
@jorive I am open to better name suggestions ;p
the name is fine - maybe clearer help text explaining it is for single runs of BDN with many runtimes. "merged" suggests I did something to merge them beforehand, which isn't the case, right?
src/tools/ResultsComparer/Program.cs
Outdated
.Select(resultFile => JsonConvert.DeserializeObject<BdnResult>(File.ReadAllText(resultFile)))
.SelectMany(result => result.Benchmarks)
.GroupBy(result => result.FullName)
.SelectMany(sameKey => sameKey
nit: remove extra spaces.
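For context, a rough self-contained sketch of the kind of pipeline quoted above: reading every `*full.json` file from a folder and pairing base and diff benchmarks by FullName. The BdnResult/Benchmark/Statistics shapes are assumptions based on the property names that appear in this PR, not the actual model classes.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Linq;
using Newtonsoft.Json;

public class Statistics { public double Median { get; set; } }
public class Benchmark { public string FullName { get; set; } public Statistics Statistics { get; set; } }
public class BdnResult { public List<Benchmark> Benchmarks { get; set; } }

public static class ResultsLoader
{
    // Read every *full.json under the given folder and flatten all benchmarks into one sequence.
    public static IEnumerable<Benchmark> ReadResults(string path)
        => Directory.GetFiles(path, "*full.json", SearchOption.AllDirectories)
            .Select(resultFile => JsonConvert.DeserializeObject<BdnResult>(File.ReadAllText(resultFile)))
            .SelectMany(result => result.Benchmarks);

    // Pair base and diff results that share the same FullName so they can be compared.
    public static IEnumerable<(Benchmark baseResult, Benchmark diffResult)> Pair(string basePath, string diffPath)
        => ReadResults(basePath).Join(
            ReadResults(diffPath),
            b => b.FullName,
            d => d.FullName,
            (baseResult, diffResult) => (baseResult, diffResult));
}
```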
Thanks, it's working on my data now:
A couple of things I'd like to see:

Maybe we should have two more options:
src/tools/ResultsComparer/Program.cs
Outdated
if (noiseResult.Conclusion == EquivalenceTestConclusion.Same)
    continue;

var ratio = (1.0 - pair.diffResult.Statistics.Median / pair.baseResult.Statistics.Median);
why ratio instead of percentage delta? I think all of our other reporting tools typically report delta, not ratio.
[Option("diff", HelpText = "Path to the folder/file with diff results.")] | ||
public string DiffPath { get; set; } | ||
|
||
[Option("merged", HelpText = "Path to the folder/file with results merged for multiple jobs in the same file.")] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the name is fine - maybe clearer help text explaining it is for single runs of BDN with many runtimes. "merged" suggests I did something to merge them before hand which isn't the case, right?
@adamsitnik what's the intended use of this tool? Is this designed for devs to just quickly iterate on their own or is this intended to be part of the reporting infrastructure of the performance automation system?
One thing is Windows vs Ubuntu or x64 vs ARM64; the other is: I run the benchmarks before applying any changes and save the results somewhere, then I apply the changes, save the new results to another location, and compare the perf.
@AndyAyersMS would
ok, then I most probably need to introduce a table
@AndyAyersMS would sth like the following simple JSON be enough?

{
"Benchmarks":[
{
"FullName": "BenchmarksGame.BinaryTrees_2.RunBench",
"Base": [ 1, 2, 3, 4, 5],
"Diff": [ 1, 2, 3, 4, 5],
"Conclusion": "Same"
},
{
"FullName": "BenchmarksGame.BinaryTrees_5.RunBench",
"Base": [ 1, 2, 3, 4, 5],
"Diff": [ 2, 3, 4, 4, 5],
"Conclusion": "Slower"
} ]
}
Yes
Ideally something that I can easily import into Excel or R ... I would think CSV would be the simplest. It looks like Excel can import JSON, but it wasn't obvious to me how to get what I wanted.
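A hedged sketch of what a CSV export along these lines could look like; the column set, units and separator here are illustrative choices, not necessarily what the PR ended up shipping.

```csharp
using System.Collections.Generic;
using System.Globalization;
using System.IO;
using System.Linq;

public static class CsvExporter
{
    // One row per benchmark: name, both medians and the statistical test conclusion.
    // A semicolon separator is used because benchmark FullNames can contain commas.
    public static void Export(string path,
        IEnumerable<(string fullName, double baseMedian, double diffMedian, string conclusion)> rows)
    {
        var lines = new[] { "FullName;BaseMedian(ns);DiffMedian(ns);Conclusion" }
            .Concat(rows.Select(row => string.Join(";",
                row.fullName,
                row.baseMedian.ToString(CultureInfo.InvariantCulture),
                row.diffMedian.ToString(CultureInfo.InvariantCulture),
                row.conclusion)));

        File.WriteAllLines(path, lines);
    }
}
```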
@adamsitnik Should we add
@AndyAyersMS would sth like this be OK?
@adamsitnik sure, makes sense. I'm more curious about whether or not this is something we'd want to incorporate into our automation strategy in general (think reporting), because modularizing it would be nice in that case. We can cross this bridge later, however. One other question: why is this standalone and not a mode in Benchmark.NET?
@billwert personally, I think it's better to have tools that build on top of each other and that perform one task very well, instead of a monolithic tool that becomes too bloated and complex to maintain (the Linux way, some would say?). This way you could think of benchmark/runner/reporter.
Ok, I have added the export to a table and CSV. The table is GH markdown friendly and can be copy-pasted from the console to GH directly. To keep things simple I have also:

Sample results:
    return null;
}

private static double GetRatio(EquivalenceTestConclusion conclusion, Benchmark baseResult, Benchmark diffResult)
I think my comment got lost in the refactor - why are we reporting straight ratios instead of relative percentages? Typically I expect to see this calculation look something like (diff-base)/base (in the case where a higher number is better.) I'd also like to see slower (regressions) represented as negative deltas instead of reversing the calculation as is done here.
I think that the choice between base/diff, diff/base, and (diff-base)/base is very subjective. For example: your preferences are different than @AndyAyersMS's, which are also different than @stephentoub's ;)
What I have learned in BDN is that users want to customize everything, but it typically adds too much complexity to the code.
For example, here, to keep everyone happy I would need to introduce a new console argument, add docs for it, and handle all cases when sorting the results, formatting them, and aligning them in the table. I don't have time for it, but I would be happy to review a PR if somebody is willing to implement it.
I don't think it's that subjective. Using straight ratios leads to weird things sometimes - going from 10 to 8 by your method would show as 1.25, but I would think of it as a -20% regression from 10. Reversing the terms so that smaller ratios are "worse", always dividing by the base, results in 0.8, which is also a non-obvious way to present the data but closer to something that makes sense in terms of how a data point relates to the previous data point. It gets weird in another way when you invert and higher numbers are worse - imagine a working set measurement going from 1235 pages loaded to 2342 pages. Ratio would tell us it is 1.9, while it is a -47% regression.
For this very specific purpose it may not matter much, but I think there is value in a consistent method for reporting data like this which makes sense across contexts. Given that this tool is not likely to be used for things other than manual investigations we can let it lie, but I'd like to take this back up in a more global sense as we move forward.
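To put concrete numbers on the conventions being debated, a tiny sketch using the 10 → 8 example from the comment above (just the three formulas side by side, no claim about which one the tool should adopt):

```csharp
using System;

class RatioVsDelta
{
    static void Main()
    {
        // The 10 -> 8 example from the comment above, under the conventions being discussed.
        double baseMedian = 10, diffMedian = 8;

        Console.WriteLine(baseMedian / diffMedian);                // 1.25  (base/diff)
        Console.WriteLine(diffMedian / baseMedian);                // 0.8   (diff/base)
        Console.WriteLine((diffMedian - baseMedian) / baseMedian); // -0.2, i.e. a -20% delta
    }
}
```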
Sounds like I will have to polish up my newsletter explaining all the ways ratios are vastly superior to other comparative measures...
More importantly, though: I always prefer to see things reported as diff/base, whether as a percentage or ratio or whatnot, so a single column sort can order things and we can plot distributions without having to do extra math.
@billwert Different developers have different preferences when it comes down to how their data is represented. In the past developers have asked for base/diff, diff/base, (diff-base)/base, etc. We should just make sure that we are consistent and transparent w.r.t. the way output data is presented.
@AndyAyersMS I would like to read that newsletter. (Though I don't quite understand how ratios achieve sorting in ways that deltas do not.)
We should just make sure that we are consistent and transparent w.r.t. the way output data is presented.
Agreed. I'm trying to identify which way that should go. :)
@AndyAyersMS if you don't have any more feature requests could you please accept this PR? I need at least 1 approving review to merge it and all other perf folks are already enjoying their holidays ;)
Thanks!
Results Comparer
This simple tool allows for easy comparison of provided benchmark results.
It can be used to compare:
All you need to provide is:

- `--base` - path to folder/file with baseline results
- `--diff` - path to folder/file with diff results
- `--threshold` - threshold for the Statistical Test. Examples: 5%, 10ms, 100ns, 1s

Optional arguments:

- `--top` - filter the diff to the top/bottom N results

Sample: compare the results stored in `C:\results\windows` vs `C:\results\ubuntu` using a 1% threshold and print only the TOP 10.
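For example (the `dotnet run --` form and quoting are assumptions about how the tool is launched; the option names come from this README):

```
dotnet run -- --base "C:\results\windows" --diff "C:\results\ubuntu" --threshold 1% --top 10
```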
Note: the tool supports only `*full.json` results exported by BenchmarkDotNet. This exporter is enabled by default in this repository.

Note: if you have run your benchmarks for multiple jobs (eg. `-r netcoreapp2.1 netcoreapp2.2`) and you want to compare these historical results, you can use `--merge` to point to the folder/file with such results.

Reading the results
Sample results:
Explanation: the reported ratio is computed as (`1.0 - diffMedian / baseMedian`).