Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Wildly different results from simply swapping the order of benchmarks #2004

Open
space-alien opened this issue May 11, 2022 · 6 comments
Open

Comments

@space-alien
Copy link

Faced with seemingly inconsistent and inscrutable benchmark results, I followed a hunch and swapped the order of appearance of my two benchmark methods.

The following results are consistently repeatable.

A then B:

|           Method |     Mean |     Error |    StdDev | Ratio |
|----------------- |---------:|----------:|----------:|------:|
|          MethodA | 1.541 ns | 0.0033 ns | 0.0029 ns |  1.00 |
|          MethodB | 3.451 ns | 0.0079 ns | 0.0074 ns |  2.24 |

Swap the order:

|           Method |      Mean |     Error |    StdDev | Ratio | RatioSD |
|----------------- |----------:|----------:|----------:|------:|--------:|
|          MethodB | 4.9731 ns | 0.0951 ns | 0.1057 ns | 21.52 |    0.64 |
|          MethodA | 0.2302 ns | 0.0024 ns | 0.0020 ns |  1.00 |    0.00 |

Nothing has changed between these two runs, aside from swapping the order of the two benchmark methods on the benchmark class.

How should I diagnose what is going on here?

Environment info:
BenchmarkDotNet=v0.13.1
AMD Ryzen 9 5900HX with Radeon Graphics, 1 CPU, 16 logical and 8 physical cores
.NET SDK=6.0.202
  [Host]     : .NET 6.0.4 (6.0.422.16404), X64 RyuJIT
  Job-MPHSCA : .NET 6.0.4 (6.0.422.16404), X64 RyuJIT

Runtime=.NET 6.0

I'm running Windows 11, but the output reports Windows 10, so I have stripped that.

@AndreyAkinshin
Copy link
Member

@space-alien thanks for the report. Could you please share the source code of these benchmarks?

@space-alien
Copy link
Author

Apologies for the delay; I have been investigating further with a view to producing a minimal repro. However, I think I have identified the issue.

My benchmark-holding class has a static member, whose type has a static constructor that performs some non-trivial initialization.

The type's static constructor is being invoked during the benchmarking runs, skewing the results.

This can be solved by taking control of when the static member is initialized, e.g. by adding the following to the benchmark-holding class:

[GlobalSetup]
public void Warmup()
{
    Type myClass = typeof(ClassWithSlowStaticConstructor);
    System.Runtime.CompilerServices.RuntimeHelpers.RunClassConstructor(myClass.TypeHandle);
}

In my case, using a static member on my benchmark class was unnecessary. Not sure whether this kind of scenario might warrant a warning in the docs?

@timcassell
Copy link
Collaborator

Static constructors are only ran once. So only the first warm-up iteration should measure the impact and subsequent iterations should not. Are you measuring a cold start? A regular run should not include static constructors in the results.

@space-alien
Copy link
Author

space-alien commented May 24, 2022

That's what I would have expected, and was confusing me too.

I fiddled around a bit more and I now have a repro, although I may have inadvertently stumbled into a different issue...

A slow static constructor seems to break the automatic selection of number of iterations in the pilot phase. In the code below, a microbenchmark only gets benchmarked over a single op, and the results are of course useless. A delay of >10s seems to be the point at which the pilot phase switches to just a single op for the benchmark runs.

Perhaps the pilot phase could handle the possibility of a slow static constructor when determining the optimal number of iterations to run. For example, by ignoring the time taken by the first invocation, or detecting if subsequent invocations take vastly less time, and increasing the number of ops.

Now... while this is interesting, I'm pretty sure this is NOT the cause of the issue I originally experienced - I would have noticed if my method was only being invoked once, and the static constructor in my original code certainly did not take so long to run. I will continue to investigate.

using BenchmarkDotNet.Attributes;

namespace BenchmarkDotNetRepro2;

[DisassemblyDiagnoser]
public class MicroBenchmark
{
    // Instance member:
    ulong _instanceCount;

    // Static member:
    //   Initialized when first accessed, which occurs before the start of the second benchmark: StaticMember()
    //   The initializer for the static field `_unusedStaticField` gets called at the same time.
    //   Which, in turn, invokes the SlowWarmupClass static constructor.
    static ulong _staticCount;

    // If this field is not static, it is instantiated along with the MicroBenchmark class,
    // and the slow static constructor will not affect benchmarking.
    static readonly SlowWarmupClass _unusedStaticField = new();

    [Benchmark]
    public ulong InstanceMember()
    {
        _instanceCount++;
        return _instanceCount;
    }

    [Benchmark]
    public ulong StaticMember()
    {
        _staticCount++;
        return _staticCount;
    }
}

public class SlowWarmupClass
{
    static SlowWarmupClass()
    {
        Console.WriteLine(" <<<<   SlowWarmup static constructor   >>>> ");
        Thread.Sleep(11000); // Simulate non-trivial initialization
    }
}

@space-alien
Copy link
Author

PS: I updated to 0.13.1.1796 on the BDN nightly feed, and the threshold above which we only get one op appears to have increased to 10s, although the issue still remains. I've updated the repro above (and the following comment above):

Perhaps the pilot phase could handle the possibility of a slow static constructor when determining the optimal number of iterations to run. For example, by ignoring the time taken by the first invocation, or detecting if subsequent invocations take vastly less time, and increasing the number of ops.

@timcassell
Copy link
Collaborator

Seems related to #1210

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants