Trivial methods can cause BDN to run only 1 iteration, skip pilot phase, and ignore most attributes #1337

abelbraaksma · 2019-12-20T01:15:46Z

EDIT 1: see my second comment, I just found a muuuuch easier repro with FSharp that doesn't involve any inref/byref/outref at all, and success is dependent on dummy functions being present.

EDIT 2: in a follow-up issue I explained the likely cause of this being PR #760, that it incorrectly marks micro-tests as RunOncePerIteration.

EDIT 3: This issue is written from an F# standpoint, but the same issue can be exhibited with C# code (that wasn't obvious when I first wrote this).

I've tested the following code with almost any kind of job, different versions of BDN and differents sets of attributes. Changing, for instance, the line sprintf "%A" into sprintf "%s" runs correctly. I checked the IL, but couldn't find reasonable differences.

Copy/paste the following in a new .NET Framework F# project (version of .NET doesn't matter):

namespace BenchMarkBug

open BenchmarkDotNet.Attributes
open BenchmarkDotNet.Jobs
open System.Runtime.InteropServices

[<IterationTime(120.)>]
[<SimpleJob()>]
type BenchmarkPatterns() =
    [<DefaultValue>]
    val mutable testableFunction: (unit -> bool)

    let createOnceAndRun (value: (unit -> bool) outref) =
        if obj.ReferenceEquals(value, null) then
            let foo = sprintf "test %A" "test"

            value <- fun() -> System.Math.Pow(10., 3.) = 1000.

        value()

    [<Benchmark(Baseline = true)>]
    member this.old_Star() = 
        createOnceAndRun &this.testableFunction


module Main =

    [<EntryPoint>]
    let main _ =
        BenchmarkDotNet.Running.BenchmarkRunner.Run<BenchmarkPatterns>()
        |> ignore

        0 // return success

The above code may seem contrived, but I came to it by attempting to get a minimal working example of the observed behavior. Depending on the chosen attributes, it will show erroneous behavior or succeed. When it fails, adding KeepBenchmarkFiles has no effect (they are not kept, or at least I cannot find them), making it harder to diagnose.

I've made the following observations that seem to be of influence:

use of byref, out, outref arguments in the function return value (the example calls the mutable function, but it also shows this behavior if you just return that value, i.e. remove () in value()).
use of a mutable field
F# (I couldn't trivially find a way to get the same problem with C#)
a non-trivial body (in this case, just replace "%A" with "%s" will change the overall behavior entirely)
More-than-one attribute on the class, though some combinations work, some don't. The example doesn't work, but if you remove IterationTime, it works
Versions of .NET tested: 4.6.1 - 4.8
Versions of BDN tested: 0.11.3 - 0.12 (bug exists in all)

A workaround is: assign these values inside a Setup or constructor member. Though you still have to be careful, I came across the "1 iteration per cycle" issue multiple times.

Example of expected output:

// BeforeAnythingElse
// Benchmark Process Environment Information:
// Runtime=.NET Framework 4.7.2 (CLR 4.0.30319.42000), 32bit LegacyJIT-v4.8.3928.0
// GC=Concurrent Workstation
// Job: Job-KABBQW(IterationTime=120.0000 ms)
OverheadJitting  1: 1 op, 260845.09 ns, 260.8451 us/op
WorkloadJitting  1: 1 op, 38071372.39 ns, 38.0714 ms/op
WorkloadPilot    1: 4 op, 4003.53 ns, 1.0009 us/op
WorkloadPilot    2: 119894 op, 7966399.95 ns, 66.4454 ns/op
WorkloadPilot    3: 1805995 op, 122795828.20 ns, 67.9934 ns/op
WorkloadPilot    4: 1764876 op, 120783594.66 ns, 68.4374 ns/op
WorkloadPilot    5: 1753426 op, 118513595.66 ns, 67.5897 ns/op
WorkloadPilot    6: 1775418 op, 121505769.08 ns, 68.4378 ns/op
WorkloadWarmup   1: 1775418 op, 120649014.61 ns, 67.9553 ns/op
WorkloadWarmup   2: 1775418 op, 121769693.80 ns, 68.5865 ns/op
WorkloadWarmup   3: 1775418 op, 120977919.63 ns, 68.1405 ns/op
WorkloadWarmup   4: 1775418 op, 120531988.48 ns, 67.8894 ns/op
WorkloadWarmup   5: 1775418 op, 120127940.36 ns, 67.6618 ns/op
WorkloadWarmup   6: 1775418 op, 120078050.27 ns, 67.6337 ns/op
WorkloadWarmup   7: 1775418 op, 121425390.61 ns, 68.3926 ns/op
WorkloadWarmup   8: 1775418 op, 120180294.15 ns, 67.6913 ns/op
// BeforeActualRun
WorkloadActual   1: 1775418 op, 120329964.42 ns, 67.7756 ns/op
WorkloadActual   2: 1775418 op, 120007526.63 ns, 67.5940 ns/op
WorkloadActual   3: 1775418 op, 119956712.65 ns, 67.5653 ns/op
WorkloadActual   4: 1775418 op, 120961905.53 ns, 68.1315 ns/op
WorkloadActual   5: 1775418 op, 121154690.68 ns, 68.2401 ns/op
WorkloadActual   6: 1775418 op, 119959176.36 ns, 67.5667 ns/op
WorkloadActual   7: 1775418 op, 120334891.83 ns, 67.7783 ns/op
WorkloadActual   8: 1775418 op, 121440788.78 ns, 68.4012 ns/op
WorkloadActual   9: 1775418 op, 120174134.88 ns, 67.6878 ns/op
WorkloadActual  10: 1775418 op, 120092216.59 ns, 67.6417 ns/op
... etc

Example of unexpected output

As can be observed below, the WorkloadPilot phase is skipped, and it consistently stays at 1 op.

// BeforeAnythingElse
// Benchmark Process Environment Information:
// Runtime=.NET Framework 4.7.2 (CLR 4.0.30319.42000), 32bit LegacyJIT-v4.8.3928.0
// GC=Concurrent Workstation
// Job: Job-KABBQW(IterationTime=120.0000 ms)
OverheadJitting  1: 1 op, 303652.02 ns, 303.6520 us/op
WorkloadJitting  1: 1 op, 130028967.05 ns, 130.0290 ms/op
WorkloadWarmup   1: 1 op, 4311.49 ns, 4.3115 us/op
WorkloadWarmup   2: 1 op, 923.89 ns, 923.8905 ns/op
WorkloadWarmup   3: 1 op, 923.89 ns, 923.8905 ns/op
WorkloadWarmup   4: 1 op, 615.93 ns, 615.9270 ns/op
WorkloadWarmup   5: 1 op, 923.89 ns, 923.8905 ns/op
WorkloadWarmup   6: 1 op, 615.93 ns, 615.9270 ns/op
// BeforeActualRun
WorkloadActual   1: 1 op, 615.93 ns, 615.9270 ns/op
WorkloadActual   2: 1 op, 923.89 ns, 923.8905 ns/op
WorkloadActual   3: 1 op, 615.93 ns, 615.9270 ns/op
WorkloadActual   4: 1 op, 923.89 ns, 923.8905 ns/op
WorkloadActual   5: 1 op, 615.93 ns, 615.9270 ns/op
WorkloadActual   6: 1 op, 923.89 ns, 923.8905 ns/op
WorkloadActual   7: 1 op, 923.89 ns, 923.8905 ns/op
WorkloadActual   8: 1 op, 615.93 ns, 615.9270 ns/op
WorkloadActual   9: 1 op, 923.89 ns, 923.8905 ns/op
WorkloadActual  10: 1 op, 923.89 ns, 923.8905 ns/op
WorkloadActual  11: 1 op, 615.93 ns, 615.9270 ns/op
WorkloadActual  12: 1 op, 615.93 ns, 615.9270 ns/op
WorkloadActual  13: 1 op, 923.89 ns, 923.8905 ns/op
WorkloadActual  14: 1 op, 923.89 ns, 923.8905 ns/op
WorkloadActual  15: 1 op, 615.93 ns, 615.9270 ns/op
WorkloadActual  16: 1 op, 1847.78 ns, 1.8478 us/op
WorkloadActual  17: 1 op, 923.89 ns, 923.8905 ns/op
WorkloadActual  18: 1 op, 615.93 ns, 615.9270 ns/op
WorkloadActual  19: 1 op, 923.89 ns, 923.8905 ns/op
WorkloadActual  20: 1 op, 615.93 ns, 615.9270 ns/op
WorkloadActual  21: 1 op, 923.89 ns, 923.8905 ns/op
WorkloadActual  22: 1 op, 923.89 ns, 923.8905 ns/op
WorkloadActual  23: 1 op, 615.93 ns, 615.9270 ns/op
WorkloadActual  24: 1 op, 615.93 ns, 615.9270 ns/op
WorkloadActual  25: 1 op, 615.93 ns, 615.9270 ns/op
WorkloadActual  26: 1 op, 615.93 ns, 615.9270 ns/op
... etc. That is: consistently "1 op"...

I'm currently at my limit of how to further investigate this. The code, when compiled and run by hand and not through BDN works correctly. There's no unsafe involved. I searched the internet and the issues, but had trouble using any reasonable keywords, apologies if this is a known issue.

The text was updated successfully, but these errors were encountered:

abelbraaksma · 2019-12-20T01:32:48Z

Just managed to capture the generated executables by simply breaking the the BDN process with Ctrl-C. I decompiled it using Reflector, dumped the output and compared it byte-for-byte: they are the same.

My initial thought was that it had something to do with the autogenerated code, but it looks now as if the invocation engine has a problem here. Let me know if I can provide more/better information, I'd love to get to the bottom of this as it is a real showstopper at the moment (simply put: in my main project I cannot reliably run BDN as I keep hitting this issue).

abelbraaksma · 2019-12-20T02:35:08Z

Here's a much simpler example of how this can go mind-numbingly wrong:

namespace BenchMarkBug

open BenchmarkDotNet.Attributes
open BenchmarkDotNet.Running

module Test =
    let createOnceAndRun3 =
        let foo = sprintf "test %A" "test"
        fun() -> true

[<IterationTime(120.)>]
[<SimpleJob()>]
type BenchmarkPatterns() =
    let foo =
        let foo = sprintf ""    // remove this line, or remove the function 'foo' to make it work, or fail
        ()

    [<Benchmark>]
    static member old_Star() = 
        Test.createOnceAndRun3()
        |> ignore


module Main =

    [<EntryPoint>]
    let main _ =
        BenchmarkRunner.Run<BenchmarkPatterns>()
        |> ignore

        0 // return success

Observations:

If you remove the, otherwise unused let-binding foo, BDN will fail
If you replace sprintf "" with, say 10 + 32, or basically anything that does not include sprintf, it will fail
Remove foo and change static member to member, still fails (static vs instance has no influence)
There are many other variants that where adding a dummy make it fail, or succeed

This suggests it has something to do with how BDN initializes assemblies, and may skip over the initialization that is required by first module access by F# projects. But I admit, I'm still very much guessing. It is puzzling.

… magic number based on observations of the reported bug) we call it one more time to see if Jitting itself has not dominated the first invocation if it did, it shoud NOT be a single invocation engine (fixes #837, fixes #1337 and fixes #1338)

adamsitnik · 2020-10-24T15:52:14Z

Please excuse me for the delay. I've sent a fix: #1573

…n lasts longer than IterationTime (#1573) * add the test * if the Jitting took more than IterationTime but still less than 1s (a magic number based on observations of the reported bug) we call it one more time to see if Jitting itself has not dominated the first invocation if it did, it shoud NOT be a single invocation engine (fixes #837, fixes #1337 and fixes #1338)

abelbraaksma changed the title ~~Using in/out/by ref depending on body of method causes BDN to run only 1 iteration, skips pilot phase, ignores most attributes~~ Trivial F# functions can cause BDN to run only 1 iteration, skip pilot phase, and ignore most attributes Dec 20, 2019

abelbraaksma mentioned this issue Dec 20, 2019

Issue with "run one iteration for long operations", it's flaky and considers many nano-operations as "long", leading to one op per cycle #1338

Closed

abelbraaksma changed the title ~~Trivial F# functions can cause BDN to run only 1 iteration, skip pilot phase, and ignore most attributes~~ Trivial methods can cause BDN to run only 1 iteration, skip pilot phase, and ignore most attributes Dec 20, 2019

abelbraaksma mentioned this issue Oct 2, 2020

Sometimes Hardware Counters per Op are reported as NaNs #1546

Closed

adamsitnik mentioned this issue Oct 24, 2020

Don't run the benchmark once per iteration if only the first ivocation lasts longer than IterationTime #1573

Merged

AndreyAkinshin closed this as completed in #1573 Oct 26, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Trivial methods can cause BDN to run only 1 iteration, skip pilot phase, and ignore most attributes #1337

Trivial methods can cause BDN to run only 1 iteration, skip pilot phase, and ignore most attributes #1337

abelbraaksma commented Dec 20, 2019 •

edited

Loading

abelbraaksma commented Dec 20, 2019

abelbraaksma commented Dec 20, 2019

adamsitnik commented Oct 24, 2020

Trivial methods can cause BDN to run only 1 iteration, skip pilot phase, and ignore most attributes #1337

Trivial methods can cause BDN to run only 1 iteration, skip pilot phase, and ignore most attributes #1337

Comments

abelbraaksma commented Dec 20, 2019 • edited Loading

Example of expected output:

Example of unexpected output

abelbraaksma commented Dec 20, 2019

abelbraaksma commented Dec 20, 2019

adamsitnik commented Oct 24, 2020

abelbraaksma commented Dec 20, 2019 •

edited

Loading