Op/Invocation count not scaling up #994

Turnerj · 2019-01-01T12:53:05Z

BDN Version: 0.11.3

For a web crawler library I am working on, I am benchmarking my crawler as it crawls a site. The site is a super simple ASP.NET Core site running in-memory via TestServer (Microsoft.AspNetCore.TestHost package).

For some reason, when I do requests requests via a dummy HttpClient that the TestServer makes, the ops/iteration count no longer scales in BDN. To make sure I didn't have something else misconfigured, I tried benchmarking Thread.Sleep(2) and await Task.Delay(2) and they definitely scale up the operations.

I even tried a non-dummy HttpClient to a real website of mine and that also scaled up the operations.

While TestServer obviously has nothing to do with BDN, it seems strange for this to happen - like it somehow interferes with BDN and the invocation count even though the operations are super fast.

I have tried explicitly setting the "invocationCount" when adding the SimpleJob attribute to the class and that does make the "ops" but then ends up doing 100 "jobs" (I don't know the terminology) of 128 ops.

WorkloadActual  98: 128 op, 1329158190.95 ns, 10.3840 ms/op
WorkloadActual  99: 128 op, 1279036729.27 ns, 9.9925 ms/op
WorkloadActual  100: 128 op, 1206203177.80 ns, 9.4235 ms/op

I assume I can probably keep tweaking settings so it doesn't do that but I want to understand why I even need to set these settings when I haven't needed to for other times I have used BDN.

Here is the actual benchmark:

[CoreJob, ClrJob(baseline: true)]
[MemoryDiagnoser]
public class BasicSiteCrawlBenchmark
{
	private TestSiteManager TestSite { get; }
	private Crawler Crawler { get; }
	private Uri Uri { get; } = new Uri("http://localhost/");

	public BasicSiteCrawlBenchmark()
	{
		TestSite = new TestSiteManager(new SiteContext
		{
			SiteFolder = "BasicSite"
		});

		var client = TestSite.GetHttpClient();
		Crawler = new Crawler(client, new ParallelAsyncTaskHandler());
	}

	[Benchmark]
	public async Task CrawlSite()
	{
		var result = await Crawler.Crawl(Uri, new CrawlSettings
		{
			TaskHandlerOptions = new TaskHandlerOptions
			{
				MaxNumberOfSimultaneousTasks = 5,
				DelayBetweenTaskStart = new TimeSpan(),
				DelayJitter = new TimeSpan(),
				TimeoutBeforeThrottle = new TimeSpan()
			}
		});
	}
}

If you want to run it yourself, you can find the full code on this branch: https://github.com/TurnerSoftware/InfinityCrawler/tree/benchmarking

The benchmark will automatically setup the site and have it running etc - no other work required than just running it. Just for clarity, the same site the benchmarking uses is what my CI tests use and it works fine.

Here is a snippet from the log (Core or CLR, it happens in both):

// Benchmark Process Environment Information:
// Runtime=.NET Framework 4.7.2 (CLR 4.0.30319.42000), 64bit RyuJIT-v4.7.3260.0
// GC=Concurrent Workstation
// Job: Clr(Runtime=Clr)

OverheadJitting  1: 1 op, 339752.95 ns, 339.7530 us/op
WorkloadJitting  1: 1 op, 1096102283.02 ns, 1.0961 s/op

WorkloadWarmup   1: 1 op, 3551998.60 ns, 3.5520 ms/op
WorkloadWarmup   2: 1 op, 3171554.30 ns, 3.1716 ms/op
WorkloadWarmup   3: 1 op, 3021035.84 ns, 3.0210 ms/op
WorkloadWarmup   4: 1 op, 3090566.68 ns, 3.0906 ms/op
WorkloadWarmup   5: 1 op, 2730270.53 ns, 2.7303 ms/op
WorkloadWarmup   6: 1 op, 2529579.25 ns, 2.5296 ms/op
WorkloadWarmup   7: 1 op, 2726714.97 ns, 2.7267 ms/op
WorkloadWarmup   8: 1 op, 2274369.47 ns, 2.2744 ms/op

// BeforeActualRun
WorkloadActual   1: 1 op, 2286221.32 ns, 2.2862 ms/op
WorkloadActual   2: 1 op, 2409085.47 ns, 2.4091 ms/op
WorkloadActual   3: 1 op, 2428838.55 ns, 2.4288 ms/op
WorkloadActual   4: 1 op, 2525233.57 ns, 2.5252 ms/op
...

WorkloadActual  97: 1 op, 2940443.28 ns, 2.9404 ms/op
WorkloadActual  98: 1 op, 2491653.34 ns, 2.4917 ms/op
WorkloadActual  99: 1 op, 2121480.64 ns, 2.1215 ms/op
WorkloadActual  100: 1 op, 3007208.69 ns, 3.0072 ms/op

I know this is more along the lines of either debugging my code or another third-party library than BDN but I don't get why the ops aren't going up automatically. How exactly are the invocation counts determined? Could an invoked method interfere with this besides taking a very long time to load (which my benchmarks aren't as you can see above)? Is it some threading/locking condition that BDN may have issues with so doesn't exceed 1 op?

I guess one other important question: Does it matter that the invocation count isn't scaling when benchmarking? (Cause I have no idea)

Any clarity you can provide would be much appreciated!

The text was updated successfully, but these errors were encountered:

Turnerj · 2019-01-01T20:57:02Z

Just to add, you can simplify my benchmark example above by just directly performing a HTTP request on the client from the test server like so:

[Benchmark]
public async Task ExampleBenchmark()
{
	await TestSite.GetHttpClient().GetAsync(new Uri("http://localhost/"));
}

Turnerj · 2019-01-02T07:31:22Z

I've started to investigate this issue in a clean solution and am finding more strange quirks related to the above. I don't believe the issue is library specific anymore but maybe in some strange way the code is getting compiled and analysed by BDN?

In a MVCE, I have been able to reliably create this issue for CLR:

using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;
using Microsoft.AspNetCore.Builder;
using Microsoft.AspNetCore.Hosting;
using Microsoft.AspNetCore.TestHost;
using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.FileProviders;
using System;
using System.IO;
using System.Net.Http;
using System.Threading.Tasks;

namespace TestServerBenchmarkTests
{
	class Program
	{
		static void Main(string[] args)
		{
			BenchmarkRunner.Run<TestBenchmark>();
		}
	}

	[CoreJob, ClrJob(baseline:true)]
	[MemoryDiagnoser]
	public class TestBenchmark
	{
		private TestServer Server { get; }
		private HttpClient Client { get; }

		public TestBenchmark()
		{
			var builder = new WebHostBuilder()
				.UseStartup<EmptyStartup>();
			Server = new TestServer(builder);
			Client = Server.CreateClient();
		}

		//[Benchmark]
		//public void TestServerAndCreateClient()
		//{
		//	var builder = new WebHostBuilder()
		//		.UseStartup<EmptyStartup>();
		//	var testServer = new TestServer(builder);

		//	var client = testServer.CreateClient();
		//}

		[Benchmark]
		public async Task CreatedOnceTestServer()
		{
			await Client.GetAsync("http://localhost");
		}
	}

	class EmptyStartup
	{
		public void ConfigureServices(IServiceCollection services)
		{
			services.AddMvcCore();
		}

		public void Configure(IApplicationBuilder app)
		{
			app.UseStaticFiles(new StaticFileOptions
			{
				FileProvider = new PhysicalFileProvider(Directory.GetCurrentDirectory())
			});

			app.UseMvcWithDefaultRoute();
		}
	}
}

<Project Sdk="Microsoft.NET.Sdk">
  <PropertyGroup>
    <OutputType>Exe</OutputType>
    <TargetFrameworks>netcoreapp2.1;net461</TargetFrameworks>
  </PropertyGroup>
  <ItemGroup>
    <PackageReference Include="BenchmarkDotNet" Version="0.11.3" />
    <PackageReference Include="Microsoft.AspNetCore.Mvc.Core" Version="2.2.0" />
    <PackageReference Include="Microsoft.AspNetCore.StaticFiles" Version="2.2.0" />
    <PackageReference Include="Microsoft.AspNetCore.TestHost" Version="2.2.0" />
  </ItemGroup>
</Project>

With the benchmark "TestServerAndCreateClient" commented out, that second benchmark "CreatedOnceTestServer" will create the issue I described above about not scaling the ops/invocation count automatically. It gets strange because if you uncomment out "TestServerAndCreateClient", the issue disappears!

The two benchmarks do not touch the same instances of the classes yet somehow are impacting each other.

Turnerj · 2019-01-02T07:42:59Z

An even more simple difference than the two benchmarks, just commenting out app.UseMvcWithDefaultRoute(); makes the CLR benchmark behave correctly. I'm going to look into what that code is actually calling internally as I still don't get how that affects BDN the way it is.

ycrumeyrolle · 2019-01-02T21:23:01Z

I have the same issue.
Tried with the configuration Add(Job.Core.WithMinInvokeCount(X)) (and also with .WithMinIterationTime() but without success.

Also tried with [Benchmark(OperationsPerInvoke = X)]. this allow to have more than one iteration, but it is not dynamic.

adamsitnik · 2019-01-02T21:45:19Z

@Turnerj @ycrumeyrolle you are most probably experiencing #837

A workaround would be to call the benchmark once from the [GlobalSetup] method (idea from @exyi)

[GlobalSetup]
public async Task Setup()
{
    TestSite = new TestSiteManager(new SiteContext
    {
        SiteFolder = "BasicSite"
    });

    var client = TestSite.GetHttpClient();
    Crawler = new Crawler(client, new ParallelAsyncTaskHandler());
    
    await CrawlSite(); // benchmark warmup as a workaround for https://github.com/dotnet/BenchmarkDotNet/issues/837
}

Turnerj · 2019-01-03T12:03:23Z

@adamsitnik , yep, that seems to be exactly the problem! That issue you linked to is an interesting read about how it happens with workload jitting etc taking a long time.

I've applied the fix you mentioned though it will be cool to see the change that fixes the core problem land in a later version of BDN.

Thanks!

ycrumeyrolle · 2019-01-03T12:23:29Z

@adamsitnik, same result than @Turnerj.

Before:

// **************************
// Benchmark: JweBenchmark.ValidateJwt: Core(Runtime=Core, Server=True, Toolchain=.NET Core 2.0) [token=JWE-big]
// *** Execute ***
// Launch: 1 / 1
// BeforeAnythingElse

// Benchmark Process Environment Information:
// Runtime=.NET Core 2.2.0 (CoreCLR 4.6.27110.04, CoreFX 4.6.27110.04), 64bit RyuJIT
// GC=Concurrent Server
// Job: Core(Runtime=Core, Server=True)

OverheadJitting  1: 1 op, 345412.51 ns, 345.4125 us/op
WorkloadJitting  1: 1 op, 469646530.50 ns, 469.6465 ms/op

WorkloadWarmup   1: 1 op, 5173292.47 ns, 5.1733 ms/op
WorkloadWarmup   2: 1 op, 4750507.56 ns, 4.7505 ms/op
WorkloadWarmup   3: 1 op, 3906911.53 ns, 3.9069 ms/op
WorkloadWarmup   4: 1 op, 4123238.45 ns, 4.1232 ms/op
WorkloadWarmup   5: 1 op, 4906436.63 ns, 4.9064 ms/op
WorkloadWarmup   6: 1 op, 4354171.38 ns, 4.3542 ms/op
WorkloadWarmup   7: 1 op, 5528573.90 ns, 5.5286 ms/op
WorkloadWarmup   8: 1 op, 3924280.84 ns, 3.9243 ms/op

// BeforeActualRun
WorkloadActual   1: 1 op, 4960123.61 ns, 4.9601 ms/op
WorkloadActual   2: 1 op, 3818091.17 ns, 3.8181 ms/op
WorkloadActual   3: 1 op, 4576814.41 ns, 4.5768 ms/op
WorkloadActual   4: 1 op, 4591025.67 ns, 4.5910 ms/op
WorkloadActual   5: 1 op, 3921912.30 ns, 3.9219 ms/op
...

After:

// **************************
// Benchmark: JweBenchmark.ValidateJwt: Core(Runtime=Core, Server=True, Toolchain=.NET Core 2.0) [token=JWE-big]
// *** Execute ***
// Launch: 1 / 1
// BeforeAnythingElse

// Benchmark Process Environment Information:
// Runtime=.NET Core 2.2.0 (CoreCLR 4.6.27110.04, CoreFX 4.6.27110.04), 64bit RyuJIT
// GC=Concurrent Server
// Job: Core(Runtime=Core, Server=True)

OverheadJitting  1: 1 op, 380545.89 ns, 380.5459 us/op
WorkloadJitting  1: 1 op, 36170808.26 ns, 36.1708 ms/op

WorkloadPilot    1: 2 op, 12390637.47 ns, 6.1953 ms/op
WorkloadPilot    2: 3 op, 13793604.38 ns, 4.5979 ms/op
WorkloadPilot    3: 4 op, 21065425.87 ns, 5.2664 ms/op
WorkloadPilot    4: 5 op, 23668059.76 ns, 4.7336 ms/op
WorkloadPilot    5: 6 op, 32745500.46 ns, 5.4576 ms/op
...

Thanks!

smoogipoo · 2020-01-12T15:51:01Z

Hey all, with the advent of tiered compilation, it seems like the workaround to this doesn't work anymore. Is there any news on a more solid fix?

Turnerj closed this as completed Jan 3, 2019

smoogipoo mentioned this issue Jan 10, 2020

Improve performance/allocations of WeakList ppy/osu-framework#3167

Merged

ycrumeyrolle mentioned this issue Jan 14, 2020

Op/invocation fail to increase due to [IterationCleanup] #1351

Closed

kevingosse mentioned this issue Sep 9, 2020

Make benchmark results more stable DataDog/dd-trace-dotnet#897

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Op/Invocation count not scaling up #994

Op/Invocation count not scaling up #994

Turnerj commented Jan 1, 2019 •

edited

Loading

Turnerj commented Jan 1, 2019 •

edited

Loading

Turnerj commented Jan 2, 2019 •

edited

Loading

Turnerj commented Jan 2, 2019

ycrumeyrolle commented Jan 2, 2019

adamsitnik commented Jan 2, 2019

Turnerj commented Jan 3, 2019

ycrumeyrolle commented Jan 3, 2019

smoogipoo commented Jan 12, 2020

Op/Invocation count not scaling up #994

Op/Invocation count not scaling up #994

Comments

Turnerj commented Jan 1, 2019 • edited Loading

Turnerj commented Jan 1, 2019 • edited Loading

Turnerj commented Jan 2, 2019 • edited Loading

Turnerj commented Jan 2, 2019

ycrumeyrolle commented Jan 2, 2019

adamsitnik commented Jan 2, 2019

Turnerj commented Jan 3, 2019

ycrumeyrolle commented Jan 3, 2019

smoogipoo commented Jan 12, 2020

Turnerj commented Jan 1, 2019 •

edited

Loading

Turnerj commented Jan 1, 2019 •

edited

Loading

Turnerj commented Jan 2, 2019 •

edited

Loading