Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Op/Invocation count not scaling up #994

Closed
Turnerj opened this issue Jan 1, 2019 · 8 comments
Closed

Op/Invocation count not scaling up #994

Turnerj opened this issue Jan 1, 2019 · 8 comments

Comments

@Turnerj
Copy link
Contributor

Turnerj commented Jan 1, 2019

BDN Version: 0.11.3

For a web crawler library I am working on, I am benchmarking my crawler as it crawls a site. The site is a super simple ASP.NET Core site running in-memory via TestServer (Microsoft.AspNetCore.TestHost package).

For some reason, when I do requests requests via a dummy HttpClient that the TestServer makes, the ops/iteration count no longer scales in BDN. To make sure I didn't have something else misconfigured, I tried benchmarking Thread.Sleep(2) and await Task.Delay(2) and they definitely scale up the operations.

I even tried a non-dummy HttpClient to a real website of mine and that also scaled up the operations.

While TestServer obviously has nothing to do with BDN, it seems strange for this to happen - like it somehow interferes with BDN and the invocation count even though the operations are super fast.

I have tried explicitly setting the "invocationCount" when adding the SimpleJob attribute to the class and that does make the "ops" but then ends up doing 100 "jobs" (I don't know the terminology) of 128 ops.

WorkloadActual  98: 128 op, 1329158190.95 ns, 10.3840 ms/op
WorkloadActual  99: 128 op, 1279036729.27 ns, 9.9925 ms/op
WorkloadActual  100: 128 op, 1206203177.80 ns, 9.4235 ms/op

I assume I can probably keep tweaking settings so it doesn't do that but I want to understand why I even need to set these settings when I haven't needed to for other times I have used BDN.

Here is the actual benchmark:

[CoreJob, ClrJob(baseline: true)]
[MemoryDiagnoser]
public class BasicSiteCrawlBenchmark
{
	private TestSiteManager TestSite { get; }
	private Crawler Crawler { get; }
	private Uri Uri { get; } = new Uri("http://localhost/");

	public BasicSiteCrawlBenchmark()
	{
		TestSite = new TestSiteManager(new SiteContext
		{
			SiteFolder = "BasicSite"
		});

		var client = TestSite.GetHttpClient();
		Crawler = new Crawler(client, new ParallelAsyncTaskHandler());
	}

	[Benchmark]
	public async Task CrawlSite()
	{
		var result = await Crawler.Crawl(Uri, new CrawlSettings
		{
			TaskHandlerOptions = new TaskHandlerOptions
			{
				MaxNumberOfSimultaneousTasks = 5,
				DelayBetweenTaskStart = new TimeSpan(),
				DelayJitter = new TimeSpan(),
				TimeoutBeforeThrottle = new TimeSpan()
			}
		});
	}
}

If you want to run it yourself, you can find the full code on this branch: https://github.com/TurnerSoftware/InfinityCrawler/tree/benchmarking

The benchmark will automatically setup the site and have it running etc - no other work required than just running it. Just for clarity, the same site the benchmarking uses is what my CI tests use and it works fine.

Here is a snippet from the log (Core or CLR, it happens in both):

// Benchmark Process Environment Information:
// Runtime=.NET Framework 4.7.2 (CLR 4.0.30319.42000), 64bit RyuJIT-v4.7.3260.0
// GC=Concurrent Workstation
// Job: Clr(Runtime=Clr)

OverheadJitting  1: 1 op, 339752.95 ns, 339.7530 us/op
WorkloadJitting  1: 1 op, 1096102283.02 ns, 1.0961 s/op

WorkloadWarmup   1: 1 op, 3551998.60 ns, 3.5520 ms/op
WorkloadWarmup   2: 1 op, 3171554.30 ns, 3.1716 ms/op
WorkloadWarmup   3: 1 op, 3021035.84 ns, 3.0210 ms/op
WorkloadWarmup   4: 1 op, 3090566.68 ns, 3.0906 ms/op
WorkloadWarmup   5: 1 op, 2730270.53 ns, 2.7303 ms/op
WorkloadWarmup   6: 1 op, 2529579.25 ns, 2.5296 ms/op
WorkloadWarmup   7: 1 op, 2726714.97 ns, 2.7267 ms/op
WorkloadWarmup   8: 1 op, 2274369.47 ns, 2.2744 ms/op

// BeforeActualRun
WorkloadActual   1: 1 op, 2286221.32 ns, 2.2862 ms/op
WorkloadActual   2: 1 op, 2409085.47 ns, 2.4091 ms/op
WorkloadActual   3: 1 op, 2428838.55 ns, 2.4288 ms/op
WorkloadActual   4: 1 op, 2525233.57 ns, 2.5252 ms/op
...

WorkloadActual  97: 1 op, 2940443.28 ns, 2.9404 ms/op
WorkloadActual  98: 1 op, 2491653.34 ns, 2.4917 ms/op
WorkloadActual  99: 1 op, 2121480.64 ns, 2.1215 ms/op
WorkloadActual  100: 1 op, 3007208.69 ns, 3.0072 ms/op

I know this is more along the lines of either debugging my code or another third-party library than BDN but I don't get why the ops aren't going up automatically. How exactly are the invocation counts determined? Could an invoked method interfere with this besides taking a very long time to load (which my benchmarks aren't as you can see above)? Is it some threading/locking condition that BDN may have issues with so doesn't exceed 1 op?

I guess one other important question: Does it matter that the invocation count isn't scaling when benchmarking? (Cause I have no idea)

Any clarity you can provide would be much appreciated!

@Turnerj
Copy link
Contributor Author

Turnerj commented Jan 1, 2019

Just to add, you can simplify my benchmark example above by just directly performing a HTTP request on the client from the test server like so:

[Benchmark]
public async Task ExampleBenchmark()
{
	await TestSite.GetHttpClient().GetAsync(new Uri("http://localhost/"));
}

@Turnerj
Copy link
Contributor Author

Turnerj commented Jan 2, 2019

I've started to investigate this issue in a clean solution and am finding more strange quirks related to the above. I don't believe the issue is library specific anymore but maybe in some strange way the code is getting compiled and analysed by BDN?

In a MVCE, I have been able to reliably create this issue for CLR:

using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;
using Microsoft.AspNetCore.Builder;
using Microsoft.AspNetCore.Hosting;
using Microsoft.AspNetCore.TestHost;
using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.FileProviders;
using System;
using System.IO;
using System.Net.Http;
using System.Threading.Tasks;

namespace TestServerBenchmarkTests
{
	class Program
	{
		static void Main(string[] args)
		{
			BenchmarkRunner.Run<TestBenchmark>();
		}
	}

	[CoreJob, ClrJob(baseline:true)]
	[MemoryDiagnoser]
	public class TestBenchmark
	{
		private TestServer Server { get; }
		private HttpClient Client { get; }

		public TestBenchmark()
		{
			var builder = new WebHostBuilder()
				.UseStartup<EmptyStartup>();
			Server = new TestServer(builder);
			Client = Server.CreateClient();
		}

		//[Benchmark]
		//public void TestServerAndCreateClient()
		//{
		//	var builder = new WebHostBuilder()
		//		.UseStartup<EmptyStartup>();
		//	var testServer = new TestServer(builder);

		//	var client = testServer.CreateClient();
		//}

		[Benchmark]
		public async Task CreatedOnceTestServer()
		{
			await Client.GetAsync("http://localhost");
		}
	}

	class EmptyStartup
	{
		public void ConfigureServices(IServiceCollection services)
		{
			services.AddMvcCore();
		}

		public void Configure(IApplicationBuilder app)
		{
			app.UseStaticFiles(new StaticFileOptions
			{
				FileProvider = new PhysicalFileProvider(Directory.GetCurrentDirectory())
			});

			app.UseMvcWithDefaultRoute();
		}
	}
}
<Project Sdk="Microsoft.NET.Sdk">
  <PropertyGroup>
    <OutputType>Exe</OutputType>
    <TargetFrameworks>netcoreapp2.1;net461</TargetFrameworks>
  </PropertyGroup>
  <ItemGroup>
    <PackageReference Include="BenchmarkDotNet" Version="0.11.3" />
    <PackageReference Include="Microsoft.AspNetCore.Mvc.Core" Version="2.2.0" />
    <PackageReference Include="Microsoft.AspNetCore.StaticFiles" Version="2.2.0" />
    <PackageReference Include="Microsoft.AspNetCore.TestHost" Version="2.2.0" />
  </ItemGroup>
</Project>

With the benchmark "TestServerAndCreateClient" commented out, that second benchmark "CreatedOnceTestServer" will create the issue I described above about not scaling the ops/invocation count automatically. It gets strange because if you uncomment out "TestServerAndCreateClient", the issue disappears!

The two benchmarks do not touch the same instances of the classes yet somehow are impacting each other.

@Turnerj
Copy link
Contributor Author

Turnerj commented Jan 2, 2019

An even more simple difference than the two benchmarks, just commenting out app.UseMvcWithDefaultRoute(); makes the CLR benchmark behave correctly. I'm going to look into what that code is actually calling internally as I still don't get how that affects BDN the way it is.

@ycrumeyrolle
Copy link

I have the same issue.
Tried with the configuration Add(Job.Core.WithMinInvokeCount(X)) (and also with .WithMinIterationTime() but without success.

Also tried with [Benchmark(OperationsPerInvoke = X)]. this allow to have more than one iteration, but it is not dynamic.

@adamsitnik
Copy link
Member

@Turnerj @ycrumeyrolle you are most probably experiencing #837

A workaround would be to call the benchmark once from the [GlobalSetup] method (idea from @exyi)

[GlobalSetup]
public async Task Setup()
{
    TestSite = new TestSiteManager(new SiteContext
    {
        SiteFolder = "BasicSite"
    });

    var client = TestSite.GetHttpClient();
    Crawler = new Crawler(client, new ParallelAsyncTaskHandler());
    
    await CrawlSite(); // benchmark warmup as a workaround for https://github.com/dotnet/BenchmarkDotNet/issues/837
}

@Turnerj
Copy link
Contributor Author

Turnerj commented Jan 3, 2019

@adamsitnik , yep, that seems to be exactly the problem! That issue you linked to is an interesting read about how it happens with workload jitting etc taking a long time.

I've applied the fix you mentioned though it will be cool to see the change that fixes the core problem land in a later version of BDN.

Thanks!

@Turnerj Turnerj closed this as completed Jan 3, 2019
@ycrumeyrolle
Copy link

@adamsitnik, same result than @Turnerj.

Before:

// **************************
// Benchmark: JweBenchmark.ValidateJwt: Core(Runtime=Core, Server=True, Toolchain=.NET Core 2.0) [token=JWE-big]
// *** Execute ***
// Launch: 1 / 1
// BeforeAnythingElse

// Benchmark Process Environment Information:
// Runtime=.NET Core 2.2.0 (CoreCLR 4.6.27110.04, CoreFX 4.6.27110.04), 64bit RyuJIT
// GC=Concurrent Server
// Job: Core(Runtime=Core, Server=True)

OverheadJitting  1: 1 op, 345412.51 ns, 345.4125 us/op
WorkloadJitting  1: 1 op, 469646530.50 ns, 469.6465 ms/op

WorkloadWarmup   1: 1 op, 5173292.47 ns, 5.1733 ms/op
WorkloadWarmup   2: 1 op, 4750507.56 ns, 4.7505 ms/op
WorkloadWarmup   3: 1 op, 3906911.53 ns, 3.9069 ms/op
WorkloadWarmup   4: 1 op, 4123238.45 ns, 4.1232 ms/op
WorkloadWarmup   5: 1 op, 4906436.63 ns, 4.9064 ms/op
WorkloadWarmup   6: 1 op, 4354171.38 ns, 4.3542 ms/op
WorkloadWarmup   7: 1 op, 5528573.90 ns, 5.5286 ms/op
WorkloadWarmup   8: 1 op, 3924280.84 ns, 3.9243 ms/op

// BeforeActualRun
WorkloadActual   1: 1 op, 4960123.61 ns, 4.9601 ms/op
WorkloadActual   2: 1 op, 3818091.17 ns, 3.8181 ms/op
WorkloadActual   3: 1 op, 4576814.41 ns, 4.5768 ms/op
WorkloadActual   4: 1 op, 4591025.67 ns, 4.5910 ms/op
WorkloadActual   5: 1 op, 3921912.30 ns, 3.9219 ms/op
...

After:

// **************************
// Benchmark: JweBenchmark.ValidateJwt: Core(Runtime=Core, Server=True, Toolchain=.NET Core 2.0) [token=JWE-big]
// *** Execute ***
// Launch: 1 / 1
// BeforeAnythingElse

// Benchmark Process Environment Information:
// Runtime=.NET Core 2.2.0 (CoreCLR 4.6.27110.04, CoreFX 4.6.27110.04), 64bit RyuJIT
// GC=Concurrent Server
// Job: Core(Runtime=Core, Server=True)

OverheadJitting  1: 1 op, 380545.89 ns, 380.5459 us/op
WorkloadJitting  1: 1 op, 36170808.26 ns, 36.1708 ms/op

WorkloadPilot    1: 2 op, 12390637.47 ns, 6.1953 ms/op
WorkloadPilot    2: 3 op, 13793604.38 ns, 4.5979 ms/op
WorkloadPilot    3: 4 op, 21065425.87 ns, 5.2664 ms/op
WorkloadPilot    4: 5 op, 23668059.76 ns, 4.7336 ms/op
WorkloadPilot    5: 6 op, 32745500.46 ns, 5.4576 ms/op
...

Thanks!

@smoogipoo
Copy link

Hey all, with the advent of tiered compilation, it seems like the workaround to this doesn't work anymore. Is there any news on a more solid fix?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants