-
-
Notifications
You must be signed in to change notification settings - Fork 977
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
don't execute long operations more than once per iteration #760
Conversation
…run (hence the method name change), #736
@@ -17,6 +17,9 @@ public class CoreRtTests : BenchmarkTestExecutor | |||
[Fact] | |||
public void CoreRtIsSupported() | |||
{ | |||
if (IntPtr.Size == sizeof(int)) // CoreRT does not support 32bit yet |
This comment was marked as spam.
This comment was marked as spam.
Sorry, something went wrong.
Currently, I have the following output for a 50ms method:
And the following output for a 5 sec method:
It works which is great! Awesome PR! However, I suggest to improve the output:
|
I also tried to run a 50ms method with the following config: private class Config : ManualConfig
{
public Config()
{
Add(Job.Default.WithUnrollFactor(4));
}
} The output:
If we specifically ask for |
@AndreyAkinshin very good points! I totally missed the case with I have pushed a commit with fixes. With the aligning it definitely looks better:
Was:
|
Now only the |
Also |
I aligned it, now it looks like this:
|
@AndreyAkinshin Tonight I have been running benchmark game benchmarks with this change and I have two more ideas for improvement:
An example:
@AndreyAkinshin What do you think? |
@adamsitnik, I like these ideas! However, we should discuss details and write tests for all corner cases. P.S. The explicit unroll factor still doesn't work for me. Is it already fixed? Could you tell me the name of the corresponded test? |
…ilot if jitting gives the answer, #736
@AndreyAkinshin ok, I just pushed a commit with the changes. I have also cleaned up the code, I hope it's easier to read and maintain now. For: public class Tests
{
[Benchmark] public void Sleep1s() => Thread.Sleep(TimeSpan.FromSeconds(1));
} The output is:
For public class Tests
{
[Benchmark] public void Sleep100ms() => Thread.Sleep(TimeSpan.FromMilliseconds(100));
} The output is:
For public class Tests
{
[Benchmark] public void Sleep10ms() => Thread.Sleep(TimeSpan.FromMilliseconds(10));
} The output is:
For public class Tests
{
[Benchmark] public void Empty() { }
} The output is:
|
The test is ForJobsWithExplicitUnrollFactorTheGlobalSetupIsCalledAndMultiActionCodeGetsJitted The corresponding line of code: if engineParameters.HasUnrollFactor |
DontRunThePilotIfThePilotRequirementIsMetDuringWarmup is failed on Linux. |
@AndreyAkinshin fixed. var unrollFactor = Job.Default.ResolveValue(RunMode.UnrollFactorCharacteristic, DefaultResolver);
var mediumTime = TimeSpan.FromMilliseconds(IterationTime.TotalMilliseconds / unrollFactor);
void MediumMultiple(long _)
{
timesBenchmarkCalled += unrollFactor;
for (int i = 0; i < unrollFactor; i++) // the real unroll factor obviously does not use loop ;)
Thread.Sleep(mediumTime);
} Executing |
I have just run some ML.NET benchmarks. Without this change: 24 minutes. With: 2 minutes ;) |
@AndreyAkinshin I am going to merge it now to avoid conflicts with my incoming PR. If something is missing I will add it. |
@adamsitnik, everything is OK. |
…, fixes dotnet#736 * don't execute long operations more than once per iteration, dotnet#736 * generate the right C# code for dotnet#736 * EngineParameters.Resolver was always null or ignored ;), dotnet#736 * don't forget to JIT idle, dotnet#736 * do the math right for unroll factor for JIT, dotnet#736 * generate the right IL code, dotnet#736 * Setup and Cleanup are jitted together with benchmark, dotnet#736 * engine factory is now supposed to create an engine which is ready to run (hence the method name change), dotnet#736 * addressing PR feedback, dotnet#736 * bring back the calls to DummyActions, dotnet#736 * align iteration mode * don't measure the overhead for time consuming benchmarks, don't run pilot if jitting gives the answer, dotnet#736 * fix the Linux build
@adamsitnik, @AndreyAkinshin, this PR may have caused nano benchmarks to fail. It appears that the calculation mentioned in the OP above doesn't hold in practice. The case referenced in my issue, #1338, has 7ns per operation (when run correctly, when run once, it is 300-600ns per op) and 120ms as |
Background: because of the default unroll factor = 16, every iteration was invoking benchmark at least 16 times.
So if the benchmark took 1s, then:
Note: In the default mode, this was described in our docs and was possible to change with custom config. We just did not have the time to fix it.
This PR introduces following changes:
GlobalSetup
. (hence the method name changeCreate
=>CreateReadyToRun
)IDisposable
, which callsGlobalCleanup
a. If it took longer than IterationTime (0.5s by default), the engine is created for job with one invocation per iteration
b. otherwise, old rules applies
EngineFactory
, I was able to write some nice unit tests (not integration tests)Before: Total time: 00:06:16 (376.01 sec)
After: Total time: 00:00:27 (27.88 sec)
This fixes #736, makes #730 easy and contributes to #550.
@AndreyAkinshin please take a look ;)
@KrzysztofCwalina this fixes the issue you have faced in ML.NET