-
-
Notifications
You must be signed in to change notification settings - Fork 976
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
built-in accurate and cross platform Memory Diagnoser #284
Conversation
@mattwarren @AndreyAkinshin any comments on the PR? |
@adamsitnik, the source code looks good to me. Give me some time, I will check how it works on Linux. |
// AppDomain. The number is accurate as of the last garbage collection." - CLR via C# | ||
// so we enforce GC.Collect here just to make sure we get accurate results | ||
GC.Collect(); | ||
return AppDomain.CurrentDomain.MonitoringTotalAllocatedMemorySize; |
This comment was marked as spam.
This comment was marked as spam.
Sorry, something went wrong.
This comment was marked as spam.
This comment was marked as spam.
Sorry, something went wrong.
if (results.ContainsKey(benchmark)) | ||
{ | ||
var result = results[benchmark]; | ||
// TODO scale this based on the minimum value in the column, i.e. use B/KB/MB as appropriate |
This comment was marked as spam.
This comment was marked as spam.
Sorry, something went wrong.
@adamsitnik Finally had a chance to play with this, it's really nice, great job!! I've tested out scenarios similar to #186 and #200 and I agree that they're fixed. I also tested #133 (reported by @xoofx). To do this I ran the
Also as you can see in the outputs below, the # of Run 1
Run 2
Run 3
|
@mattwarren Thanks for the review and feedback! There is one thing that I am not sure how to solve, please consider following scenario: User wants to compare two different methods in terms of CPU and memory. For time and bytes allocated/op we give results that can be compared because they are = sum / operationsCount. However this is not true for Gen 0, 1 & 2 stats. Two benchmarks might be executed different amount of times, but we don't scale the GC collections count. Let's say first benchmark is executed 10k times and has 100 Gen 0, the other 20k times and also has 100 Gen 0. So if the user takes a look at the GC column he or she might think that it has the same GC pressure, but the one executed 10k times is actually two times worse. @mattwarren @AndreyAkinshin We should most probably scale the results. What do you think? How could we call such a column? |
Yes, a total amount for Gen0/Gen1/Gen2 doesn't make sense, because you can't compare them when op count is changing (typically issue #133) and should be converted to a gcCount/op instead. |
…for benchmarks with different # of runs, fixes #133
Ok, I have updated the code. It's very stable now and fixes #133 Sample output:
|
@adamsitnik amazing! Wondering if we should scale Genx/op by 1k op... we will most likely always get a gc after a certain amount of ops and barely for only 1 (through this could happen with a for loop with lots of alloc...) |
Agree, I was going to propose something similar. Scaling GenX to per/op will give pretty small values. I think that 'GenX/1k op' should be okay for most/all scenarios. |
@xoofx @mattwarren Thanks for the ideas! Initially I started with / 1k op:
but then I decided to try the per mille placeholder:
What do you think? Which option is better? Personally I like the |
… + reduce the column's name length (everything is per operation now)
If find the Either way, I think it would be good to add a note in the "Diagnostic Output" section explaining how the calculations are done, something like:
|
Conflicts: samples/BenchmarkDotNet.Samples/Intro/IntroGcMode.cs src/BenchmarkDotNet.Core/Engines/Engine.cs src/BenchmarkDotNet.Core/Engines/RunResults.cs tests/BenchmarkDotNet.IntegrationTests/CustomEngineTests.cs tests/BenchmarkDotNet.IntegrationTests/MemoryDiagnoserTests.cs
private static Func<long> GetAllocatedBytesForCurrentThread() | ||
{ | ||
// for some versions of .NET Core this method is internal, | ||
// for some public and for others public and exposed ;) |
This comment was marked as spam.
This comment was marked as spam.
Sorry, something went wrong.
This comment was marked as spam.
This comment was marked as spam.
Sorry, something went wrong.
This comment was marked as spam.
This comment was marked as spam.
Sorry, something went wrong.
Conflicts: tests/BenchmarkDotNet.IntegrationTests/MemoryDiagnoserTests.cs
@AndreyAkinshin @mattwarren I have finished to work on this PR. Could you do a code review? @AndreyAkinshin We have no beta package dependency now, we can release 0.10.1 to nuget.org. Due update to netcoreapp1.1 we need new SDK to be installed, that's why tests on appveyour mail fail until they install it. |
@adamsitnik, nice job! Will do a review on this weekend. |
@adamsitnik, everything looks great! It prints correct results even on Linux. However, I have some additional minor requests:
|
@adamsitnik: additional thoughts about columns/scaling/hints. We have the following problem: it's hard to explain the meaning of some columns with small amount of characters. Maybe each column could provide a |
@AndreyAkinshin thanks for the review! I'll adopt to your comments. I agree that we need some explanation, especially for Gen 0/1/2 columns where the contained values are not what they used to be anymore. I see what I can do. |
Conflicts: src/BenchmarkDotNet.Core/Engines/RunResults.cs src/BenchmarkDotNet.Core/Reports/BenchmarkReport.cs
@AndreyAkinshin I updated the docs and addes smarter way for formatting allocated memory. But I am still not sure about scalling the Gen collection counts. Here the problem is that we have no unit :/ I tried to reproduce the Mono issue and it worked. Could you provide some more details? OS etc
|
Yes, it's a problem. =(
Ok, I will debug it myself.
@mattwarren, thanks for the input. Will think about it. |
Guys, I don't know the best way to do it. Let's merge the PR, release new version, and try to use it. |
@AndreyAkinshin I really like this idea! |
I'll have a play around with the latest version this weekend, but I've been using the code from this branch for the last week or so and it looks good to me! |
⌚️ Should also resolve #301 ? |
@benaadams Yes, exactly |
Ok, it's merged. If everything is fine, I will release |
Okay, I just noticed some interesting behaviour when using public static void TestMonitoringTotalAllocatedMemorySize()
{
AppDomain.MonitoringIsEnabled = true;
// provoke JIT, static ctors etc (was allocating 1740 bytes with first call)
var list = new List<string> { "stringA", "stringB" };
list.Sort();
var temp = new HashSet<string>();
var currentDomain = AppDomain.CurrentDomain;
var hashSetBefore = currentDomain.MonitoringTotalAllocatedMemorySize;
var countersToUse = new [] {1, 5, 10, 25, 50, 100, 250, 500, 1000, 2500, 5000, 10000, 100000, 250000, 500000};
foreach (var counter in countersToUse)
{
var loopCounter = 0;
for (int i = 0; i < counter; i++)
{
var test = new HashSet<string>();
loopCounter += test.Count;
}
Thread.Sleep(10);
var hashSetAfter = currentDomain.MonitoringTotalAllocatedMemorySize;
var totalAlloc = hashSetAfter - hashSetBefore;
Console.WriteLine(
"HashSet<string>() = {0,8:N2} bytes (Total = {1,12:N0} bytes, Counter = {2,8:N0})",
(double)totalAlloc / counter, totalAlloc, counter);
}
} You get the following output:
Note the variance in the amt of
Now I know that we generally do 1000's of iterations, so it might not be a problem, but I just wanted to raise it and see what others thought? For instance in the test above we don't get a consistent value for how many bytes are allocated when you create an empty |
I have encountered similar problem when I was implementing that and found this interesting note: "This instance Int64 property returns the number of bytes that have been allocated by a specific AppDomain. The number is accurate as of the last garbage collection." - CLR via C# @mattwarren Could you try to enforece GC.Collect first? Like we do here |
I tried adding |
@mattwarren sorry for the late response, finally I had some time to try it. Here is my code for both desktop and Core .NET. Results for
Results for
What helps us in the new MemoryDiagnoser is a lot of runs, no extra allocations and the fact that the result number is long, not double. So we cut the minimum overhead when doing |
Fixed isses:
#186 - don't include allocations from setup and cleanup
#200 - be accurate about allocated bytes/op, this was possible mostly due to #277
#208 - without using ETW tests are passing on appveyor now
#133 - scale the results, make them stable
Extras:
Todos:
Recently MS addedDoneGC.GetAllocatedBytesForCurrentThread
to public api surface. We can't use it yet, because this version of Runtime is not yet available at nuget.org. As soon as they release it it should be relatively easy to use it. This will give us bytes allocated per operation for .NET Core as well.When we benchmark Task-returning method we call .GetAwaiter, which most probably allocates memory. This should be verified, if it's true then cost should be excluded from final results.verified, awaiter is struct, no extra allocations ;)