Unexpected allocations reported in CPU-bound code #1542
Comments
Hi @ronbrogan, big thanks for a very detailed bug report with a simple repro case! I was able to reproduce it on .NET Core 3.1; 2.1 and 5.0 are free of this bug. I will dig deeper and get back to you.
Ok, this is most probably a side effect of Tiered JIT, which allocates something on another thread. To test it I set the following env var:
…iteration, the Tiered JIT might kick in, allocate some memory, and affect the results. As a workaround, we can put the thread to sleep for more than 200 ms so that the TC thread kicks in before we start memory measurements. It's far from perfect, but it works. Fixes #1542
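The workaround described in this commit message can be sketched roughly as follows. This is not BenchmarkDotNet's actual code: the class and method names are hypothetical, the 250 ms delay is just an assumed value above the 200 ms mentioned in the message, and the use of the process-wide counter `GC.GetTotalAllocatedBytes` (available since .NET Core 3.0) is an assumption made for illustration.

```csharp
using System;
using System.Threading;

// Hypothetical helper, not part of BenchmarkDotNet.
public static class SettledMeasurement
{
    // Runs the workload once to warm it up, waits so that background
    // tiered-compilation work has a chance to run, and only then reads
    // the process-wide allocation counter around a second run.
    public static long MeasureAfterSettling(Action workload, TimeSpan settleDelay)
    {
        workload();                 // warm-up call, not measured
        Thread.Sleep(settleDelay);  // assumed to be longer than 200 ms

        long before = GC.GetTotalAllocatedBytes(precise: true);
        workload();
        long after = GC.GetTotalAllocatedBytes(precise: true);
        return after - before;
    }

    public static void Main()
    {
        long bytes = MeasureAfterSettling(
            () => Thread.SpinWait(1_000_000),   // stand-in CPU-bound work
            TimeSpan.FromMilliseconds(250));    // assumed delay
        Console.WriteLine($"Allocated during measured run: {bytes} bytes");
    }
}
```

As the commit message itself says, a fixed sleep only makes background allocations less likely to land inside the measured window; it does not rule them out.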
I noticed something like this randomly happening in my benchmarks too for a little while, thought it was something weird with my code. Transitioned some allocate-y code to use ArrayPool and it was sometimes allocating a small number of bytes and sometimes not at all - my code is otherwise CPU bound like the OP.
For my own curiosity: why would there be an allocation by the JIT during the diagnoser run? I would have thought the workload and overhead JIT runs would have done everything, including any allocations that they may have needed. Is it just that the tiered JIT process can happen outside of the dedicated time that BDN sets for jitting? (I have no knowledge of how all that logic is done under the hood in BDN, so I'm probably missing something obvious.)
Just got around to running my benchmark on .NET 5, and it does seem to be allocating still for me. Code base: https://github.com/Turnerj/LevenshteinBenchmarks/tree/2475940db8c4c6f7727c20d5a3ba20a200e77e5c Just run the "ArrayPool" benchmark to see the results. My use of ArrayPool is well below the 1,048,576 item limit, so I don't understand where the allocations are coming from besides something wrong in the diagnoser or the runtime itself.
See issue dotnet/runtime#45446. Even though I'm measuring differently there (total bytes instead of allocations), I think the problem is the same. It affects both .NET Core 3.1 and .NET 5.0 in my tests (5.0 is worse).
…olchain as it suffers from #1542 (Tiered JIT allocating memory in background)
* use httpS
* update project files to net5.0
* don't use CLASSIC and CORE #if defines
* enable ThreadingDiagnoserTests tests that were disabled so far (APIs not available prior to .NET Core 3)
* update samples
* update remaining tests
* get warnings to 0
* update build scripts
* fix MultipleRuntimesTest (.NET Core is not called Core anymore ;) )
* disable ThreadingDiagnoserTests for the InProcessToolchain
* disable the failing CoreRT tests
* disable some MemoryDiagnoser tests due to #1542
I'm trying to validate that code doesn't allocate anything and I have some unit tests asserting on the summary provided from running benchmarks.
However, a non-deterministic number of these test runs fail because allocations are sometimes reported for a given benchmark, and sometimes not.
My method under test in the real world application takes about 20ms per op (CPU bound, no allocations), so my repro here has a dummy loop to simulate the work.
These benchmarks are all invoking the same method, there are multiple just to illustrate that the same code can yield different allocation results.
Source for the repro is here:
https://gist.github.com/ronbrogan/bd53bddd76cfb878eef0ae0a683434df
My only line of reasoning right now is that this is due to the minimum allocation size leaking over into the measured allocations, but I still don't understand why this couldn't be avoided.
In the repro I call GC.GetAllocatedBytesForCurrentThread before and after running my method, and there is no difference. Is this an issue with MemoryDiagnoser, user error, or is it simply not reasonable to try to assert that a given benchmark makes 0 allocations?
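A minimal sketch of the before/after check described above, next to a process-wide variant for comparison. The class name, `SpinWork`, and the iteration count are hypothetical stand-ins rather than the code from the linked gist, and whether MemoryDiagnoser reads a process-wide counter is an assumption here, suggested by the maintainer's remark about allocations happening on another thread.

```csharp
using System;

// Hypothetical probe, not taken from the linked repro.
public static class AllocationProbe
{
    // Stand-in for a CPU-bound method that does not allocate.
    public static long SpinWork(int iterations)
    {
        long acc = 0;
        for (int i = 0; i < iterations; i++)
        {
            acc += (i * 31L) ^ (acc >> 3);
        }
        return acc;
    }

    // Bytes allocated by the calling thread only while the action runs.
    public static long MeasureCurrentThread(Action action)
    {
        long before = GC.GetAllocatedBytesForCurrentThread();
        action();
        return GC.GetAllocatedBytesForCurrentThread() - before;
    }

    // Bytes allocated by all threads in the process while the action runs,
    // so background work (for example a JIT thread) is counted as well.
    public static long MeasureWholeProcess(Action action)
    {
        long before = GC.GetTotalAllocatedBytes(precise: true);
        action();
        return GC.GetTotalAllocatedBytes(precise: true) - before;
    }

    public static void Main()
    {
        Action work = () => SpinWork(50_000_000);
        long perThread = MeasureCurrentThread(work);
        long wholeProcess = MeasureWholeProcess(work);
        Console.WriteLine($"Current thread: {perThread} bytes");
        Console.WriteLine($"Whole process:  {wholeProcess} bytes");
    }
}
```

If the two numbers disagree, the extra bytes were allocated by some other thread during the measured window, which would be consistent with a per-thread check reporting no difference while a process-wide measurement occasionally reports a few bytes.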