Unexpected allocations reported in CPU-bound code #1542

Open
ronbrogan opened this issue Sep 30, 2020 · 6 comments · May be fixed by #2562

@ronbrogan
Contributor

I'm trying to validate that code doesn't allocate anything and I have some unit tests asserting on the summary provided from running benchmarks.

However, a non-deterministic number of these test runs fail because allocations are sometimes reported for a given benchmark, and sometimes not.

My method under test in the real world application takes about 20ms per op (CPU bound, no allocations), so my repro here has a dummy loop to simulate the work.

These benchmarks all invoke the same method; there are several of them only to illustrate that the same code can yield different allocation results.
[image]

Source for the repro is here:
https://gist.github.com/ronbrogan/bd53bddd76cfb878eef0ae0a683434df

My only line of reasoning right now is that this is due to the minimum allocation size leaking over into the measured allocations, but I still don't understand why this couldn't be avoided.

In the repro I use GC.GetAllocatedBytesForCurrentThread before/after running my method and there is no difference. Is this an issue with MemoryDiagnoser, user error, or is it simply not reasonable to try to assert that a given benchmark makes 0 allocations?
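
For readers without the gist at hand, the before/after check described above can be sketched roughly as follows. This is a minimal illustration, not the code from the repro: `AllocationProbe`, `MeasureCurrentThread`, and `SpinLoop` are hypothetical names, and the loop is only a stand-in for the real CPU-bound method.

```csharp
using System;

public static class AllocationProbe
{
    private static long _sink; // keeps the loop result alive without boxing

    // Returns the bytes allocated on the calling thread while `action` runs once.
    public static long MeasureCurrentThread(Action action)
    {
        action(); // warm-up call so first-call JIT work is not measured
        long before = GC.GetAllocatedBytesForCurrentThread();
        action();
        long after = GC.GetAllocatedBytesForCurrentThread();
        return after - before;
    }

    // Hypothetical stand-in for the CPU-bound, allocation-free method under test.
    public static void SpinLoop()
    {
        long sum = 0;
        for (int i = 0; i < 10_000_000; i++) sum += i;
        _sink = sum;
    }

    public static void Main()
    {
        long allocated = MeasureCurrentThread(SpinLoop);
        Console.WriteLine($"Allocated on this thread: {allocated} bytes");
    }
}
```

Note that this counter only covers the calling thread, so it cannot see allocations made elsewhere in the process.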

@adamsitnik
Member

Hi @ronbrogan

Big thanks for a very detailed bug report with a simple repro case!

I was able to reproduce it on .NET Core 3.1; .NET Core 2.1 and .NET 5.0 are free of this bug. I will dig deeper and get back to you.

@adamsitnik
Member

OK, this is most probably a side effect of the Tiered JIT, which allocates something on another thread.

To test it, I set the following env var: COMPlus_TieredCompilation=0
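
One way to apply that environment variable from within a benchmark project is through a job configuration. The sketch below assumes a recent BenchmarkDotNet version that offers `AddJob`, `AddDiagnoser`, and `WithEnvironmentVariables`; the class names and the benchmark body are hypothetical and not taken from the repro.

```csharp
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Configs;
using BenchmarkDotNet.Diagnosers;
using BenchmarkDotNet.Jobs;
using BenchmarkDotNet.Running;

// Hypothetical config: runs the benchmarks with tiered compilation disabled.
public class NoTieredJitConfig : ManualConfig
{
    public NoTieredJitConfig()
    {
        AddDiagnoser(MemoryDiagnoser.Default);
        AddJob(Job.Default
            .WithEnvironmentVariables(new EnvironmentVariable("COMPlus_TieredCompilation", "0"))
            .WithId("TieredCompilationOff"));
    }
}

[Config(typeof(NoTieredJitConfig))]
public class CpuBoundBenchmarks
{
    // Hypothetical stand-in for a CPU-bound method that should not allocate.
    [Benchmark]
    public long SpinLoop()
    {
        long sum = 0;
        for (int i = 0; i < 10_000_000; i++) sum += i;
        return sum;
    }
}

public static class Program
{
    public static void Main(string[] args) => BenchmarkRunner.Run<CpuBoundBenchmarks>();
}
```

Disabling tiered compilation changes how the code is compiled, so timings from such a job are not directly comparable with a default job.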

@adamsitnik
Member

I've confirmed that it's Tiered JIT background thread:

[image]
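
To illustrate why work on a background thread can show up in one measurement and not in another, the sketch below compares a per-thread counter with a process-wide one. It is only a demonstration of the two runtime APIs (`GC.GetTotalAllocatedBytes` requires .NET Core 3.0 or later); whether the diagnoser uses a process-wide counter is an assumption not confirmed in this thread, and the explicit `Thread` merely plays the role of the JIT's background thread.

```csharp
using System;
using System.Threading;

public static class CounterComparison
{
    // Returns (bytes seen by the per-thread counter, bytes seen by the process-wide counter).
    public static (long currentThread, long wholeProcess) Measure()
    {
        // Created up front so that constructing the thread object is not measured.
        var background = new Thread(() =>
        {
            var buffer = new byte[64 * 1024]; // allocated on the other thread
            GC.KeepAlive(buffer);
        });

        long threadBefore = GC.GetAllocatedBytesForCurrentThread();
        long totalBefore = GC.GetTotalAllocatedBytes(precise: true);

        background.Start();
        background.Join();

        long threadDelta = GC.GetAllocatedBytesForCurrentThread() - threadBefore;
        long totalDelta = GC.GetTotalAllocatedBytes(precise: true) - totalBefore;
        return (threadDelta, totalDelta);
    }

    public static void Main()
    {
        var (currentThread, wholeProcess) = Measure();
        Console.WriteLine($"Current thread: {currentThread} bytes");
        Console.WriteLine($"Whole process:  {wholeProcess} bytes");
    }
}
```

The process-wide figure includes the buffer allocated by the background thread, which the calling thread never allocated itself.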

adamsitnik added a commit that referenced this issue Oct 1, 2020
…iteration, the Tiered JIT might kick-in and allocate some memory and affect the results

as a workaround, we can put the thread to sleep for more than 200ms so that the TC thread kicks in before we start memory measurements

it's far from perfect but it works

fixes #1542
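
The idea behind this workaround can be sketched as below. This is not BenchmarkDotNet's actual code: `SettledMeasurement` is a hypothetical helper, the 250 ms delay is an illustrative value chosen only because it exceeds the 200 ms mentioned in the commit message, and the use of a process-wide counter is an assumption. A single warm-up call may also be too few calls to trigger tiering in a real run.

```csharp
using System;
using System.Threading;

public static class SettledMeasurement
{
    // Assumption: illustrative delay, chosen to exceed the 200 ms from the commit message.
    private static readonly TimeSpan SettleDelay = TimeSpan.FromMilliseconds(250);

    // Runs a warm-up, waits for background work to settle, then measures one invocation.
    public static long Measure(Action workload)
    {
        workload();                // warm-up so the method is compiled at least once
        Thread.Sleep(SettleDelay); // give background JIT work a chance to run first

        long before = GC.GetTotalAllocatedBytes(precise: true);
        workload();
        long after = GC.GetTotalAllocatedBytes(precise: true);
        return after - before;
    }

    public static void Main()
    {
        long allocated = Measure(() =>
        {
            long sum = 0;
            for (int i = 0; i < 1_000_000; i++) sum += i;
        });
        Console.WriteLine($"Allocated during measurement: {allocated} bytes");
    }
}
```

As the commit message itself notes, a fixed delay is a heuristic: it makes interference less likely but cannot rule it out.
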

@Turnerj
Contributor

Turnerj commented Oct 28, 2020

I noticed something like this randomly happening in my benchmarks too for a little while, thought it was something weird with my code. Transitioned some allocate-y code to use ArrayPool and it was sometimes allocating a small number of bytes and sometimes not at all - my code is otherwise CPU bound like the OP.

Just quickly jumping through the thread in your PR, @adamsitnik: is one potential "quick fix" solution to simply benchmark with the latest .NET 5? (I'm currently using 3.1, but in my case I can switch to .NET 5 RC2 easily enough.) Edit: I misread your earlier comment and thought you wrote that 3.1, 2.1 and 5.0 all had the bug.

For my own curiosity: why would there be an allocation by the JIT during the diagnoser run? I would have thought the workload and overhead JIT runs would have done everything, including any allocations that they may have needed. Is it just that the tiered JIT process can happen outside of the dedicated time that BDN sets for jitting? (I have no knowledge of how all that logic is done under the hood in BDN, so I'm probably missing something obvious.)

@Turnerj
Contributor

Turnerj commented Oct 28, 2020

Just got around to running my benchmark on .NET 5, it does seem to be allocating still for me.

[image]

Code base: https://github.com/Turnerj/LevenshteinBenchmarks/tree/2475940db8c4c6f7727c20d5a3ba20a200e77e5c
The specific implementation that shouldn't allocate: https://github.com/Turnerj/LevenshteinBenchmarks/blob/2475940db8c4c6f7727c20d5a3ba20a200e77e5c/Implementations/03_ArrayPool.cs

Just run the "ArrayPool" benchmark to see the results. My use of ArrayPool is well below the 1,048,576 item limit so I don't understand where the allocations are coming from besides something wrong in the diagnoser or the runtime itself.
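
For context, the rent/return pattern that is expected to avoid per-call allocations looks roughly like the sketch below. It is not the linked implementation: `PooledWork` and `SumWithPooledBuffer` are hypothetical names, and the summing loop is only a placeholder for real work.

```csharp
using System;
using System.Buffers;

public static class PooledWork
{
    // Fills a rented buffer with 0..length-1 and returns the sum of its elements.
    public static int SumWithPooledBuffer(int length)
    {
        // Rent may return an array larger than requested, so only `length` items are used.
        int[] buffer = ArrayPool<int>.Shared.Rent(length);
        try
        {
            for (int i = 0; i < length; i++) buffer[i] = i;

            int sum = 0;
            for (int i = 0; i < length; i++) sum += buffer[i];
            return sum;
        }
        finally
        {
            // Always hand the array back so later calls can reuse it.
            ArrayPool<int>.Shared.Return(buffer);
        }
    }

    public static void Main()
    {
        SumWithPooledBuffer(1024); // first call may allocate the pooled array

        long before = GC.GetAllocatedBytesForCurrentThread();
        int sum = SumWithPooledBuffer(1024);
        long after = GC.GetAllocatedBytesForCurrentThread();

        Console.WriteLine($"sum={sum}, allocated on this thread: {after - before} bytes");
    }
}
```

The first rent typically allocates the array; later calls on the same thread are expected to reuse it, which is why steady-state allocations reported for such code are surprising.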

@timcassell
Collaborator

See issue dotnet/runtime#45446

Even though I'm measuring differently there (total bytes instead of allocations), I think the problem is the same. Both .NET Core 3.1 and .NET 5.0 are affected in my tests (5.0 is worse).

adamsitnik added a commit that referenced this issue Jan 12, 2021
…olchain as it suffers from #1542 (Tiered JIT allocating memory in background)

adamsitnik added a commit that referenced this issue Jan 20, 2021
* use httpS

* update project files to net5.0

* don't use CLASSIC and CORE #if defines

* enable ThreadingDiagnoserTests tests that were disabled so far (APIs not available prior to .NET Core 3)

* update samples

* update remaining tests

* get warnings to 0

* update build scripts

* fix MultipleRuntimesTest (.NET Core is not called Core anymore ;) )

* disable ThreadingDiagnoserTests for the InProcessToolchain

* disable the failing CoreRT tests

* disable some MemoryDiagnoser tests due to #1542

timcassell linked a pull request Apr 15, 2024 that will close this issue
timcassell assigned timcassell and unassigned adamsitnik Apr 25, 2024