-
Notifications
You must be signed in to change notification settings - Fork 4.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Change temporary entrypoints to be lazily allocated #101580
Change temporary entrypoints to be lazily allocated #101580
Conversation
Baseline Loader Heap: ---------------------------------------- System Domain: 7ffab916ec00 LoaderAllocator: 7ffab916ec00 LowFrequencyHeap: Size: 0xf0000 (983040) bytes total. HighFrequencyHeap: Size: 0x16a000 (1482752) bytes total, 0x3000 (12288) bytes wasted. StubHeap: Size: 0x1000 (4096) bytes total. FixupPrecodeHeap: Size: 0x168000 (1474560) bytes total. NewStubPrecodeHeap: Size: 0x18000 (98304) bytes total. IndirectionCellHeap: Size: 0x1000 (4096) bytes total. CacheEntryHeap: Size: 0x1000 (4096) bytes total. Total size: Size: 0x3dd000 (4050944) bytes total, 0x3000 (12288) bytes wasted. Compare Loader Heap: ---------------------------------------- System Domain: 7ff9eb49dc00 LoaderAllocator: 7ff9eb49dc00 LowFrequencyHeap: Size: 0xef000 (978944) bytes total. HighFrequencyHeap: Size: 0x1b2000 (1777664) bytes total, 0x3000 (12288) bytes wasted. StubHeap: Size: 0x1000 (4096) bytes total. FixupPrecodeHeap: Size: 0x70000 (458752) bytes total. NewStubPrecodeHeap: Size: 0x10000 (65536) bytes total. IndirectionCellHeap: Size: 0x1000 (4096) bytes total. CacheEntryHeap: Size: 0x1000 (4096) bytes total. Total size: Size: 0x324000 (3293184) bytes total, 0x3000 (12288) bytes wasted. LowFrequencyHeap is 4KB bigger HighFrequencyHeap is 288KB bigger FixupPrecodeHeap is 992KB smaller NewstubPrecodeHeap is 32KB smaller
…y definition the method is defining the slot
Co-authored-by: Jan Kotas <[email protected]>
Co-authored-by: Jan Kotas <[email protected]>
Try removing EnsureSlotFilled Implement IsEligibleForTieredCompilation in terms of IsEligibleForTieredCompilation_NoCheckMethodDescChunk
@@ -612,7 +612,7 @@ MethodDesc* StubDispatchFrame::GetFunction() | |||
{ | |||
if (m_pRepresentativeMT != NULL) | |||
{ | |||
pMD = m_pRepresentativeMT->GetMethodDescForSlot(m_representativeSlot); | |||
pMD = m_pRepresentativeMT->GetMethodDescForSlot_NoThrow(m_representativeSlot); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My point is if this is not supposed to return a NULL
and we're calling the "NoThrow" then at the very least we need an assert to convey that assumption.
Co-authored-by: Aaron Robinson <[email protected]>
…can return NULL ... and then another thread could produce a stable entrypoint and the assert could lose the race
src/coreclr/vm/method.cpp
Outdated
if (RequiresStableEntryPoint()) | ||
{ | ||
GetOrCreatePrecode()->SetTargetInterlocked(entryPoint); | ||
} | ||
else | ||
{ | ||
SetStableEntryPointInterlocked(entryPoint); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This code reads odd. If RequiresStableEntryPoint()
returns false
, then we set the stable entry point? If this is correct, I think a comment is appropriate.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sigh. It is right, but yeah, this should be better structured. A RequiredStableEntryPoint is currently required have a stable entrypoint which is a Precode. Otherwise, we can generate set a stable entrypoint which isn't based on a Precode. Before this change, we would have reliably hit the HasPrecode flag above, but with making Precode generation lazy, this is a point where the Precode needs to be forced at this point. However, I think I'll fix this by changing the previous condition to be (RequiresStableEntryPoint()) and not use GetPrecode, and instead use GetOrCreatePrecode(). Then this portion of the condition can revert to be what it was before.
…sn't REQUIRED. we need to handle that case.
- Support iterating methods from the new ISOSDacInterface15 - Re-plumb the existing logic as a local implementation of ISOSDacInterface15 on top of apis that have been present for a long time This is dependent on dotnet/runtime#101580
This change moves temporary entrypoints to be allocated when they are needed instead of eagerly at type load time
As you can see, with a very small test case the wins are significant, and notably, reducing allocations in the stub heaps is nice as they are difficult to allocate efficiently as they require a bunch of memory mapping operations. In addition, with this change I'm removing the concept of Compact Entrypoints, as they are no longer significantly beneficial as of the advent of tiered compilation, and would be difficult to integrate with this change.
What can be seen here, is that the size of the Precode heaps shrinks dramatically, and the size of the HighFrequency heap grows a little bit. With a larger test case (launching powershell with a
-c exit
command line argument, the gains are still very similar. Also with powershell, I have a simple benchmark which runspwsh.exe -c exit
100 times and measures the total execution time. The performance improvement is around 1% which I consider to be quite a nice result.