Change temporary entrypoints to be lazily allocated #101580

Merged Jul 14, 2024: 59 commits (this view shows changes from 46 commits)
ab85b04
WorkingOnIt
davidwrighton Apr 25, 2024
889f8a8
It basically works for a single example.
davidwrighton Apr 25, 2024
693d837
If there isn't a parent methodtable and the slot matches... then it b…
davidwrighton Apr 25, 2024
da774fb
Fix a couple more issues found when running a subset of the coreclr t…
davidwrighton Apr 25, 2024
bb59e29
Get X86 building again
davidwrighton Apr 25, 2024
bfde6de
Attempt to use a consistent api to force slots to be set
davidwrighton May 1, 2024
54b8ab6
Put cache around RequiresStableEntryPoint
davidwrighton May 1, 2024
7a4bd28
Merge branch 'main' of github.com:dotnet/runtime into change_temporar…
davidwrighton May 1, 2024
66f2b39
Fix typo
davidwrighton May 3, 2024
951a655
Fix interop identified issue where we sometime set a non Precode into…
davidwrighton May 6, 2024
45a0b3b
Move ARM and X86 to disable compact entry points
davidwrighton May 6, 2024
4e7f41d
Attempt to fix build breaks
davidwrighton May 7, 2024
f6e2fed
fix typo
davidwrighton May 8, 2024
f9de777
Fix another Musl validation issue
davidwrighton May 9, 2024
730fd7c
More tweaks around NULL handling
davidwrighton May 10, 2024
fb335e7
Hopefully the last NULL issue
davidwrighton May 10, 2024
d31ebbb
Fix more NULL issues
davidwrighton May 13, 2024
4948d77
Merge branch 'main' of github.com:dotnet/runtime into change_temporar…
davidwrighton May 13, 2024
3d29d31
Merge branch 'main' of https://github.com/dotnet/runtime into change_…
davidwrighton May 14, 2024
a4acdb1
Merge branch 'main' of github.com:dotnet/runtime into change_temporar…
davidwrighton May 28, 2024
466cabc
Fixup obvious issues
davidwrighton Jun 4, 2024
084bcd7
Merge branch 'main' of github.com:dotnet/runtime into change_temporar…
davidwrighton Jun 4, 2024
44ccb9d
Fix allocation behavior so we don't free the data too early or too late
davidwrighton Jun 5, 2024
3762500
Fix musl validation issue
davidwrighton Jun 5, 2024
a7b68c3
Fix tiered compilation
davidwrighton Jun 25, 2024
6a772d9
Remove Compact Entrypoint logic
davidwrighton Jun 25, 2024
b2360ad
Add new ISOSDacInterface15 api
davidwrighton Jun 26, 2024
837dc0b
Fix some naming of NoAlloc to a more clear IfExists suffix
davidwrighton Jun 26, 2024
cb70e1d
Remove way in which GetTemporaryEntryPoint behaves differently for DA…
davidwrighton Jun 27, 2024
7f0f614
Attempt to reduce most of the use of EnsureSlotFilled. Untested, but …
davidwrighton Jun 27, 2024
f6a8260
Fix the build before sending to github
davidwrighton Jun 27, 2024
cedb18e
Merge branch 'main' of https://github.com/dotnet/runtime into change_…
davidwrighton Jun 27, 2024
97e8f7e
Fix unix build break, and invalid assert
davidwrighton Jun 27, 2024
abae474
Improve assertion checks to validate that we don't allocate temporary…
davidwrighton Jun 27, 2024
8273274
Remove unused parameters and add contracts
davidwrighton Jun 27, 2024
6999c2b
Update method-descriptor.md
davidwrighton Jun 27, 2024
24fa6c2
Fix musl validation issue
davidwrighton Jun 27, 2024
d696419
Adjust SOS api to be an enumerator
davidwrighton Jun 28, 2024
2715494
Fix assertion issues noted
davidwrighton Jun 28, 2024
47f8043
Remove GetRestoredSlotIfExists
davidwrighton Jun 28, 2024
93184bc
Update src/coreclr/debug/daccess/daccess.cpp
davidwrighton Jun 28, 2024
70836cb
Update docs/design/coreclr/botr/method-descriptor.md
davidwrighton Jun 28, 2024
5bf0fb3
Update src/coreclr/vm/methodtable.inl
davidwrighton Jun 28, 2024
7010052
Update src/coreclr/vm/methodtable.h
davidwrighton Jun 28, 2024
d55c93f
Merge branch 'main' of github.com:dotnet/runtime into change_temporar…
davidwrighton Jun 28, 2024
fb2f987
Fix GetMethodDescForSlot_NoThrow
davidwrighton Jun 28, 2024
c5af140
Fix missing change intended in last commit
davidwrighton Jul 1, 2024
0281127
Fix some more IsPublished memory use issues
davidwrighton Jul 2, 2024
56137a2
Call the right GetSlot method
davidwrighton Jul 2, 2024
98aa3c9
Move another scenario to NoThrow, I think this should clear up our te…
davidwrighton Jul 2, 2024
7cf8acd
Add additional IsPublished check
davidwrighton Jul 2, 2024
efe8f1a
Fix MUSL validation build error and Windows x86 build error
davidwrighton Jul 3, 2024
d7d3948
Address code review feedback
davidwrighton Jul 3, 2024
29571eb
Fix classcompat build
davidwrighton Jul 3, 2024
3ce4e4a
Update src/coreclr/vm/method.cpp
davidwrighton Jul 5, 2024
a4c8803
Remove assert that is invalid because TryGetMulticCallableAddrOfCode …
davidwrighton Jul 9, 2024
99ddbcb
Merge branch 'main' of github.com:dotnet/runtime into change_temporar…
davidwrighton Jul 9, 2024
34e9a75
Final (hopefully) code review tweaks.
davidwrighton Jul 12, 2024
2ed2b9c
Its possible for GetOrCreatePrecode to be called for cases where it i…
davidwrighton Jul 14, 2024
72 changes: 6 additions & 66 deletions docs/design/coreclr/botr/method-descriptor.md
Expand Up @@ -85,7 +85,9 @@ DWORD MethodDesc::GetAttrs()
Method Slots
------------

Each MethodDesc has a slot, which contains the entry point of the method. The slot and entry point must exist for all methods, even the ones that never run like abstract methods. There are multiple places in the runtime that depend on the 1:1 mapping between entry points and MethodDescs, making this relationship an invariant.
Each MethodDesc has a slot, which contains the current entry point of the method. The slot must exist for all methods, even the ones that never run like abstract methods. There are multiple places in the runtime that depend on mapping between entry points and MethodDescs.

Each MethodDesc logically has an entry point, but we do not allocate these eagerly at MethodDesc creation time. The invariant is that once the method is identified as a method to run, or is used in virtual overriding, we will allocate the entrypoint.

The slot is either in MethodTable or in MethodDesc itself. The location of the slot is determined by `mdcHasNonVtableSlot` bit on MethodDesc.
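
The lazy-allocation invariant described in this hunk can be illustrated with a small stand-alone sketch. It is not the runtime's code: `ToyMethodDesc`, `ToyEntryPoint`, `GetEntryPointIfExists`, and `GetOrCreateEntryPoint` are hypothetical names, chosen only to mirror the `IfExists` naming this PR introduces, and the "entry point" here is just a small heap object rather than a precode.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-in for a temporary entry point (a precode in the runtime).
struct ToyEntryPoint
{
    int methodId;
};

// Hypothetical stand-in for MethodDesc. The descriptor always exists,
// but the entry point behind it is only allocated on first demand.
class ToyMethodDesc
{
public:
    explicit ToyMethodDesc(int id) : m_id(id) {}

    // Never allocates: returns nullptr when no entry point exists yet.
    const ToyEntryPoint* GetEntryPointIfExists() const
    {
        return m_entryPoint.get();
    }

    // Allocates on first use, then keeps returning the same entry point.
    const ToyEntryPoint* GetOrCreateEntryPoint()
    {
        if (!m_entryPoint)
        {
            m_entryPoint = std::make_unique<ToyEntryPoint>(ToyEntryPoint{m_id});
        }
        return m_entryPoint.get();
    }

private:
    int m_id;
    std::unique_ptr<ToyEntryPoint> m_entryPoint; // empty until first requested
};
```

A diagnostic reader would use only the non-allocating accessor and treat `nullptr` as "not allocated yet", while a caller that is about to run the method would use the allocating one. The sketch ignores thread safety, which a real implementation has to handle.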

Expand Down Expand Up @@ -185,8 +187,6 @@ The target of the temporary entry point is a PreStub, which is a special kind of

The **stable entry point** is either the native code or the precode. The **native code** is either jitted code or code saved in NGen image. It is common to talk about jitted code when we actually mean native code.

Temporary entry points are never saved into NGen images. All entry points in NGen images are stable entry points that are never changed. It is an important optimization that reduced private working set.

![Figure 2](images/methoddesc-fig2.png)

Figure 2 Entry Point State Diagram
Expand All @@ -208,6 +208,7 @@ The methods to get callable entry points from MethodDesc are:

- `MethodDesc::GetSingleCallableAddrOfCode`
- `MethodDesc::GetMultiCallableAddrOfCode`
- `MethodDesc::TryGetMultiCallableAddrOfCode`
- `MethodDesc::GetSingleCallableAddrOfVirtualizedCode`
- `MethodDesc::GetMultiCallableAddrOfVirtualizedCode`

Expand All @@ -220,7 +221,7 @@ The type of precode has to be cheaply computable from the instruction sequence.

**StubPrecode**

StubPrecode is the basic precode type. It loads MethodDesc into a scratch register and then jumps. It must be implemented for precodes to work. It is used as fallback when no other specialized precode type is available.
StubPrecode is the basic precode type. It loads MethodDesc into a scratch register<sup>2</sup> and then jumps. It must be implemented for precodes to work. It is used as fallback when no other specialized precode type is available.

All other precodes types are optional optimizations that the platform specific files turn on via HAS\_XXX\_PRECODE defines.

Expand All @@ -236,7 +237,7 @@ StubPrecode looks like this on x86:

FixupPrecode is used when the final target does not require MethodDesc in scratch register<sup>2</sup>. The FixupPrecode saves a few cycles by avoiding loading MethodDesc into the scratch register.

The most common usage of FixupPrecode is for method fixups in NGen images.
Most stubs used are the more efficient form, we currently can use this form for everything but interop methods when a specialized form of Precode is not required.

The initial state of the FixupPrecode on x86:

Expand All @@ -254,67 +255,6 @@ Once it has been patched to point to final target:

<sup>2</sup> Passing MethodDesc in scratch register is sometimes referred to as **MethodDesc Calling Convention**.

**FixupPrecode chunks**

FixupPrecode chunk is a space efficient representation of multiple FixupPrecodes. It mirrors the idea of MethodDescChunk by hoisting the similar MethodDesc pointers from multiple FixupPrecodes to a shared area.

The FixupPrecode chunk saves space and improves code density of the precodes. The code density improvement from FixupPrecode chunks resulted in 1% - 2% gain in big server scenarios on x64.

The FixupPrecode chunks looks like this on x86:

jmp Target2
pop edi // dummy instruction that marks the type of the precode
db MethodDescChunkIndex
db 2 (PrecodeChunkIndex)

jmp Target1
pop edi
db MethodDescChunkIndex
db 1 (PrecodeChunkIndex)

jmp Target0
pop edi
db MethodDescChunkIndex
db 0 (PrecodeChunkIndex)

dw pMethodDescBase

One FixupPrecode chunk corresponds to one MethodDescChunk. There is no 1:1 mapping between the FixupPrecodes in the chunk and MethodDescs in MethodDescChunk though. Each FixupPrecode has index of the method it belongs to. It allows allocating the FixupPrecode in the chunk only for methods that need it.

**Compact entry points**

Compact entry point is a space efficient implementation of temporary entry points.

Temporary entry points implemented using StubPrecode or FixupPrecode can be patched to point to the actual code. Jitted code can call temporary entry point directly. The temporary entry point can be multicallable entry points in this case.

Compact entry points cannot be patched to point to the actual code. Jitted code cannot call them directly. They are trading off speed for size. Calls to these entry points are indirected via slots in a table (FuncPtrStubs) that are patched to point to the actual entry point eventually. A request for a multicallable entry point allocates a StubPrecode or FixupPrecode on demand in this case.

The raw speed difference is the cost of an indirect call for a compact entry point vs. the cost of one direct call and one direct jump on the given platform. The later used to be faster by a few percent in large server scenario since it can be predicted by the hardware better (2005). It is not always the case on current (2015) hardware.

The compact entry points have been historically implemented on x86 only. Their additional complexity, space vs. speed trade-off and hardware advancements made them unjustified on other platforms.

The compact entry point on x86 looks like this:

entrypoint0:
mov al,0
jmp short Dispatch

entrypoint1:
mov al,1
jmp short Dispatch

entrypoint2:
mov al,2
jmp short Dispatch

Dispatch:
movzx eax,al
shl eax, 3
add eax, pBaseMD
jmp PreStub

The allocation of temporary entry points always tries to pick the smallest temporary entry point from the available choices. For example, a single compact entry point is bigger than a single StubPrecode on x86. The StubPrecode will be preferred over the compact entry point in this case. The allocation of the precode for a stable entry point will try to reuse an allocated temporary entry point precode if one exists of the matching type.

**ThisPtrRetBufPrecode**

ThisPtrRetBufPrecode is used to switch a return buffer and the this pointer for open instance delegates returning valuetypes. It is used to convert the calling convention of MyValueType Bar(Foo x) to the calling convention of MyValueType Foo::Bar().
Expand Down
42 changes: 42 additions & 0 deletions src/coreclr/debug/daccess/daccess.cpp
Expand Up @@ -3239,6 +3239,10 @@ ClrDataAccess::QueryInterface(THIS_
{
ifaceRet = static_cast<ISOSDacInterface14*>(this);
}
else if (IsEqualIID(interfaceId, __uuidof(ISOSDacInterface15)))
{
ifaceRet = static_cast<ISOSDacInterface15*>(this);
}
else
{
*iface = NULL;
Expand Down Expand Up @@ -8340,6 +8344,44 @@ HRESULT DacMemoryEnumerator::Next(unsigned int count, SOSMemoryRegion regions[],
return i < count ? S_FALSE : S_OK;
}

HRESULT DacMethodTableSlotEnumerator::Skip(unsigned int count)
{
mIteratorIndex += count;
return S_OK;
}

HRESULT DacMethodTableSlotEnumerator::Reset()
{
mIteratorIndex = 0;
return S_OK;
}

HRESULT DacMethodTableSlotEnumerator::GetCount(unsigned int* pCount)
{
if (!pCount)
return E_POINTER;

*pCount = mMethods.GetCount();
return S_OK;
}

HRESULT DacMethodTableSlotEnumerator::Next(unsigned int count, SOSMethodData methods[], unsigned int* pFetched)
{
if (!pFetched)
return E_POINTER;

if (!methods)
return E_POINTER;

unsigned int i = 0;
while (i < count && mIteratorIndex < mMethods.GetCount())
{
methods[i++] = mMethods.Get(mIteratorIndex++);
}

*pFetched = i;
return i < count ? S_FALSE : S_OK;
}

HRESULT DacGCBookkeepingEnumerator::Init()
{
Expand Down
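
The `Next` contract added in this file (fill up to `count` items, report the number fetched, return `S_FALSE` once fewer than `count` were available) lends itself to a batched drain loop on the caller's side. The following is a minimal stand-alone sketch of that pattern, not SOS code: `ToySlotEnumerator`, `DrainAll`, and the `kOk` / `kFalse` / `kPointer` constants are hypothetical stand-ins for `DacMethodTableSlotEnumerator`, a client loop, and `S_OK` / `S_FALSE` / `E_POINTER`.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

constexpr long kOk = 0;       // plays the role of S_OK
constexpr long kFalse = 1;    // plays the role of S_FALSE
constexpr long kPointer = -1; // plays the role of E_POINTER

// Hypothetical enumerator with the same Next semantics as the diff above.
class ToySlotEnumerator
{
public:
    explicit ToySlotEnumerator(std::vector<int> items) : m_items(std::move(items)) {}

    long Next(unsigned int count, int out[], unsigned int* fetched)
    {
        if (!fetched || !out)
            return kPointer;

        unsigned int i = 0;
        while (i < count && m_index < m_items.size())
        {
            out[i++] = m_items[m_index++];
        }

        *fetched = i;
        return i < count ? kFalse : kOk;
    }

private:
    std::vector<int> m_items;
    std::size_t m_index = 0;
};

// Reads everything that is left in the enumerator, batchSize items at a time.
inline std::vector<int> DrainAll(ToySlotEnumerator& e, unsigned int batchSize)
{
    std::vector<int> result;
    if (batchSize == 0)
        return result; // a zero-sized batch could never make progress

    std::vector<int> buffer(batchSize);
    for (;;)
    {
        unsigned int fetched = 0;
        long hr = e.Next(batchSize, buffer.data(), &fetched);
        if (hr < 0)
            break; // failure code

        result.insert(result.end(), buffer.begin(), buffer.begin() + fetched);
        if (hr == kFalse)
            break; // short read: the enumerator is exhausted
    }
    return result;
}
```

Note that when the item count is an exact multiple of the batch size, the loop makes one extra call that fetches zero items and returns the `S_FALSE` stand-in; that call is what ends the loop.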
29 changes: 28 additions & 1 deletion src/coreclr/debug/daccess/dacimpl.h
Expand Up @@ -818,7 +818,8 @@ class ClrDataAccess
public ISOSDacInterface11,
public ISOSDacInterface12,
public ISOSDacInterface13,
public ISOSDacInterface14
public ISOSDacInterface14,
public ISOSDacInterface15
{
public:
ClrDataAccess(ICorDebugDataTarget * pTarget, ICLRDataTarget * pLegacyTarget=0);
Expand Down Expand Up @@ -1223,6 +1224,9 @@ class ClrDataAccess
virtual HRESULT STDMETHODCALLTYPE GetThreadStaticBaseAddress(CLRDATA_ADDRESS methodTable, CLRDATA_ADDRESS thread, CLRDATA_ADDRESS *nonGCStaticsAddress, CLRDATA_ADDRESS *GCStaticsAddress);
virtual HRESULT STDMETHODCALLTYPE GetMethodTableInitializationFlags(CLRDATA_ADDRESS methodTable, MethodTableInitializationFlags *initializationStatus);

// ISOSDacInterface15
virtual HRESULT GetMethodTableSlotEnumerator(CLRDATA_ADDRESS mt, ISOSMethodEnum **enumerator);

//
// ClrDataAccess.
//
Expand Down Expand Up @@ -1988,6 +1992,29 @@ class DacMemoryEnumerator : public DefaultCOMImpl<ISOSMemoryEnum, IID_ISOSMemory
unsigned int mIteratorIndex;
};

class DacMethodTableSlotEnumerator : public DefaultCOMImpl<ISOSMethodEnum, IID_ISOSMethodEnum>
{
public:
DacMethodTableSlotEnumerator() : mIteratorIndex(0)
{
}

virtual ~DacMethodTableSlotEnumerator() {}

HRESULT Init(PTR_MethodTable mTable);

HRESULT STDMETHODCALLTYPE Skip(unsigned int count);
HRESULT STDMETHODCALLTYPE Reset();
HRESULT STDMETHODCALLTYPE GetCount(unsigned int *pCount);
HRESULT STDMETHODCALLTYPE Next(unsigned int count, SOSMethodData methods[], unsigned int *pFetched);

protected:
DacReferenceList<SOSMethodData> mMethods;

private:
unsigned int mIteratorIndex;
};

class DacHandleTableMemoryEnumerator : public DacMemoryEnumerator
{
public:
Expand Down
111 changes: 104 additions & 7 deletions src/coreclr/debug/daccess/request.cpp
Expand Up @@ -219,11 +219,15 @@ BOOL DacValidateMD(PTR_MethodDesc pMD)

if (retval)
{
MethodDesc *pMDCheck = MethodDesc::GetMethodDescFromStubAddr(pMD->GetTemporaryEntryPoint(), TRUE);

if (PTR_HOST_TO_TADDR(pMD) != PTR_HOST_TO_TADDR(pMDCheck))
PCODE tempEntryPoint = pMD->GetTemporaryEntryPointIfExists();
if (tempEntryPoint != (PCODE)NULL)
{
retval = FALSE;
MethodDesc *pMDCheck = MethodDesc::GetMethodDescFromStubAddr(tempEntryPoint, TRUE);

if (PTR_HOST_TO_TADDR(pMD) != PTR_HOST_TO_TADDR(pMDCheck))
{
retval = FALSE;
}
}
}

Expand Down Expand Up @@ -424,7 +428,11 @@ ClrDataAccess::GetMethodTableSlot(CLRDATA_ADDRESS mt, unsigned int slot, CLRDATA
else if (slot < mTable->GetNumVtableSlots())
{
// Now get the slot:
*value = mTable->GetRestoredSlot(slot);
*value = mTable->GetSlot(slot);
if (*value == 0)
{
hr = S_FALSE;
}
}
else
{
Expand All @@ -435,8 +443,15 @@ ClrDataAccess::GetMethodTableSlot(CLRDATA_ADDRESS mt, unsigned int slot, CLRDATA
MethodDesc * pMD = it.GetMethodDesc();
if (pMD->GetSlot() == slot)
{
*value = pMD->GetMethodEntryPoint();
hr = S_OK;
*value = pMD->GetMethodEntryPointIfExists();
if (*value == 0)
{
hr = S_FALSE;
}
else
{
hr = S_OK;
}
}
}
}
Expand All @@ -445,6 +460,88 @@ ClrDataAccess::GetMethodTableSlot(CLRDATA_ADDRESS mt, unsigned int slot, CLRDATA
return hr;
}

HRESULT
ClrDataAccess::GetMethodTableSlotEnumerator(CLRDATA_ADDRESS mt, ISOSMethodEnum **enumerator)
{
if (mt == 0 || enumerator == NULL)
return E_INVALIDARG;

SOSDacEnter();

PTR_MethodTable mTable = PTR_MethodTable(TO_TADDR(mt));
BOOL bIsFree = FALSE;
if (!DacValidateMethodTable(mTable, bIsFree))
{
hr = E_INVALIDARG;
}
else
{
DacMethodTableSlotEnumerator *methodTableSlotEnumerator = new (nothrow) DacMethodTableSlotEnumerator();
*enumerator = methodTableSlotEnumerator;
if (*enumerator == NULL)
{
hr = E_OUTOFMEMORY;
}
else
{
hr = methodTableSlotEnumerator->Init(mTable);
}
}

SOSDacLeave();
return hr;
}

HRESULT DacMethodTableSlotEnumerator::Init(PTR_MethodTable mTable)
{
unsigned int slot = 0;

SOSMethodData methodData;
WORD numVtableSlots = mTable->GetNumVtableSlots();
while (slot < numVtableSlots)
{
MethodDesc* pMD = mTable->GetMethodDescForSlot_NoThrow(slot);
methodData.MethodDesc = HOST_CDADDR(pMD);
methodData.Entrypoint = mTable->GetSlot(slot);
methodData.DefininingMethodTable = PTR_CDADDR(pMD->GetMethodTable());
methodData.DefiningModule = HOST_CDADDR(pMD->GetModule());
methodData.Token = pMD->GetMemberDef();

methodData.Slot = slot++;

if (!mMethods.Add(methodData))
return E_OUTOFMEMORY;
}

MethodTable::IntroducedMethodIterator it(mTable);
for (; it.IsValid(); it.Next())
{
MethodDesc* pMD = it.GetMethodDesc();
WORD slot = pMD->GetSlot();
if (slot >= numVtableSlots)
{
methodData.MethodDesc = HOST_CDADDR(pMD);
methodData.Entrypoint = pMD->GetMethodEntryPointIfExists();
methodData.DefininingMethodTable = PTR_CDADDR(pMD->GetMethodTable());
methodData.DefiningModule = HOST_CDADDR(pMD->GetModule());
methodData.Token = pMD->GetMemberDef();

if (slot == MethodTable::NO_SLOT)
{
methodData.Slot = 0xFFFFFFFF;
}
else
{
methodData.Slot = slot;
}

if (!mMethods.Add(methodData))
return E_OUTOFMEMORY;
}
}

return S_OK;
}

HRESULT
ClrDataAccess::GetCodeHeapList(CLRDATA_ADDRESS jitManager, unsigned int count, struct DacpJitCodeHeapInfo codeHeaps[], unsigned int *pNeeded)
Expand Down
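
With these changes a consumer can see two "empty" values that it could not see before: an entry point of `0` (the method has no entry point allocated yet, which `GetMethodTableSlot` also signals with `S_FALSE`) and a slot of `0xFFFFFFFF` (the method has no slot). A minimal sketch of how a tool might render such records follows. `ToyMethodData`, `kNoSlot`, and `Describe` are hypothetical names for illustration; only the two sentinel values come from the diff above.

```cpp
#include <cassert>
#include <cstdint>
#include <sstream>
#include <string>

// Hypothetical, reduced version of a method record: only the two
// fields whose sentinel values matter for this sketch.
struct ToyMethodData
{
    std::uint64_t entrypoint; // 0 means "no entry point allocated yet"
    std::uint32_t slot;       // 0xFFFFFFFF means "method has no slot"
};

constexpr std::uint32_t kNoSlot = 0xFFFFFFFFu;

// Formats a record without treating either sentinel as a real value.
inline std::string Describe(const ToyMethodData& m)
{
    std::ostringstream os;

    if (m.slot == kNoSlot)
        os << "slot=<none>";
    else
        os << "slot=" << m.slot;

    os << " entry=";
    if (m.entrypoint == 0)
        os << "<not allocated>";
    else
        os << "0x" << std::hex << m.entrypoint;

    return os.str();
}
```

The point of the sketch is only that a zero entry point is now an expected state rather than an error, so display and validation code should branch on it explicitly.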
2 changes: 1 addition & 1 deletion src/coreclr/inc/corinfo.h
Expand Up @@ -893,7 +893,7 @@ enum CORINFO_ACCESS_FLAGS
{
CORINFO_ACCESS_ANY = 0x0000, // Normal access
CORINFO_ACCESS_THIS = 0x0001, // Accessed via the this reference
// UNUSED = 0x0002,
CORINFO_ACCESS_PREFER_SLOT_OVER_TEMPORARY_ENTRYPOINT = 0x0002, // Prefer access to a method via slot over using the temporary entrypoint

CORINFO_ACCESS_NONNULL = 0x0004, // Instance is guaranteed non-null

Expand Down
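
The new flag reuses the previously unused `0x0002` bit, so it combines with the other access flags as an ordinary bitmask. The sketch below shows only the bit test; the `Toy*` names and `PrefersSlot` are hypothetical, the numeric values are copied from the hunk above, and how the JIT or VM actually acts on the flag is not shown in this part of the diff.

```cpp
#include <cassert>

// Hypothetical mirror of the flag values shown in the corinfo.h hunk.
enum ToyAccessFlags : unsigned
{
    TOY_ACCESS_ANY         = 0x0000, // normal access
    TOY_ACCESS_THIS        = 0x0001, // accessed via the this reference
    TOY_ACCESS_PREFER_SLOT = 0x0002, // prefer the slot over a temporary entry point
    TOY_ACCESS_NONNULL     = 0x0004, // instance is guaranteed non-null
};

// True when the caller asked for slot-based access, regardless of other bits.
constexpr bool PrefersSlot(unsigned flags)
{
    return (flags & TOY_ACCESS_PREFER_SLOT) != 0;
}
```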
4 changes: 0 additions & 4 deletions src/coreclr/inc/gfunc_list.h
Expand Up @@ -13,10 +13,6 @@
DEFINE_DACGFN(DACNotifyCompilationFinished)
DEFINE_DACGFN(ThePreStub)

#ifdef TARGET_ARM
DEFINE_DACGFN(ThePreStubCompactARM)
#endif

DEFINE_DACGFN(ThePreStubPatchLabel)
#ifdef FEATURE_COMINTEROP
DEFINE_DACGFN(Unknown_AddRef)
Expand Down