-
Notifications
You must be signed in to change notification settings - Fork 4.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Non-PGO calli/delegate call transformation #109679
Conversation
@MihuBot -dependsOn 108579 |
@MihuBot -dependsOn 108579 |
@MihuBot -dependsOn 108579 |
@jkotas It seems to me like the CONTRACT_ASSERT has some bug where it crashes with some non null terminated string instead of properly asserting, should I open an issue for it?
|
Yes, it looks like a bug somewhere. Do you have a link to the crashdump or are you able to tell where the non-terminated string comes from? |
The crash is from the |
Multiple things went wrong:
The non-null terminated string is a random pointer that came from the corrupted I have tried to introduce an artificial contract violation locally and it was reported fine (no
This command is mentioned in the how-to-debug-dump.md doc that is part of the test artifacts: https://dev.azure.com/dnceng-public/public/_build/results?buildId=865877&view=ms.vss-test-web.build-test-results-tab&runId=22530538&paneView=dotnet-dnceng.dnceng-build-release-tasks.helix-test-information-tab&resultId=214160 |
I think the current version should be fine now, could you check if all looks good with the new JIT-EE call? |
src/coreclr/vm/jitinterface.cpp
Outdated
@@ -5066,7 +5101,7 @@ void CEEInfo::getCallInfo( | |||
} | |||
else | |||
{ | |||
if (!exactType.IsTypeDesc() && !pTargetMD->IsArray()) | |||
if (!exactType.IsTypeDesc() && !pTargetMD->IsArray() && !pTargetMD->IsDynamicMethod()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jkotas CoreCLR is asserting here on DynamicMethod
without adding a check here, is this the right solution? With it we can emit direct calls to them but they're still not inlined cause VM says they're DONTINLINE
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the assert?
Dynamic methods can call each other today. This code is able to handle that. How is the case that you are introducing different?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the assert?
It's the null return assert inside of this if.
How is the case that you are introducing different?
The MethodTable*
from the DynamicMethod is different from the one this method obtains from the JIT which in turn gets it from getMethodClass(replacementMethod)
. It seems like getMethodClass
causes the pointer to be normalized to the real one while the VM here has a special DynamicMethodTable*
which causes GetExactDeclaringType
to fail.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not following. DynamicMethodTable
and MethodTable
are unrelated types. DynamicMethodTable
should never show up in the JIT/EE interface calls.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not following.
DynamicMethodTable
andMethodTable
are unrelated types.DynamicMethodTable
should never show up in the JIT/EE interface calls.
I mean, the JIT is passing the proper class handle (the class the DynamicMethod is being attached to, which it gets from getMethodClass
), but the method table stored in the dynamic MethodDesc is a custom one with the debug name for it being dynamicClass
, which makes GetExactDeclaringType
fail cause that seems to have no relation to the parent type.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, so the problem is that getMethodClass
returns the owning class handle for dynamic methods, but CORINFO_RESOLVED_TOKEN
is expected to have the actual class handle for dynamic methods.
I think it would be best to fix it in the place that fills in CORINFO_RESOLVED_TOKEN
to have the actual class handle. It will make the dynamic methods path introduced by this PR work the same way as the existing paths taken by the dynamic methods.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, so the problem is that
getMethodClass
returns the owning class handle for dynamic methods, butCORINFO_RESOLVED_TOKEN
is expected to have the actual class handle for dynamic methods.I think it would be best to fix it in the place that fills in
CORINFO_RESOLVED_TOKEN
to have the actual class handle. It will make the dynamic methods path introduced by this PR work the same way as the existing paths taken by the dynamic methods.
How would I grab that from the JIT? This is where I grab that currently with getMethodClass
.
EDIT: Would you suggest to add it to the getMethodFromDelegate
JIT-EE api?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also to be honest, this seems to me a bit like CoreCLR is leaking implementation details of itself to the JIT, for me it'd make more sense if it always returned the handle of the owner class to the JIT and then mapped back to its dynamic one on its end reliably. Otherwise we're polluting the JIT with behaviours specific to one VM.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree that CoreCLR is leaking the implementation details here, but I have different opinion about what the correct behavior should be.
The dynamic method type owner is optional and it is only meant to be used for visibility checks. All visibility checks should be done on the EE side. I am not sure why the JIT ever needs to see the type owner. I think it would make more sense for the JIT to always see the special MethodTable that the dynamic methods are attached to.
Would you suggest to add it to the getMethodFromDelegate JIT-EE api?
I would think so. It may also help you with communicating precise type to the JIT. I think MethodDesc*
may not always have the precise type in case of shared generic code - notice that Delegate.GetMethodImpl
does extra work to figure out the precise type for generic types https://github.com/dotnet/runtime/blob/main/src/coreclr/System.Private.CoreLib/src/System/Delegate.CoreCLR.cs#L169-L170.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would think so. It may also help you with communicating precise type to the JIT. I think
MethodDesc*
may not always have the precise type in case of shared generic code - notice thatDelegate.GetMethodImpl
does extra work to figure out the precise type for generic types https://github.com/dotnet/runtime/blob/main/src/coreclr/System.Private.CoreLib/src/System/Delegate.CoreCLR.cs#L169-L170.
I can't find a case where this is an issue, the JIT seems to always have the proper generic context to resolve stuff so I wonder if this is just a reflection only issue?
If not, could you assist me with handling this cause I couldn't figure out how to replicate this logic in VMs?
@@ -106,7 +106,6 @@ public TypeDesc Type | |||
public DelegateInfo(TypeDesc delegateType, DelegateFeature features) | |||
{ | |||
Debug.Assert(delegateType.IsDelegate); | |||
Debug.Assert(delegateType.IsTypeDefinition); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure if this is safe but NativeAOT otherwise asserts here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It isn't, the implementation is non-sensical for instantiated types.
Is this only used to get signature of the delegate? If so, replace this:
DelegateInfo delegateInfo = _compilation.TypeSystemContext.GetDelegateInfo(calledType);
int paramCountDelegateClosed = delegateInfo.Signature.Length + 1;
with this:
int paramCountDelegateClosed = calledType.GetMethod("Invoke", null).Signature.Length + 1;
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I made it check _firstParameter
instead since AFAIR I can just do that on NativeAOT since it seems to not support closed static delegates or instance ones closed over null.
src/coreclr/tools/aot/ILCompiler.RyuJit/JitInterface/CorInfoImpl.RyuJit.cs
Outdated
Show resolved
Hide resolved
Copying and slightly editing part of my note from #108579 (comment)
Thinking about it a bit more, you might also be able to re-express this as a special case of GDV, where the analysis you do here creates a guarded delegate invoke, and then later JIT optimizations prove it's the only possibility and remove the guard test and alternative paths. @jakobbotsch might have been hinting at something like this recently. |
Sorry for the delay, I had a cold for the past few days so I didn't really feel like going back to this but now I will. I disagree with the idea that doing the transformation early is a bad idea. While doing it early on misses some cases, doing it early lets us apply all optimizations possible, while doing it later on would miss for example all optimizations done by forward-sub and earlier phases. Additionally, doing it early is still possible for a considerable about of cases including single-def locals. Additionally, doing early expansion does not block adding late expansion in any way. In fact, I'd say that having both an early expansion here for the easy cases and then a second part later on that would at first just transform the call and later on fully inline it when your proposed changes would be implemented would be the best end solution for this as it'd handle both the easy cases optimally and harder to find ones would still be handled although not as good. The main issue with my approach here is not being able to see through calls to stuff returned from methods, but AFAIR that should only be a noticeable issue with properties and such and those could be handled by something like #96325. As for compelling data on improvements, I assume that doing #85014 first would be required for obtaining such. For now, I'm afraid that One thing worth to note here is that this'd help with hiding the current limitations of PGO like the profile data not being collected per callsite, being able to remove known callsites here from the data would help with keeping it clean for other places too.
Today I've realised that'd only really be useful for handling multi-def locals. |
The failed test here is a GC hole with GC info according to the crash dump from the pipeline but I don't see anything immediately wrong with the JIT-EE API, it appears to be using |
The best first step is to find as much as possible from the crash dump collected by the CI. What did you find from the crash dump about the crash? |
@jkotas I've realised a fundamental issue with devirtualizing delegates without runtime checks: delegates can be fully legally mutated by explicitly calling the delegate cc @AndyAyersMS does PGO have any unguarded cases for delegates that would break with this? EDIT: After some more looking around, I noticed the ECMA having this: |
Why do think that it is legal? It does not sound legal to me. If it was legal to call constructors multiple times (on the types provided by the runtime in particular), things would get very interesting. |
Was it Linux-targeting Windows-hosted CrossDAC? You typically have to download this DAC manually from the special build leg that builds it (or just build it locally). Alternatively, you can debug natively on Linux using LLBD. The CrossDAC is not needed for that. |
Calling constructors multiple time is legal in the ECMA, it's only said to be not CLS compliant there. Even calling |
Does it explicitly say that it is legal somewhere? Calling constructors (via I agree that ECMA wording could have been more precise. We can fix via the augments doc if needed |
/azp run runtime-coreclr gcstress-extra, runtime-coreclr gcstress0x3-gcstress0xc |
Azure Pipelines successfully started running 2 pipeline(s). |
Ah missed the fact I was not using a CrossDAC.
What other commands would be useful here? |
For instance constructors, as far as I can see the ECMA only says that they're instance methods that are used for
Which says that it's non CLS compliant so one could assume it's only invalid for that and legal otherwise. For type initializers the ECMA says doesn't forbid it anywhere but says this:
which seems to directly suggest that calling them 2nd time explicitly from user code is legal. Since this is already broken with the |
The unmanaged callstack that you have shared earlier shows that the crash occurs during enumerating of GC root of some thread. You need to find the thread and method that the GC roots are enumerated for, and than switch to that thread in the debugger. I expect that you will find a method with bad GC info that was introduced by your change. |
Multiple initializations of instances and types might violate runtime invariants, we should forbid it then as discussed in dotnet#109679. Users are not expected to have been relying on the behaviour being legal, especially since multiple type initializations are already resulting in invalid behaviour due to JIT optimizations.
Draft Pull Request was automatically closed for 30 days of inactivity. Please let us know if you'd like to reopen it. |
I've decided to split my earlier attempt from #85197 into first adding just the transformation and only then making it find more cases with substitution.
TODO before merge:
TODO after merge:
DynamicMethod
s asDONTINLINE
.