VM: Include generic instantiations in stack frames #96440
Tagging subscribers to this area: @tommcdon
Issue Details
We currently don't have generic instantiations in the stack frames, which sometimes makes it hard to locate an exception by only seeing the stack trace without actually debugging the program.
Adding generic instantiations in the stack frames to resolve this issue.
Test code:
    using System.Diagnostics;

    new Foo<int, string>().Bar<float, string>();

    class Foo<T, U>
    {
        public void Bar<V, W>()
        {
            Console.WriteLine(new StackTrace());
        }
    }
Before:
After:
For a shared generic argument, it returns the canonical type as is, because we didn't encode the exact type info in the crawl stack frame. Still, that is better than nothing.
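To make the goal concrete, here is a minimal sketch of the kind of name the description is after. Reflection already exposes the instantiation when you hold a closed `MethodInfo`, whereas (per the description) a stack frame today only identifies the open definition. The names `InstantiationFormatter` and `Format`, and the `Type<Args>.Method<Args>` rendering, are assumptions made for illustration; they are not the PR's implementation or its output format.

```csharp
using System;
using System.Linq;
using System.Reflection;

// Hypothetical helper, not part of the PR: renders a method name with
// whatever generic arguments the reflection object carries.
public static class InstantiationFormatter
{
    public static string Format(MethodBase method)
    {
        string typeName = FormatType(method.DeclaringType);
        string methodArgs = method.IsGenericMethod
            ? "<" + string.Join(",", method.GetGenericArguments().Select(FormatType)) + ">"
            : "";
        return typeName + "." + method.Name + methodArgs;
    }

    private static string FormatType(Type type)
    {
        if (type is null) return "<unknown>";
        if (!type.IsGenericType) return type.Name;

        // Strip the arity suffix, e.g. "Foo`2" becomes "Foo".
        string baseName = type.Name;
        int tick = baseName.IndexOf('`');
        if (tick >= 0) baseName = baseName.Substring(0, tick);

        return baseName + "<" + string.Join(",", type.GetGenericArguments().Select(FormatType)) + ">";
    }
}

// Same shape as the test code in the description.
public class Foo<T, U>
{
    public void Bar<V, W>() { }
}
```

Passing the open definition `typeof(Foo<,>).GetMethod("Bar")` yields only the parameter names, while a method closed with `MakeGenericMethod(typeof(float), typeof(string))` on `Foo<int, string>` yields the concrete type names. The PR's aim is to get the second kind of information from a stack frame.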
b411c6f to 869f439
aa027dd to 2048e41
Now it's ready for review.
I start to think that reporting
Fatal CLR errors capture details that are important for the developers of the runtime because they're runtime bugs. They are not expected to be hit/consumed/processed by users. Mono has entirely different outputs for fatal runtime errors with even more confusing details. Stacks in fatal errors serve different purposes for different audiences.
The user facing behavior seemed fine to me though I still want to nail down what happens for optimized shared generic code frames.
In terms of the correctness and performance of the type system generics and reflection code I'm guessing @davidwrighton or @jkotas is probably a better person than I to look at that part.
@@ -40,27 +46,27 @@ public void StackTraceTest()
 }
 }

-[MethodImpl(MethodImplOptions.NoInlining)]
+[MethodImpl(MethodImplOptions.NoOptimization | MethodImplOptions.NoInlining)]
I don't want to weaken the existing stackwalking test by only validating it against unoptimized code. You could create some alternate frames that are unoptimized and call those instead if the switch is set.
(The same thing applies in the other tests)
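One way to read the suggestion above is sketched here: keep the existing helper optimizable and add a separate unoptimized twin that is used only when the switch is on. The switch name is taken from the PR's test diff; `FrameHelpers`, `GenericMethodUnoptimized`, and `Capture` are hypothetical names chosen for this sketch.

```csharp
using System;
using System.Diagnostics;
using System.Runtime.CompilerServices;

public static class FrameHelpers
{
    // Switch name as it appears in the PR's test diff.
    public const string SwitchName =
        "Switch.System.Diagnostics.StackTrace.ShowGenericInstantiations";

    // Existing-style helper: still eligible for optimization, so the regular
    // stack-walking coverage is not weakened.
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static StackFrame GenericMethod<T>() => new StackFrame();

    // Hypothetical alternate helper: unoptimized, used only when the switch is set.
    [MethodImpl(MethodImplOptions.NoOptimization | MethodImplOptions.NoInlining)]
    public static StackFrame GenericMethodUnoptimized<T>() => new StackFrame();

    // Picks the frame source based on the switch state.
    public static StackFrame Capture<T>()
    {
        bool enabled = AppContext.TryGetSwitch(SwitchName, out bool value) && value;
        return enabled ? GenericMethodUnoptimized<T>() : GenericMethod<T>();
    }
}
```

With this split, the expectations for the optimized helper stay exactly as they were, and the instantiation-aware expectations are attached only to the unoptimized helper.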
I also think we should explicitly have a test that demonstrates the behavior of a stack trace that occurs in optimized shared generic code where the shared generics context is no longer in scope. Even if it doesn't have exact type naming we still want to ensure it degrades gracefully rather than doing anything bad (like crashing).
Was there ever an agreed upon behavior for what text gets shown for the generic parameters in an optimized shared generic frame? I believe the text "System.__Canon" has already leaked out in a few other places such as the error handling text @hez2010 mentioned or it has shown up in VS debugger callstacks for many years when debugging optimized generic code:
I feel like the cat is already out of the bag in terms of revealing that type name, so using it here is aligned with precedent elsewhere. I'd also be fine reverting to use the type parameter name if you think it's worth trying to get the cat back into the bag @jkotas?
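A minimal sketch of the "degrades gracefully" check requested above might look like the following. It captures a trace inside an optimized generic method instantiated over a reference type (the shared-code case) and only asserts that a trace comes back and names the method. `SharedGenericProbe` and `Capture` are hypothetical names; whether the generic context is actually out of scope at the capture point depends on the JIT and is an assumption here, not something the sketch can guarantee.

```csharp
using System;
using System.Diagnostics;
using System.Runtime.CompilerServices;

public static class SharedGenericProbe
{
    // Reference-type instantiations share one compiled body; the method does
    // not otherwise use T, so nothing here forces the generic context to
    // stay alive.
    [MethodImpl(MethodImplOptions.AggressiveOptimization | MethodImplOptions.NoInlining)]
    public static string Capture<T>() where T : class
    {
        return new StackTrace().ToString();
    }
}
```

A test built on this would call `SharedGenericProbe.Capture<string>()` and check that the result is non-empty and contains `Capture`. Whether it should also assert on the exact text shown for the type arguments depends on the outcome of the `System.__Canon` discussion in this thread.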
We do not leak __Canon through public APIs. We have always treated leaking __Canon through public APIs as a bug.
I agree with #96440 (comment).
@@ -116,8 +116,16 @@ public static IEnumerable<object[]> ToString_TestData()
yield return new object[] { new StackFrame(), "MoveNext at offset {offset} in file:line:column {fileName}:{lineNumber}:{column}" + Environment.NewLine };
yield return new object[] { new StackFrame("FileName", 1, 2), "MoveNext at offset {offset} in file:line:column FileName:1:2" + Environment.NewLine };
yield return new object[] { new StackFrame(int.MaxValue), "<null>" + Environment.NewLine };
yield return new object[] { GenericMethod<string>(), "GenericMethod<T> at offset {offset} in file:line:column {fileName}:{lineNumber}:{column}" + Environment.NewLine };
yield return new object[] { GenericMethod<string, int>(), "GenericMethod<T,U> at offset {offset} in file:line:column {fileName}:{lineNumber}:{column}" + Environment.NewLine };
if (AppContext.TryGetSwitch("Switch.System.Diagnostics.StackTrace.ShowGenericInstantiations", out var showGenericInstantiations) && showGenericInstantiations)
I see this code that checks for the switch, but is there something which is setting the switch? I want to ensure the test is explicitly validating the behavior both with the switch enabled and disabled.
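For reference, a switch like this can be set in-process with `AppContext.SetSwitch`, so a test can exercise both states explicitly instead of relying on ambient configuration. The sketch below uses a hypothetical `SwitchScope.RunWithSwitch` helper. One caveat is an assumption about the feature rather than something the thread states: if the runtime reads the switch once and caches it, flipping it in-process would have no effect and each state would need its own process.

```csharp
using System;

public static class SwitchScope
{
    // Hypothetical helper: runs 'body' with the named AppContext switch set
    // to 'enabled', then restores the previous value. AppContext has no
    // "unset" operation, so a switch that was never set is restored to false.
    public static T RunWithSwitch<T>(string name, bool enabled, Func<T> body)
    {
        bool hadValue = AppContext.TryGetSwitch(name, out bool previous);
        AppContext.SetSwitch(name, enabled);
        try
        {
            return body();
        }
        finally
        {
            AppContext.SetSwitch(name, hadValue ? previous : false);
        }
    }
}
```

A test could then build its expected strings twice, once inside `RunWithSwitch(name, true, ...)` and once inside `RunWithSwitch(name, false, ...)`, and compare each against the corresponding expected format.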
fProfilerRequiresGenericsContextForEnterLeave = TRUE;
}
END_PROFILER_CALLBACK();
methInfo->options = CorInfoOptions(methInfo->options|CORINFO_GENERICS_CTXT_KEEP_ALIVE);
@AndyAyersMS - Do you know if the JIT is currently preserving the generic context when optimizations are disabled somehow? I expected this would be something we already do but looking around at how CORINFO_GENERICS_CTXT_KEEP_ALIVE is used now that didn't seem to be the case.
It is not.
Generally speaking the generic context is only kept alive if we see the CORINFO_GENERICS_CTXT_KEEP_ALIVE flag from the jithost, or if there is a runtime lookup that uses the context.
Reliability of this opt-in switch is my top concern. This change takes a dependency on generic context being tracked correctly in new places. If we want to productize this switch, we need to have a test strategy for how we ensure that it is reliable.
It seems that something around EH has changed since my last merge from the main branch, causing the GC hole issue to reappear.
Interestingly, the AV only happens on Windows x86. Investigating.
4f3b326 to 58ec764
I was trying to load the exact @jkotas Do you have any suggestion on this?
InstantiatedMethodDesc *pInstMD = InstantiatedMethodDesc::FindLoadedInstantiatedMethodDesc(th.GetMethodTable(), pConstructedFunc->GetMemberDef(), Instantiation(), false);

if (pInstMD != NULL)
{
    pConstructedFunc = pInstMD;
}
It should really be
pConstructedFunc = InstantiatedMethodDesc::FindOrCreateExactClassMethod(th.GetMethodTable(), pConstructedFunc);
but this would possibly throw or trigger a GC.
Avoid changing the behavior of the regular stackwalk - it would introduce both reliability bugs and performance regressions. Any extra work should be done only during the stackwalk that produces the diagnostic stacktrace, and only when the special mode is enabled. This goes back to my comment #96440 (comment). I will leave it to @dotnet/area-system-diagnostics area owners on how they would like to approach it. In the current shape, the feature added by this PR addresses a very niche scenario (it only helps when running a debug build outside the debugger, and one has to opt in). The cost to productize it just may not be worth it.
Hi @hez2010, I feel bad saying this after the PR has been open for so long, but I don't think this one is going to work out. I was investigating and talking a bit with Jan and Andy - the problem looks deeper than just testing. In order for this approach to work, the generic context needs to be reliable always, and that isn't a guarantee the JIT currently makes in its tracking info. Even if great testing found every case where it doesn't currently work, it's a cost both now and in the future for the entire JIT codebase to enforce stronger rules around that tracking. This feature appears too niche to make that tradeoff worthwhile. Since we didn't see any clear path to overcoming the reliability issue, I think our best option is to close this PR. Sorry to be the bearer of bad news.
Okay. I'm closing the PR.
We currently don't have generic instantiations in the stack frames, which sometimes makes it hard to locate and identify an exception by only seeing the stack trace without actually debugging the program.
Adding generic instantiations in the stack frames to resolve this issue.
This can also help const generics (#89730) if we add support for them in the future, as displaying the const value of const type arguments in the stack trace could be really helpful while diagnosing an exception.