Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add IL Offset and Method Token to stacktrace #44013

Merged
merged 14 commits into from
Apr 12, 2021

Conversation

wscho77
Copy link
Contributor

@wscho77 wscho77 commented Oct 29, 2020

Add IL Offset and Method Token info to the stacktrace if PDBs are not deplyed.

It is often deployed without a PDB file to reduce deploying size or for obfuscation.
If deployed without PDB like this, debugging becomes difficult because the line number are not displayed in the exception stacktrace.

For easier debugging, if the PDB is not deployed, I want to add the Method Token and IL Offset at the stacktrace.
When the Method Token and IL Offset are given, the line number can be obtained using PDBs in external development environment.
Applicaiton developer can also add Method Token and IL Offset in the application by directly handle exceptions.
However, to apply this functionality to the entire application of the platform, it is necessary to add this at the runtime side.

The functionality can be turned on and off with the below switch name, and it is truned on by default.

Switch.System.Diagnostics.StackTrace.ShowILOffsets

You can set the value of switch with AppContext.SetSwitch() method.

(Example)

// with PDBs
System.NullReferenceException: Object reference not set to an instance of an object.
   at ILOffsetTest.App.callD() in C:\ILOffsetTest\ILOffsetTest\ILOffsetTest_App.cs:line 49
   at ILOffsetTest.App.callC() in C:\ILOffsetTest\ILOffsetTest\ILOffsetTest_App.cs:line 54
   at ILOffsetTest.App.callB() in C:\ILOffsetTest\ILOffsetTest\ILOffsetTest_App.cs:line 59
   at ILOffsetTest.App.callA() in C:\ILOffsetTest\ILOffsetTest\ILOffsetTest_App.cs:line 64

// without PDBs
System.NullReferenceException: Object reference not set to an instance of an object.
   at ILOffsetTest.App.callD()
   at ILOffsetTest.App.callC()
   at ILOffsetTest.App.callB()
   at ILOffsetTest.App.callA()

// with this PR
System.NullReferenceException: Object reference not set to an instance of an object.
   at ILOffsetTest.App.callD() in ILOffsetTest.dll:token 0x6000003+0xe
   at ILOffsetTest.App.callC() in ILOffsetTest.dll:token 0x6000004+0xc
   at ILOffsetTest.App.callB() in ILOffsetTest.dll:token 0x6000005+0xc
   at ILOffsetTest.App.callA() in ILOffsetTest.dll:token 0x6000006+0xc

@Dotnet-GitSync-Bot Dotnet-GitSync-Bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Oct 29, 2020
@dnfadmin
Copy link

dnfadmin commented Oct 29, 2020

CLA assistant check
All CLA requirements met.

@marek-safar
Copy link
Contributor

Should we do such changes always via feature switches to make them universally available? Why COMPlus_ILOffsetToStackTrace name?

@wscho77
Copy link
Contributor Author

wscho77 commented Nov 2, 2020

Should we do such changes always via feature switches to make them universally available? Why COMPlus_ILOffsetToStackTrace name?

In many embedded system case, the PDBs cannot be deployed together due to insufficient storage issue.
I think this feature will be very helpful for debugging on this kind of system.

The environment variable name is set as such because it is a feature that adds il offset to the stacktrace when there is no line information. If there are better suggestions, it can be changed.

@ghost
Copy link

ghost commented Dec 19, 2020

Tagging subscribers to this area: @tommcdon
See info in area-owners.md if you want to be subscribed.

Issue Details

Add IL Offset and Method Token info to the stacktrace if PDBs are not deplyed.

It is often deployed without a PDB file to reduce deploying size or for obfuscation.
If deployed without PDB like this, debugging becomes difficult because the line number are not displayed in the exception stacktrace.

For easier debugging, if the PDB is not deployed, I want to add the Method Token and IL Offset at the stacktrace.
When the Method Token and *IL Offset are given, the line number can be obtained using PDBs in external development environment.
Applicaiton developer can also add Method Token and IL Offset in the application by directly handle exceptions.
However, to apply this functionality to the entire application of the platform, it is necessary to add this at the runtime side.

The functionality can be turned on and off with the environment variable COMPlus_ILOffsetToStackTrace.

(Example)

// with PDBs
System.NullReferenceException: Object reference not set to an instance of an object.
   at ILOffsetTest.App.callD() in C:\ILOffsetTest\ILOffsetTest\ILOffsetTest_App.cs:line 49
   at ILOffsetTest.App.callC() in C:\ILOffsetTest\ILOffsetTest\ILOffsetTest_App.cs:line 54
   at ILOffsetTest.App.callB() in C:\ILOffsetTest\ILOffsetTest\ILOffsetTest_App.cs:line 59
   at ILOffsetTest.App.callA() in C:\ILOffsetTest\ILOffsetTest\ILOffsetTest_App.cs:line 64

// without PDBs
System.NullReferenceException: Object reference not set to an instance of an object.
   at ILOffsetTest.App.callD()
   at ILOffsetTest.App.callC()
   at ILOffsetTest.App.callB()
   at ILOffsetTest.App.callA()

// with this PR
System.NullReferenceException: Object reference not set to an instance of an object.
   at ILOffsetTest.App.callD() in ILOffsetTest.dll: methodToken(0x6000003), ILOffset(0xe)
   at ILOffsetTest.App.callC() in ILOffsetTest.dll: methodToken(0x6000004), ILOffset(0xc)
   at ILOffsetTest.App.callB() in ILOffsetTest.dll: methodToken(0x6000005), ILOffset(0xc)
   at ILOffsetTest.App.callA() in ILOffsetTest.dll: methodToken(0x6000006), ILOffset(0xc)
Author: wscho77
Assignees: -
Labels:

area-CodeGen-coreclr, area-Diagnostics-coreclr

Milestone: -

@AndyAyersMS AndyAyersMS removed the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Dec 19, 2020
@danmoseley
Copy link
Member

@tommcdon this is one of our older PR's - I wonder whether we could assign someone to shepherd it? that will show up on https://aka.ms/dotnetruntimepulls and you can track there if you find that helpful..

Base automatically changed from master to main March 1, 2021 09:07
@mikem8361 mikem8361 self-requested a review March 8, 2021 21:21
@mikem8361
Copy link
Member

Can you rebase this on the latest "main" and push>

if (s_ilOffsetToStackTrace)
{
sb.Append(' ');
string methodToken = mb.MetadataToken.ToString("x");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

MetadataToken is not always available and it can throw exceptions.

For example, it is not available for DynamicMethods. This property throws System.InvalidOperationException on DynamicMethods.

Metadata tokens are also not available in some flavors of AOT.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for review. I'll check those case and fix it

sb.Append(' ');
string methodToken = mb.MetadataToken.ToString("x");
string ilOffset = sf.GetILOffset().ToString("x");
sb.AppendFormat(CultureInfo.InvariantCulture, inFileILOffset, mb.Module.Name, methodToken, ilOffset);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Module.Name will just return <Unknown> in single file mode, and in other cases where the module is not loaded from physical file on disk. Would it better to use Assembly simple name instead?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As you said, assembly simple name looks better.

@jkotas
Copy link
Member

jkotas commented Mar 9, 2021

Should we do such changes always via feature switches to make them universally available?

I agree that enabling this via environment variable won't work for number of cases.

The meta point behind this and other feedback: Is this just a quick undocumented hack that does not work half of the time, or should this be a proper feature, with tests, documentation, etc.?

@wscho77
Copy link
Contributor Author

wscho77 commented Mar 9, 2021

Should we do such changes always via feature switches to make them universally available?

I agree that enabling this via environment variable won't work for number of cases.

The meta point behind this and other feedback: Is this just a quick undocumented hack that does not work half of the time, or should this be a proper feature, with tests, documentation, etc.?

Considering a case where there is a limitation in ROM size like a CE device or an IoT device, I think that this should be a proper feature. In addition, if this feature is merged, we want to add a functionality to getting line information from methodToken and ILOffset via diagnostics. What do you think about this?

{
sb.Append(' ');
string methodToken = mb.MetadataToken.ToString("x");
string ilOffset = sf.GetILOffset().ToString("x");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ToString use current culture which is huge dependency that can fail, you should pass CultureInfo.InvariantCulture in.

Nonetheless, these are complex calls and can fail so it'd be better to handle exceptions too.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll fix it

@@ -18,6 +18,7 @@ public partial class StackTrace
{
public const int METHODS_TO_SKIP = 0;

private static readonly bool s_ilOffsetToStackTrace = Environment.GetEnvironmentVariable("COMPlus_ILOffsetToStackTrace") == "1";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it'd be better to avoid static constructor in StackTrace

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A static constructor was used to avoid unnecessarily checking the environment variable multiple times. Could this be a problem?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this could cause problems when for example Environment .cctor crashes. Using env variables will also require them to be set every time or won't work on deployments where setting env variables are not supported/allowed. I think using https://github.com/dotnet/runtime/blob/main/src/libraries/System.Private.CoreLib/src/System/LocalAppContextSwitches.cs would be better.

/cc @jkotas

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code that gets the Environment value from the static constructor is already used in various runtimes.
see :

private static readonly bool s_shouldWriteToStdErr = Environment.GetEnvironmentVariable("COMPlus_DebugWriteToStdErr") == "1";

Also, if there is a crash in Environment .cctor, it is thought that there is a problem in the system already. Should this be considered?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@marek-safar All review contents have been applied except for the use of static constructor. If it is necessary to change the static constructor to LocalAppContextSwitches, I will apply that as well. Please give me your opinions.

Copy link
Contributor

@marek-safar marek-safar Mar 15, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

StackTrace is used (and usable) for crashes in the Environment as well so creating circular type initialization dependency between these types is not desirable.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I'll change it to use LocalAppContextCaches.

@@ -3349,6 +3349,9 @@
<data name="SpinWait_SpinUntil_TimeoutWrong" xml:space="preserve">
<value>The timeout must represent a value between -1 and Int32.MaxValue, inclusive.</value>
</data>
<data name="StackTrace_InFileILOffset" xml:space="preserve">
<value>in {0}: methodToken(0x{1}), ILOffset(0x{2})</value>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we use the parentheses syntax anywhere in .NET. Also not sure why ILOffset and not IL.

Copy link
Contributor Author

@wscho77 wscho77 Mar 9, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I used the parentheses syntax because it seems more recognizable. Is there a better way to recommend it?

For example,

at ILOffsetTest.App.callD() in ILOffsetTest.dll: methodToken(0x6000003), ILOffset(0xe)
-->
at ILOffsetTest.App.callD() in ILOffsetTest.dll: methodToken `0x6000003`, ILOffset : `0xe`
or
at ILOffsetTest.App.callD() in ILOffsetTest.dll: methodToken 0x6000003, ILOffset 0xe

ILOffset is a property name of StackFrame. I think that ILOffset is more clear than IL because it is a value representing the offset from the method starting point.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd go with simple

at ILOffsetTest.App.callA() in test.dll:token 0x6000003(0xE)

@jkotas thoughts?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The debuggers and other diagnostic tools typically use + to denote the code offset. I do ever remember seeing the code offset in (...). Is there any existing tool that prints it like that?

Also, I would omit the space between "token" and the token value to make the two more connected.

There two suggestions combined together would make it:

at ILOffsetTest.App.callA() in test.dll:token6000003+0xE

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The last suggestion looks good. I will fix it like that.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When I test the last suggestion, it is difficult to recognize if there is no space.

at ILOffsetTest.App.callA() in ILOffsetTest.dll: token0x6000006+0xc

How about add a space after token?

at ILOffsetTest.App.callA() in ILOffsetTest.dll: token 0x6000006+0xc

Line information also has a space between line and number as shown below.

at ILOffsetTest.App.callA() in C:\ILOffsetTest\ILOffsetTest\ILOffsetTest_App.cs:line 64

Comment on lines 327 to 329
else
{
if (s_ilOffsetToStackTrace)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fold this into else if

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll fix it.

@wscho77 wscho77 force-pushed the add_il_offset_to_stacktrace branch from 0d3ce1b to 5b6de18 Compare March 10, 2021 23:21
Comment on lines 331 to 332
string methodToken = mb.MetadataToken.ToString("x", CultureInfo.InvariantCulture);
string ilOffset = sf.GetILOffset().ToString("x", CultureInfo.InvariantCulture);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, this could be better done with HexConverter

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you. It seems better to use Convert like below.

    string methodToken = Convert.ToString(mb.MetadataToken, 16);
    string ilOffset = Convert.ToString(sf.GetILOffset(), 16);

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please avoid Convert too, it's unnecessary complex. The code could be as simple as

Span<byte> bytes = stackalloc byte[4];
BinaryPrimitives.WriteUInt32BigEndian(bytes, mb.MetadataToken);
string methodToken = HexConverter.ToString(bytes.Slice(0, 4), HexConverter.Casing.Lower);

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll modify it to use HexConverter like your suggestion.

@danmoseley
Copy link
Member

The example I linked is backwards. Here is a good example

[ActiveIssue("https://github.com/dotnet/runtime/issues/35927", typeof(PlatformDetection), nameof(PlatformDetection.IsMonoInterpreter))]

It would contain a link to the new issue you would open.

@wscho77
Copy link
Contributor Author

wscho77 commented Apr 12, 2021

I cannot find "PlatformDetection.IsNotMonoInterpreter" in the latest runtime code.
Could I add IsNotMonoInterpreter in the PlatformDetection.cs? or just add checking code in the TC code with IsMonoInterpreter like below?

        public void ToString_ShowILOffset()
        {
            if (PlatformDetection.IsMonoInterpreter)
                return;

            ....
        }

@wscho77
Copy link
Contributor Author

wscho77 commented Apr 12, 2021

I cannot find "PlatformDetection.IsNotMonoInterpreter" in the latest runtime code.
Could I add IsNotMonoInterpreter in the PlatformDetection.cs? or just add checking code in the TC code with IsMonoInterpreter like below?

        public void ToString_ShowILOffset()
        {
            if (PlatformDetection.IsMonoInterpreter)
                return;

            ....
        }

Oh, I can skip the test with [ActiveIssue("xxxx", typeof(PlatformDetection), nameof(PlatformDetection.IsMonoInterpreter))].
Please ignore my question :)

@wscho77
Copy link
Contributor Author

wscho77 commented Apr 12, 2021

The example I linked is backwards. Here is a good example

[ActiveIssue("https://github.com/dotnet/runtime/issues/35927", typeof(PlatformDetection), nameof(PlatformDetection.IsMonoInterpreter))]

It would contain a link to the new issue you would open.

I create issue and updated TC to skip in mono interpreter mode.

@danmoseley
Copy link
Member

BTW due to Github bug, to open this PR I have to adjust the URL like this: #44013

@danmoseley danmoseley merged commit 19f0ea7 into dotnet:main Apr 12, 2021
@danmoseley
Copy link
Member

Thank you @wscho77 for the contribution. We would welcome further contributions if you are interested.

@danmoseley
Copy link
Member

It will be interesting to see whether we get feedback when people see this in Preview 4. I would imagine the scenario would be callstacks displayed to end users, or received by developers by some telemetry mechanism, since that is the case that is likely missing PDB's.

@wscho77
Copy link
Contributor Author

wscho77 commented Apr 12, 2021

BTW due to Github bug, to open this PR I have to adjust the URL like this: #44013

To avoid page loading time out (Github bug), I added "?timeline_per_page=5" after link. :)
For example,

https://github.com/dotnet/runtime/pull/44013?timeline_per_page=5

wscho77 added a commit to wscho77/runtime that referenced this pull request Apr 15, 2021
In order to avoid failure due to modification of dotnet#44013,
the TC has been modified so that the content after':' is not compared.
BruceForstall pushed a commit that referenced this pull request Apr 15, 2021
* Modify TC to compare strings before ':'

In order to avoid failure due to modification of #44013,
the TC has been modified so that the content after':' is not compared.

* Enable OOMException01

Failed TC is fixed at #51280
@ghost ghost locked as resolved and limited conversation to collaborators May 12, 2021
@karelz karelz added this to the 6.0.0 milestone May 20, 2021
@mangod9
Copy link
Member

mangod9 commented Jul 29, 2021

Agreed I like AppContext better as the config mechanism. At this point I'd keep this off-by-default but it certainly feels plausible we'd change the default to this in the future or that it might differ by app-model. I expect some users that would appreciate this capability and others who would think it was confusing or clutter. Its not clear to me which group is larger but in lieu of more feedback/some user research I'd err towards keeping the status quo for our defaults.

We have a sequence of previews designed to get such feedback: if it is opt in, I suspect almost nobody will use it so there will be no feedback. I believe we should switch it on in preview 4 (which snaps in a couple weeks or so) with a flag of some sort to switch it off. In preview 6 or 7, make a decision whether to have it on or off by default, and whether to retain the flag or not.

What has the feedback around this been like, looks like its still enabled by default? Doesn't feel adding the token info has any value as part of the exception stack for most scenarios?

@noahfalk
Copy link
Member

I haven't seen any feedback either for it or against it. That could mean nobody cared, or nobody is using the relevant scenario, or I am not seeing the right set of feedback. If anyone else has seen feedback I'd love to hear of it.

@danmoseley
Copy link
Member

We had assumed the number of customers that would benefit from it is quite small (making sense of IL offsets) but those that do really need it - so perhaps we shouldn't expect much positive feedback. But I haven't seen negative feedback either.

My suggestion is to leave it as-is, given we have the appcontext flag, and then based on feedback we can next cycle either leave as is, reverse the default, or just remove the flag.

@MichalStrehovsky
Copy link
Member

MichalStrehovsky commented Aug 2, 2021

My 2 cents: the extra info makes it more likely a stack trace line will span two lines instead of a single line in the console output, making the stack harder to read. Here's a stack trace I was looking at the other day and still had the console window open:

image

5 lines out of 7 are wrapped; 2 4 of those are only wrapped because of the extra info I don't particularly care about.

@MichalStrehovsky
Copy link
Member

Correction: 4 lines out of 5 wrapped are caused by this. I just realized this change added the module name as well. So I would only see a single wrapped line out of 7 without this change.

That's why it felt so jarring first time I saw it, I guess.

@mangod9
Copy link
Member

mangod9 commented Aug 2, 2021

I have a similar feeling around the extra info. Stack traces are frequently included in application logs, so this would increase log sizes. Ideally this should be opt-in (esp. the token+offset)

@danmoseley
Copy link
Member

Some thoughts.. ultimately just guesses.

Whenever there's been discussion about what should appear in stack traces, I have seen community feedback has been that the more info we include the better as often they're all the info available to them to trace some elusive issue.

Line numbers or equivalent are important, viz the perennial request to know which object in a single line of code caused a NRE.

Changing the setting would in many cases mean redeploying an app to one's customers, since the scenario where there's no symbols is likely in customer deployment.

I would expect most logs have relatively few stack traces in, or you have bigger problems than the log size?

@noahfalk
Copy link
Member

noahfalk commented Aug 2, 2021

I have seen community feedback has been that the more info we include the better

We've also seen the community complain about stack trace text being overly noisy/confusing and then being very happy when work was done to simplify it (for example Ben Adams added the StackTraceHidden attribute to remove frames).

I'm not convinced this feature will be easy for most devs to use even if we leave it on by default. Converting the IL offsets back into source line offsets requires the developer gives a path to the correct build's PDBs. Unlike dump debugging we have no way to warn them if they specify the wrong ones (because there is no module versioning info in the output text).

I also lean towards saying this behavior should be off by default for this release as we continue to get feedback. We could always blog about this or put it in our docs as alternate ways to spread awareness.

@danmoseley
Copy link
Member

Fair enough!

@mangod9
Copy link
Member

mangod9 commented Aug 3, 2021

PR to disable by default: #56780. @wscho77 please review.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.