Add dotnet-extractor tool #2145

Closed

JongHeonChoi
Contributor

@JongHeonChoi JongHeonChoi commented Apr 5, 2021

Add the dotnet-extractor tool.
(I'd like to get a suitable name. dotnet-extractor? dotnet-converter?)

Run dotnet-extractor as follows:
dotnet-extractor convert --input <exception_file_path> --assembly <assembly_directories> --pdb <pdb_directories> --output <result_file_path>

# dotnet dotnet-extractor convert -h
convert:
  Get the line number from the token value in the stacktrace

Usage:
  dotnet-extractor convert [options]

Options:
  -i, --input <input_path>            Path to the exception log file (File extension: xxxxx.log)
  -a, --assembly <path1:path2:...>    Multiple paths with assembly directories separated by colon(':')
  -p, --pdb <path1:path2:...>         Path to the pdb directory (Can be omitted if it is the same as the assembly directory path)
  -o, --output <output_path>          Path to the output file (Default: Output to console. If omitted, the xxxxx.out file is created in the same location as the log file)
  -?, -h, --help                      Show help and usage information

Related PR : dotnet/runtime#44013

For easier debugging when the PDB is not deployed, I want to add the Method Token and IL Offset to the stack trace.
When the Method Token and IL Offset are given, the line number can be obtained using PDBs in an external development environment.
Application developers can also add the Method Token and IL Offset in the application by handling exceptions directly.
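
As a rough illustration of the lookup described above, an IL offset can be mapped to a source line by finding the last sequence point at or before that offset. This is a minimal Python sketch, not the PR's implementation: `SEQUENCE_POINTS` and `line_for_offset` are hypothetical names, and the table holds made-up values where a real tool would read them from the PDB for the method identified by the token.

```python
from bisect import bisect_right

# Hypothetical sequence points for one method: (IL offset, source line),
# sorted by IL offset. Real values would come from the method's PDB entry.
SEQUENCE_POINTS = [(0x0, 12), (0x1, 13), (0x4, 14), (0xA, 15)]


def line_for_offset(sequence_points, il_offset):
    """Return the line of the last sequence point at or before il_offset."""
    offsets = [offset for offset, _ in sequence_points]
    index = bisect_right(offsets, il_offset) - 1
    if index < 0:
        return None  # the offset precedes the first sequence point
    return sequence_points[index][1]


print(line_for_offset(SEQUENCE_POINTS, 0x5))  # prints 14
```

A real PDB reader would also need to skip hidden sequence points, which this sketch ignores.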

Example:

# cat Exception.log
System.NullReferenceException: Object reference not set to an instance of an object.
   at LineNumber.App.CallA() in LineNumber.dll: token 0x6000001+0x5
   at LineNumber.App.CallB() in LineNumber.dll: token 0x6000002+0x1
#
# dotnet dotnet-extractor convert --input Exception.log
Extraction result:
   at LineNumber.App.CallA() in U:\LineNumber\LineNumber\LineNumber_App.cs:line 14
   at LineNumber.App.CallB() in U:\LineNumber\LineNumber\LineNumber_App.cs:line 29

Output: Exception.out
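
For illustration, the frame lines in the log above could be picked apart with a regular expression. This is a minimal Python sketch and not the code in this PR: `FRAME` and `parse_frame` are hypothetical names, and the pattern assumes frames look like the ones shown in this thread (with or without a space after the colon).

```python
import re

# Matches frames such as:
#    at LineNumber.App.CallA() in LineNumber.dll: token 0x6000001+0x5
FRAME = re.compile(
    r"^\s*at (?P<method>.+?) in (?P<assembly>\S+?):\s*token "
    r"(?P<token>0x[0-9a-fA-F]+)\+(?P<offset>0x[0-9a-fA-F]+)\s*$"
)


def parse_frame(line):
    """Return the parts of a token/offset frame, or None for other lines."""
    match = FRAME.match(line)
    if match is None:
        return None
    return {
        "method": match["method"],
        "assembly": match["assembly"],
        "token": int(match["token"], 16),
        "il_offset": int(match["offset"], 16),
    }


log = [
    "System.NullReferenceException: Object reference not set to an instance of an object.",
    "   at LineNumber.App.CallA() in LineNumber.dll: token 0x6000001+0x5",
    "   at LineNumber.App.CallB() in LineNumber.dll: token 0x6000002+0x1",
]
for line in log:
    print(parse_frame(line))
```

Lines that are not frames, such as the exception message, return `None` and can be copied to the output unchanged.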

@JongHeonChoi JongHeonChoi requested a review from a team as a code owner April 5, 2021 05:10
@davmason
Member

davmason commented Apr 6, 2021

Hi @JongHeonChoi, I can't actually open the referenced issue dotnet/runtime#44013, it keeps showing as a unicorn for me no matter how many times I try.

Have you discussed the architecture with anyone on the diagnostics team? I personally don't think this makes much sense as a standalone tool, it would make more sense as a mode on an existing tool (such as dotnet-stack).

@noahfalk
Member

noahfalk commented Apr 6, 2021

Thanks for the PR @JongHeonChoi! I haven't had an opportunity to dive in just yet but I can probably get to it tomorrow.

@davmason - I too am seeing the unicorn page and I saw it previously too. I hoped it would be transient but it has persisted so I opened a GitHub support ticket for it. I did some review on the PR before the page stopped working.

Have you discussed the architecture with anyone on the diagnostics team?

I discussed it a bit in that other issue. It seemed reasonable to me as a standalone tool, though I'd be glad to learn more of your thoughts. There is some prior art for us making tools that symbolicate stack traces, such as the scenario shown here using StackParser: https://stackoverflow.com/questions/34019714/steps-to-diagnose-translated-uwp-stack-trace

@noahfalk
Member

noahfalk commented Apr 6, 2021

@davmason - try using this link for PR 44013 that the support team offered instead; it solved the unicorn issue for me:

https://github.com/dotnet/runtime/pull/44013?timeline_per_page=10

@josalem
Contributor

josalem commented Apr 6, 2021

I agree with David. This feels like a good candidate for functionality on dotnet-stack. This could be a new report type on the tool, e.g., dotnet stack report -t symbolize <stack text file>.

@JongHeonChoi
Contributor Author

Thank you for the comment.

The main requirement of this PR is to provide easy debugging with only the exception log when an exception occurs.
Once the dotnet/runtime#44013 PR is merged, I think an exception will produce logs like the following.

// without PDBs
System.NullReferenceException: Object reference not set to an instance of an object.
   at ILOffsetTest.App.callD() in ILOffsetTest.dll:token 0x6000003+0xe
   at ILOffsetTest.App.callC() in ILOffsetTest.dll:token 0x6000004+0xc
   at ILOffsetTest.App.callB() in ILOffsetTest.dll:token 0x6000005+0xc
   at ILOffsetTest.App.callA() in ILOffsetTest.dll:token 0x6000006+0xc
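
For readers unfamiliar with the token values in this log: in .NET metadata, the high byte of a token identifies the metadata table (0x06 is MethodDef) and the low 24 bits are the row number in that table. The Python sketch below shows that split; `split_token` and `METHOD_DEF_TABLE` are illustrative names, not part of any tool discussed here.

```python
METHOD_DEF_TABLE = 0x06  # metadata table id for method definitions


def split_token(token):
    """Split a metadata token into (table id, row number)."""
    table = token >> 24
    row = token & 0x00FFFFFF
    return table, row


table, row = split_token(0x6000003)
print(table == METHOD_DEF_TABLE, row)  # prints True 3
```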

In general, CE products (mobile, wearable, TV, refrigerator, etc.) do not have enough memory.
So applications are often deployed without PDB files, either to reduce the deployment size or to obfuscate the code.
Debugging is difficult because the line number is not displayed in the exception log.

Usage:
  dotnet-stack [command]

Commands:
  report       reports the managed stacks from a running .NET process
  ps           Lists the dotnet processes that traces can be collected
  convert      Get the line number from the token value in the stack trace

Anyway, what do you think about moving the functionality to a new command on the dotnet-stack tool?
# dotnet stack convert --input <exception_path> --assembly <assembly_path> --pdb <pdb_path> --output <result_path>

@noahfalk
Member

noahfalk commented Apr 8, 2021

I took a look and here are first thoughts:

  1. Discussing the design (as we are doing) is a good place to start before getting into any of the implementation
  2. Upon thinking about it more I agree that dotnet-stack fits conceptually, now my next question is if it would add overly large dependencies. dotnet-stack is a tool we want people to easily deploy on production machines where download size and disk space usage matters somewhat whereas I assume this use case is primarily intended for development machines. The current code is pulling in dependencies for DiaSymReader. I don't recall how large that is?
  3. I do think we'd want to change the name, the current one doesn't imply that it is connected to stack traces or symbols. The most common term I have heard for resolving the IPs using symbol data is symbolication. As a verb, you would "symbolicate" a stack trace. So for example as a standalone tool it might be dotnet-symbolicate or as a sub-command it might be "dotnet-stack symbolicate." Very open to other suggestions too.
  4. I'm concerned that it would be very easy for users to symbolicate a stack using different versions of assemblies or PDBs than what was used to generate the log file. I don't think we have any way to prevent it but we may want to print a warning on the console that using the wrong versions of assembly/pdb will give incorrect results and the tool has no ability to validate what the user provides.

@josalem
Contributor

josalem commented Apr 8, 2021

So for example as a standalone tool it might be dotnet-symbolicate or as a sub-command it might be "dotnet-stack symbolicate." Very open to other suggestions too.

We've been pretty picky about what verbs we add to the tools so that we retain some level of uniformity across all of them. In this case, I think you're right, though. At first, I thought it made sense to put it under the report verb, but if we ever decide to move symbolication out of the trace, e.g., we don't do rundown during an eventpipe session, we will need similar functionality for other tools as well.

e.g.,

dotnet trace symbolicate <trace-file> --pdb-search-dir <>
dotnet stack symbolicate <txt file with IL offsets> --pdb-search-dir <>
dotnet dump symbolicate ...

The current code is pulling in dependencies for DiaSymReader. I don't recall how large that is?

That's a good point. @JongHeonChoi can you do a size comparison of dotnet-stack before and after your patch? We should also check the single-file distribution form as well (run the build.{sh|cmd} script with the -bundletools flag).

@noahfalk
Member

noahfalk commented Apr 8, 2021

The current code is pulling in dependencies for DiaSymReader. I don't recall how large that is?

That's a good point. @JongHeonChoi can you do a size comparison of dotnet-stack before and after your patch?

To save time I'd suggest there is no need to create a full patch to calculate this. I am fine assuming that the new code being written will have negligible size and the only thing we need to measure is the size of the new binary dependencies. So any change that adds the dependencies to the tool and then builds the bundle should be good enough as a measurement.

@JongHeonChoi
Contributor Author

The current code is pulling in dependencies for DiaSymReader. I don't recall how large that is?

That's a good point. @JongHeonChoi can you do a size comparison of dotnet-stack before and after your patch?

To save time I'd suggest there is no need to create a full patch to calculate this. I am fine assuming that the new code being written will have negligible size and the only thing we need to measure is the size of the new binary dependencies. So any change that adds the dependencies to the tool and then builds the bundle should be good enough as a measurement.

Run the script as follows:
PS C:\diagnostics> .\Build.cmd -bundletools
Is it then correct to compare the size of the file at this path (C:\diagnostics\artifacts\bin\dotnet-stack\Debug\netcoreapp3.1\win-x64\dotnet-stack.exe)?

@JongHeonChoi
Contributor Author

JongHeonChoi commented Apr 20, 2021

@noahfalk @josalem
The fundamental reason for this PR is to make it quick and easy to analyze the cause of an exception in an already released product, using only the exception log that the product prints without PDBs.
In other words, the scenario is not installing the diagnostics tools on the product and debugging while the app is running, but analyzing the exception on the host with only the assembly, PDB, and log.
Do you think this fits the concept of dotnet-stack in diagnostics?

@josalem
Contributor

josalem commented Apr 20, 2021

I went ahead and did a size comparison for adding the diasym* dependencies using dotnet-stack as the base. The nupkg is +0.3MB (1.3MB->1.6MB), the unarchived nupkg is +0.5MB (4.1MB->4.6MB), and the osx-64 single-file bundle is +0.6MB (3.5MB->4.1MB).

This is large relative to the current size of dotnet-stack. I think that is okay, though.
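
To put "large relative to the current size" in numbers, the figures quoted above work out to roughly a 12% to 23% increase depending on the distribution form. A small Python sketch of that arithmetic, using only the sizes reported in this comment (`SIZES_MB` and `percent_increase` are illustrative names):

```python
# Sizes in MB as reported in the comment above: (before, after).
SIZES_MB = {
    "nupkg": (1.3, 1.6),
    "unarchived nupkg": (4.1, 4.6),
    "osx-64 single-file bundle": (3.5, 4.1),
}


def percent_increase(before, after):
    """Relative growth from before to after, in percent."""
    return (after - before) / before * 100


for name, (before, after) in SIZES_MB.items():
    growth = percent_increase(before, after)
    print(f"{name}: +{after - before:.1f} MB ({growth:.0f}%)")
```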

I think dotnet-stack is the appropriate place to put this functionality, specifically under a symbolicate verb (@shirhatti, thoughts on this verb?).

Before we discuss implementation, we should finalize the usage of the tool. Does the following seem agreeable?

Usage:
  dotnet-stack [command]

Commands:
  report       reports the managed stacks from a running .NET process
  ps           Lists the dotnet processes that traces can be collected
  symbolicate  Use Method Tokens and IL Offsets to symbolicate a stack trace

e.g.,

dotnet stack symbolicate <trace/txt with stacks> [--search-dir <dir[;dir[;...]]>]

@JongHeonChoi
Contributor Author

JongHeonChoi commented Jul 6, 2021

Before we discuss implementation, we should finalize the usage of the tool. Does the following seem agreeable?

Sorry for replying so late. I agree with your proposal.

dotnet-stack symbolicate <trace/txt with stacks> [--search-dir <dir[;dir[;...]]>]

Additionally, I think interactive input could be useful to developers. Perhaps the tool would be used like this:

  • If you enter the stack string directly.
  1. If both assembly and pdb are in the current directory.
 # dotnet dotnet-stack symbolicate
  Enter the stack string:
     at LineNumber.App.CallA() in LineNumber.dll: token 0x6000001+0x5
     at LineNumber.App.CallB() in LineNumber.dll: token 0x6000002+0x1
  Symbolicate result:
     at LineNumber.App.CallA() in U:\LineNumber\LineNumber\LineNumber_App.cs:line 14
     at LineNumber.App.CallB() in U:\LineNumber\LineNumber\LineNumber_App.cs:line 29

 2. If the assembly and pdb are in separate directories.

 # dotnet dotnet-stack symbolicate --search-dir /opt/usr/assemblies:/opt/usr/pdbs
  Enter the stack string:
     at LineNumber.App.CallA() in LineNumber.dll: token 0x6000001+0x5
     at LineNumber.App.CallB() in LineNumber.dll: token 0x6000002+0x1
  Symbolicate result:
     at LineNumber.App.CallA() in U:\LineNumber\LineNumber\LineNumber_App.cs:line 14
     at LineNumber.App.CallB() in U:\LineNumber\LineNumber\LineNumber_App.cs:line 29

 3. Add the --output and --help options.
  - --output : Path to the output file (Default: Output to console)
  - --help : Show help and usage information

I think this is the best usage of the tool:

# dotnet dotnet-stack -h
Usage:
  dotnet-stack [command]

Commands:
  report       reports the managed stacks from a running .NET process
  ps           Lists the dotnet processes that traces can be collected
  symbolicate  Use Method Tokens and IL Offsets to symbolicate a stacktrace

#
# dotnet dotnet-stack symbolicate -h
symbolicate:
  Get the line number from the Method Token and IL Offset in the stacktrace

Usage:
  dotnet-stack symbolicate [options]

Options:
  -d, --search-dir <dir1:dir2:...>    Paths of directories containing assemblies and pdbs, separated by colon(':')
  -o, --output <output_path>          Path to the output file (Default: Output to console)
  -?, -h, --help                      Show help and usage information

Do you agree with me?

@josalem
Contributor

josalem commented Jul 6, 2021

I'm curious about the interactive mode. Is there a specific use case where that becomes more helpful than just inputting the file containing the text? My preference would be to only have an input file since that simplifies usage and is more in line with the other tools' usages, but I'm open to adding it. CC @shirhatti

I had a couple minor edits to your proposal, but I think it is good so far.

# dotnet dotnet-stack -h
Usage:
  dotnet-stack [command]

Commands:
  report       reports the managed stacks from a running .NET process
  ps           Lists the dotnet processes that traces can be collected
  symbolicate  Use Method Tokens and IL Offsets to symbolicate a stacktrace

# dotnet dotnet-stack symbolicate -h
symbolicate:
  Get the line number from the Method Token and IL Offset in the stacktrace

Usage:
  dotnet-stack symbolicate [options] <input-file/wildcard>

Options:
  -d, --search-dir <dir1;dir2;...>    Paths of directories containing assemblies and pdbs, separated by semicolon(';')
  -o, --output <output_path>          Output directly to a file.
  -?, -h, --help                      Show help and usage information

My edits:

  • Use ; instead of :
  • Change phrasing of -o flag since it is equivalent to > at the shell rather than stdout being specified as an argument on -o.
  • Require input file and take wildcard. Allows for users to specify something like dotnet stack symbolicate ./*.stacks.txt
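
As a sketch of how the proposed arguments might be interpreted, the Python below splits a `--search-dir` value on `;` and expands an input wildcard into matching files. It only illustrates the proposal above and is not the tool's implementation; `split_search_dirs` and `expand_inputs` are hypothetical names.

```python
import glob


def split_search_dirs(value):
    """Split a --search-dir value such as 'dir1;dir2' into directories."""
    return [part for part in value.split(";") if part]


def expand_inputs(pattern):
    """Expand an input file or wildcard into a sorted list of paths."""
    return sorted(glob.glob(pattern))


print(split_search_dirs("/opt/usr/assemblies;/opt/usr/pdbs"))
print(expand_inputs("./*.stacks.txt"))
```

One practical point in favor of `;`, not stated in the thread, is that Windows paths such as `U:\LineNumber` already contain `:`, so a colon separator would split them incorrectly.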

@JongHeonChoi
Contributor Author

@josalem Thank you.
I will update the code based on your review.

@JongHeonChoi JongHeonChoi deleted the add_dotnet_extractor_tool branch July 14, 2021 03:55
@JongHeonChoi
Contributor Author

I changed the branch name and uploaded a new PR (#2436).

@github-actions github-actions bot locked and limited conversation to collaborators Jan 16, 2024