Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

(Portable) Source Link #12759

Closed
tmat opened this issue Jul 26, 2016 · 65 comments
Closed

(Portable) Source Link #12759

tmat opened this issue Jul 26, 2016 · 65 comments

Comments

@tmat
Copy link
Member

tmat commented Jul 26, 2016

To complete the E2E experience with Portable PDBs (embedded or not), the debugger needs to be able to find out the sources for documents included in the PDB. In case of Windows PDB a post-build rewriter tool injects the source document URLs into the PDB.

Rewriting is not desirable. I propose we add a new compiler switch /sourcelink:<file> that tells the compiler to embed a blob into the PDB describing how to retrieve sources from a source control server.
This technique could be used for both Windows and Portable PDBs, although we would initially only support the switch when /debug:portable or /debug:embedded is also specified.

In case of Portable PDBs the content of the source link file would be in JSON format with the following structure:

{
  "documents": {
     "f:/build/*" : "https://raw.githubusercontent.com/OmniSharp/omnisharp-vscode/14fbdd1897c2993c0a08e224c8eb7c36b82e8cb8/*"
  }
}

The content of the file would be generated by a build task preceding csc task. The build task would be specific to source control used by the project (git, TFS, etc.) and would query the source control for the current commit sha and generate the corresponding URL to the .json file.

Implementation details

In Portable PDB the source link data would be saved in a Custom Debug Information blob associated with the assembly.

The Portable PDB bridge would implement method already exposed on ISymUnmanagedReader4:

public int GetSourceServerData(out byte* data, out int size)

It would read the CDI and return the raw content. The debugger would interpret the data.

An alternative approach

would be to embed the source link file into the .dll itself as a managed resource (/resource:source_link.json). Although this would avoid compiler switch it would require a convention for the resource name. Unless we use some kind of GUID the resource name might collide with some resource that the project already has. Also it's odd that this information would be in the .dll and not where all other debug information is - in the .pdb (embedded or not). At the end of the day a dedicated compiler switch is imo cleaner.

@tmat tmat added this to the 2.0 (RC) milestone Jul 26, 2016
@tmat tmat self-assigned this Jul 26, 2016
@tmat
Copy link
Member Author

tmat commented Jul 26, 2016

@tmat
Copy link
Member Author

tmat commented Jul 26, 2016

@dotnet/roslyn-compiler Any objections? @jaredpar I know that adding a new compiler option is not popular, but amending output of the compiler to avoid that is much worse imo.

@tmat
Copy link
Member Author

tmat commented Jul 26, 2016

@ctaggart FYI, related to #4119.

@ctaggart
Copy link
Contributor

@tmat Thanks for adding me. I like the name of the option! 👍 This is indeed what we discussed last November. Very excited to see progress on it.

Regarding E2E experience, it is also important that private repositories, such as GitHub private repositories (ctaggart/SourceLink#48), be indexable too. GitHub private repos do not return a HTTP 401. However, you know ahead of time that they are secure and the clients could prompt for credentials before making the request. Would it be possible to specify that it needed to be secure in the json document for certain documents? May be standard types of security can be supported?

@tmat
Copy link
Member Author

tmat commented Jul 26, 2016

@ctaggart Just to be clear, the content of the file is opaque to the compiler. The interpretation is done in the debugger. I assume we're gonna support authentication in some form at some point. @gregg-miskelly ?

@gregg-miskelly
Copy link

Yes, I think we would eventually want to support authentication. At the moment though, I am not quite sure how this would work. @ctaggart do you know much about how GitHub private authentication works? Is the auth connection token part of the URL? a cookie? how could a client program integrate with them?

@noahfalk
Copy link
Member

I'm good the proposal. I agree that there is little reason to put this file in the dll.

Aside: Some day the debug content that might be more useful in the dll is a list of file names and hashes. This allows a debugger to quickly determine if a breakpoint that the user has bound in an open file in the IDE matches the file used to compile the debuggee. Today the debugger needs to pause the debuggee at each module load to acquire+load the symbols in order to do this check.

@ctaggart
Copy link
Contributor

@gregg-miskelly GitHub private authentication is documented here https://developer.github.com/v3/auth/. The Basic Authentication support is what I used to SourceLink-Proxy to enable private repos to work with today's source indexing. Supporting their standard OAuth flow would be better.

@mjsabby
Copy link
Contributor

mjsabby commented Jul 27, 2016

@tmat how are you doing the embedded pdb inside the dll? Is that a well-known resource name?
Adding a compiler switch means a precipitous drop from who will be using it.

@tmat
Copy link
Member Author

tmat commented Jul 27, 2016

@mjsabby It's a debug table entry: see spec https://github.com/dotnet/corefx/blob/master/src/System.Reflection.Metadata/specs/PE-COFF.md#embedded-portable-pdb-debug-directory-entry-type-17

Not sure what you mean. The idea is that to use this feature one would use an msbuild task that produces the ,json file and passes it into the csc task. For git it could be something like:

  <Target Name="GenerateSourceLink">
    <Exec Command="git rev-parse HEAD" >
      <Output TaskParameter="ConsoleOutput" PropertyName="LatestCommit" />
    </Exec>
    <WriteLinesToFile File="source_link.json" Lines='{"documents": { "$(OutputDir)/*" : "$(RepoUri)/$(LatestCommit)/*" }}' />
 </Target>  

@mjsabby
Copy link
Contributor

mjsabby commented Jul 27, 2016

@tmat I was talking about your alternative where you said the one thing preventing putting this inside the dll is a well-known name. And I was pointing out that the same problem exists in embedding the PDB inside the dll, so however you solved that (if you did) would possibly apply here.

@tmat
Copy link
Member Author

tmat commented Jul 27, 2016

We didn't have a problem with well-known name since we used debug directory to embed the pdb.

@tmat
Copy link
Member Author

tmat commented Sep 3, 2016

Remaining work:

  • Debugger support
  • msbuild task/snippet sample to generate the .json file from GIT repo
  • E2E testing in .NET Core projects

@AArnott
Copy link
Contributor

AArnott commented Nov 21, 2016

I see $(SourceLink) is already wired up to the Csc task. I'm interested in enhancing PdbGit to set this property and skip the PDB rewrite, but I'd need to know a couple things:

  1. How to detect during the build that $(SourceLink) is supported, since I would end up skipping PDB rewriting later when this is the case. I don't want to just set an MSBuild property and assume the customer is using the version of the .targets that support it. So what property or other clue can I check for to know it will be honored?
  2. Capitalization is critical in git. GitHub, for instance, is case sensitive in its URLs. There have been multiple reports on GitLink of some source files not downloading properly that were traceable to capitalization being different in the MSBuild project file's @(Compile) items. Will the compiler construct the URLs from the normalized capitalization on disk (including each level of folder names)? That wouldn't guarantee it works (such as in the event of a case rename in git that doesn't update the disk). In PdbGit I solve this by querying .git at HEAD for each individual file to make certain I have the proper casing. But I don't see that as an option here, unless that .json file allows for supplying each path.

@jaredpar
Copy link
Member

So what property or other clue can I check for to know it will be honored?

I don't know that such a feature exists. I definitely understand what you're asking for here as it's a problem we've run into as well.

Will the compiler construct the URLs from the normalized capitalization on disk (including each level of folder names)?

There are a number of items to consider here:

  • MSBuild Csc task:
    • Will attempt to normalize the provided source files based on Path.GetFullPath here
    • Will not attempt any normalization on paths passed via response files.
  • csc compiler: doesn't do any normalization

@AArnott
Copy link
Contributor

AArnott commented Nov 21, 2016

Will attempt to normalize the provided source files based on Path.GetFullPath

Beware: that API only normalizes the leaf name. It does nothing for the folders that the leaf item belongs to. So you have to call GetFullPath for each path segment.

Will not attempt any normalization on paths passed via response files.

The response files are already created by the .targets, aren't they? And they don't contain normalized paths AFAIK. So how does that solve the problem?

csc compiler: doesn't do any normalization

That's fine, perhaps for perf reasons for everyone who doesn't care about linking a PDB to a git repo. But it means we need to solve the problem elsewhere.

@tmat
Copy link
Member Author

tmat commented Nov 21, 2016

If there is a problem with normalization I think the best place to address it is in msbuild.

Note that /sourcelink specifies a mapping from local paths as stored in the PDB to the repo URLs. The mapping also supports suffix wildcard as shown in the example above (in the description of the PR). When applied by the debugger the mappings are sorted by the key length, so if there is a more specific mapping and wildcard mapping that overlap the more specific is chosen (e.g. "C:\A\B\foo.cs": x and "C:\A\*": y\* then debugger maps C:\A\B\foo.cs to x and C:\A\B\bar.cs to y\B\bar.cs).

So perhaps you can explicitly include the paths whose file name differs in the repo from the disk path.

@gregg-miskelly
Copy link

In case this feeds into the calculus - I would personally be in favor our source link support eventually coming to Windows PDBs. But the support would only be in new debuggers. So project owners would likely want to be able to choose between sourcelink vs. source server vs. both for Windows PDB projects.

@tmat
Copy link
Member Author

tmat commented Feb 6, 2017

@gregg-miskelly SourceLink for Windows PDB is certainly an option.

I'd like to emphasize a major difference between /sourcelink and the current source linking tools -- that is while the current tools post-process the PDB, /sourcelink is a compiler option that is specifically designed to avoid the need for rewriting the PDB. The tools may be able to share a basic logic dealing with GIT but the /sourcelink tools won't need to deal with PDBs at all. Hence the same tool could be used for both PDB formats (assuming /sourcelink is supported in Windows PDBs at some point).

@ctaggart
Copy link
Contributor

I hope to have the tool for creating the source link out later this week. In the spirit of writing the tests first, I just posted a prerelease of dotnet-sourcelink that can test all the checksums in a Portable PDB.

PS C:\Users\camer\cs\sourcelink-test\src\ClassLibrary1> dotnet sourcelink
SourceLink: Source Code On Demand

Usage: dotnet sourcelink [options] [command]

Options:
  -h|--help  Show help information

Commands:
  print-documents  print the documents stored in the Portable PDB file
  print-json       print the Source Link JSON stored in the Portable PDB file
  print-urls       print the URLs for each document based on the Source Link JSON
  test-urls        test each URL and verify that the checksums from the Portable PDB match

Use "dotnet sourcelink [command] --help" for more information about a command.

PS C:\Users\camer\cs\sourcelink-test\src\ClassLibrary1> dotnet sourcelink test-urls -p .\bin\Debug\netstandard1.4\ClassLibrary1.pdb
4dceb9b1cc934a16f10e766eab0497ef1d18fde0 sha1 csharp C:\Users\camer\cs\sourcelink-test\src\ClassLibrary1\Class1.cs
https://raw.githubusercontent.com/ctaggart/sourcelink-test/b5012a98bed12f6704cb942e92ba34ccdbd920d8/src/ClassLibrary1/Class1.cs
<DotNetCliToolReference Include="dotnet-sourcelink" Version="[2.0.0-b335]" />

Progress.
https://github.com/ctaggart/SourceLink

@tmat
Copy link
Member Author

tmat commented Feb 20, 2017

@ctaggart Cool. dotnet tools usually pick up the project context from the .csproj in the current directory. Perhaps the PDB should be picked up from the project rather than being passed on command line -p .\bin\Debug\netstandard1.4\ClassLibrary1.pdb?

/cc @srivatsn

@tmat
Copy link
Member Author

tmat commented Feb 20, 2017

@srivatsn Do we have design guidelines for dotnet tools? How dotnet tools should correspond to msbuild task, etc.

@AArnott
Copy link
Contributor

AArnott commented Feb 20, 2017

This is a matter I've been discussing with @ctaggart. He is of the opinion that SourceLink should ship as a CLI tool, with a simple MSBuild wrapper. I'm advocating for just MSBuild integration so that it's totally automatic.

MSBuild doesn't (yet?) support loading MSBuild Tasks and their dependencies from restored packages the way CLI tools do though, which is what @ctaggart is focusing on. I workaround this by embedding my dependencies.

@srivatsn
Copy link
Contributor

Adding @blackdwarf @piotrpmsft @livarcocc for guidance around CLI tools.

@ctaggart
Copy link
Contributor

ctaggart commented Feb 26, 2017

I built two dotnet tools that are wrapped by msbuild nuget packages. The combination works really well. It makes it easy to test, develop, and extend.

SourceLink v2 is Feature Complete

I just released it to the NuGet Gallery. Please try it out! There are no known bugs, but not a ton of docs. It really isn't very hard though. Have a look at:

https://github.com/ctaggart/SourceLink

There is also an example of putting the references in a separate file here:
https://github.com/ctaggart/sourcelink-test/tree/master/src/ClassLibrary1

PS C:\Users\camer\cs\sourcelink-test\src\ClassLibrary1> dotnet build /p:ci=true /v:n
Microsoft (R) Build Engine version 15.1.545.13942
Copyright (C) Microsoft Corporation. All rights reserved.

Build started 2/25/2017 6:07:38 PM.
     1>Project "C:\Users\camer\cs\sourcelink-test\src\ClassLibrary1\ClassLibrary1.csproj" on node 1 (Build target(s)).
     1>SourceLinkCreate:
         dotnet sourcelink-git create -u "https://raw.githubusercontent.com/stffabi/sourcelink-test/{commit}/*" -f "obj\sourcelink.json" -s "Class1.cs"
       GenerateTargetFrameworkMonikerAttribute:
       Skipping target "GenerateTargetFrameworkMonikerAttribute" because all output files are up-to-date with respect to the input files.
       CoreGenerateAssemblyInfo:
       Skipping target "CoreGenerateAssemblyInfo" because all output files are up-to-date with respect to the input files.
       CoreCompile:
       Skipping target "CoreCompile" because all output files are up-to-date with respect to the input files.
       SourceLinkTest:
         dotnet sourcelink test "obj\Debug\netstandard1.4\ClassLibrary1.pdb"
         sourcelink test passed
       GenerateBuildDependencyFile:
       Skipping target "GenerateBuildDependencyFile" because all output files are up-to-date with respect to the input files.
       CopyFilesToOutputDirectory:
         ClassLibrary1 -> C:\Users\camer\cs\sourcelink-test\src\ClassLibrary1\bin\Debug\netstandard1.4\ClassLibrary1.dll
     1>Done Building Project "C:\Users\camer\cs\sourcelink-test\src\ClassLibrary1\ClassLibrary1.csproj" (Build target(s)).

Build succeeded.
    0 Warning(s)
    0 Error(s)

Time Elapsed 00:00:04.21

@ctaggart
Copy link
Contributor

If ya'll would like to add source link support to Roslyn itself instead of just source server support, you will need to upgrade from project.json to the new dotnet csproj format. Then you do something similar to what I demonstrated here: cc @jaredpar

dotnet/efcore#7726

Docs revised: https://github.com/ctaggart/SourceLink

@tmat
Copy link
Member Author

tmat commented Feb 26, 2017

@ctaggart Thanks. We can't switch Roslyn to Portable PDBs yet as we don't have symbol server support. We are working on a solution for that.
I'm gonna try source link on http://github.com/dotnet/symreader-converter repo though.

@ctaggart
Copy link
Contributor

@tmat The plan is to allow Portable PDB files to be packaged in the nupkg files. I understand ya'll may have unique requirements. Would you mind sharing why you need a symbol server instead?

Let me know how things work out for symreader-converter.

We now need a story for private repository support.

@tmat
Copy link
Member Author

tmat commented Feb 26, 2017

The plan is to allow Portable PDB files to be packaged in the nupkg files. I understand ya'll may have unique requirements. Would you mind sharing why you need a symbol server instead?

@ctaggart It's not "instead" it's rather "as well". Having Portable PDBs in the nupkg is step 1, but that doesn't address all scenarios. For example, when we are debugging a VS crash dump the debugger needs to find from the PDB IDs stored in that .dmp file the PDBs somehow. For Windows PDBs it does so thru symsrv protocol. We are currently figuring out the best approach for Portable PDBs. Lack of this feature means that the developer needs to manually locate every single PDB when debugging a crash dump and that's very inconvenient.

@tmat
Copy link
Member Author

tmat commented Feb 26, 2017

We now need a story for private repository support.

That's certainly an issue that needs to be addressed. One option is to add support for authentication to source link feature. @caslan @gregg-miskelly to comment on this.

Another option for the package author is to embed sources into Portable PDB instead of using source link. Source embedding is supported in VS 2017. Although the sources are compressed in the PDB it makes it bigger.

If NuGet supported a concept similar to satellite assemblies for other content than resource assemblies, that is downloading parts of nupkg on demand, I can imagine creating a "source package" containing only sources and associating it with the main nupkg. Then the source map in source link .json file could map to relative paths. For example,

Main package:

lib\netstandard1.5\mylib.dll
lib\netstandard1.5\mylib.pdb

Source package:

src\MyLib\Lib.cs

When installed in nuget cache:

lib\netstandard1.5\mylib.dll
lib\netstandard1.5\mylib.pdb
src\MyLib\Lib.cs

Source link:

{
  "documents": {
     "f:/build/*" : "../../src/*"
  }
}

(Relative paths would be treated as relative to the .pdb file)

@ctaggart
Copy link
Contributor

ctaggart commented Feb 26, 2017

For private repository support, I was thinking of a couple of different options:

1. say what is supported in the sourcelink.json

{  
   "auth":[  
      "basic",
      "preemptive"
   ],
   "documents":{  
      "C:\\Users\\camer\\cs\\EntityFramework\\*":"https://raw.githubusercontent.com/ctaggart/EntityFramework/f36ce80a93d9f698ff6b70207d2b83b5b11cf93b/*"
   }
}

2. register source link support auth providers

An extension in Visual Studio would register that it can provide authentication for raw.githubusercontent.com.

@tmat
Copy link
Member Author

tmat commented Feb 26, 2017

@ctaggart I'm not familiar with github auth but we need to be careful to only include information in the PDB that's unlikely to change. It would be unfortunate if a change in github policy broke all existing private repo PDBs.

@gregg-miskelly
Copy link

gregg-miskelly commented Feb 27, 2017

RE: auth - our tentative plan was to handle <not found | unauthorized | forbidden | etc> status codes and allow the user to authenticate and/or use a credential store where the user can somehow associate credentials with a URL. Though we haven't had the chance to get back and prototype this yet.

@ctaggart
Copy link
Contributor

@gregg-miskelly That would be perfect! Can't wait. I need this for our repositories at work. When GitHub returns a 404 not found, it would pop open the credentials dialog, then try again. /cc @shiftkey @haacked @kzu @cloudRoutine @ryangribble

@tmat I agree that we should put the auth info in the source link json.

@tmat
Copy link
Member Author

tmat commented Mar 3, 2017

I think at this point we can close this issue as we have an end-to-end solution for GIT with Source Link. @ctaggart any plans to support other source control systems?

@tmat tmat closed this as completed Mar 3, 2017
@ctaggart
Copy link
Contributor

ctaggart commented Mar 3, 2017

@tmat I welcome contributions. https://github.com/ctaggart/SourceLink/wiki/Contributing
I made dotnet sourcelink and dotnet sourcelink-git separate for a reason. Ideally, others would be created.

@tmat
Copy link
Member Author

tmat commented Mar 3, 2017

@ctaggart Cool. I was just curious. Didn't expect you'd write support for all source control systems out there.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

10 participants