Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Differences in SourceCodeInfo Locations between desc/protoparse and protoc #212

Closed
bufdev opened this issue May 27, 2019 · 3 comments
Closed

Comments

@bufdev
Copy link
Contributor

bufdev commented May 27, 2019

Discussed offline - there's still differences between the SourceCodeInfo Locations between desc/protoparse and protoc.

I think the easiest way to share this is via the FileDescriptorSets that are outputted. I ran goprotoc and protoc for https://github.com/googleapis/googleapis/tree/master/google/api/experimental (a small package in googleapis) to make this easier.

Reproduction is at https://github.com/pedgeio/goprotoctest/blob/9c184ad4659b6672cb1798014b5aa6ecc8809463/internal/goprotoctest/goprotoctest_test.go#L47 - if you set up the repository per the README.md, comment out the t.Skip call, and run make test, you will get two JSON files, one at internal/goprotoctest/tmp/goprotoc.json and one at internal/goprotoctest/tmp/protoc.json.

The contents of those files are at https://gist.github.com/pedgeio/bc1e343e5cca4c424a374eb6e8aa3182 - note that that gist has two files, you just have to scroll way down to get to protoc.json.

Differences I notice at first glance:

  • Different intermediate locations
  • Some comments have an extra newline in protoc.json
@jhump
Copy link
Owner

jhump commented May 28, 2019

I think I see what they are doing. I just re-read the spec (e.g. comments for SourceCodeInfo) and see that they emit multiple locations with the same path for "ancestor" paths, instead of trying to compute a single location that spans the entire range for all child paths.

So, for path [8] (file options), instead of a single path with a span that covers all options (lines 20 to 24), they emit multiple such locations, one for each option declared. Unfortunately, it looks like this behavior is inconsistent. For example, I don't see it doing this for field options at all; that is likely a bug in protoc.

I think the only element that spans whole regions is the empty path, which indicates the range of the entire file.

So I think I can get rid of the recent change I made and replace it with something that is likely a little simpler.

Other differences I see:

  1. As you already pointed out, protoc preserves a trailing newline in leading comments.
  2. For message and enum fields, the type location is attributed to the FieldDecriptorProto's type_name field, not its type field. (Now that I look at it, that seems intuitive and is just a bug in goprotoc).
  3. There appears to be a bug in protoc. For fields in "proto3" syntax that do not include a label (e.g. optional fields), it is emitted a source code location for the label field anyway, and it looks wrong. Here's an example or this field in http.proto:
    "span": [
      41,
      2,
      33,
      30
    ]
    The "end line" value is less than the "start line" value. The actually corresponds to the span after the end of the previous element up to the beginning of this field. I am pretty certain this is a bug and that we do not want to reproduce this behavior (but instead file a bug and maybe even provide a patch to the main protobuf repo).

@jhump
Copy link
Owner

jhump commented Jul 17, 2019

BTW, for the things above that I said are bugs in protoc, there are now bugs filed against that project:
protocolbuffers/protobuf#6378
protocolbuffers/protobuf#6392

@jhump
Copy link
Owner

jhump commented Jul 19, 2019

I think I've fixed nearly all of these variances in a PR that did a fairly substantial overhaul of how source code info is generated: #240.

I'm closing this. The only remaining variances I see are behavior in protoc that I don't think I want this package to perfectly emulate (they are inconsistencies with how protoc treats the "default" and "json_name" pseudo-options; protoparse treats them consistent with other options). I'll file questions with the main protobuf project to get some clarity -- maybe they'll result in fixes in protoc, so it would then more closely match the protoparse package.

@jhump jhump closed this as completed Jul 24, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants