-
Notifications
You must be signed in to change notification settings - Fork 71
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: get rid of the path for fully qualified names. #912
fix: get rid of the path for fully qualified names. #912
Conversation
3e7ed83
to
5c87e1c
Compare
field_type_normalized = type_field.element_type.removeprefix( | ||
"." | ||
).removeprefix( # getting rid of the optional `.` before the package name |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
all the fully qualified references always start with a .
, like for the example of: .google.protobuf.Timestamp
that its referring to the package google.protobuf
.
In this case normalizing it by removing the .
can lead to shadowing if the import of the google message its after another import that defines a scope that contains google, like for e.g.:
package my.awesome.tricky.company.google.protobuf;
message Timestamp {
// definition...
}
Even if this is possible its unlikely to happen, we may even reduce that removal of the .
just for types that do not have other anchestors in type type trie
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Implemented it by the type tree and implemented a test
a469d30
to
0faa236
Compare
Coverage reportClick to see where and how coverage changed
This report was generated by python-coverage-comment-action |
This is not complete normalization feature but rather more a fix for the fully qualified paths vs simple name references. This fix not manage the various way a type can be expressed with a partial path. Long explanation below: If we accept the specifications from buf as correct, we can look at how a type reference is defined [here](https://protobuf.com/docs/language-spec#type-references). The main problem with the previous implementation is that the current parser can't tell if a fully-qualified type reference in dot notation is the same as one identified by name alone aka simple name notation ([fully qualified reference](https://protobuf.com/docs/language-spec#fully-qualified-references)). The fix do not consider all the different ways users can define a relative reference, schemas with different way of expressing a [relative reference](https://protobuf.com/docs/language-spec#relative-references) even if normalized, for now will keep being considered different. Right now, our logic removes the `.package_name` (+ the Message scope) part before comparing field modifications (in fact it re-writes the schema using the simple name notation). Even though the TypeTree (trie data structure) could help resolve [relative references](https://protobuf.com/docs/language-spec#relative-references), we don't want to add this feature in the python implementation now because it might cause new bugs due to the non-trivial behaviour of the protoc compiler. We plan to properly normalize the protobuf schemas later. We'll use protobuf descriptors (after the compilation and linking step) to gather type references already resolved, and we will threaten all the protobuf using always the fully qualified names. Properly handling all path variations means reimplementing the protoc compiler behavior, we prefer relying on the already processed proto descriptor. So, for now, we'll only implement a normalization for the fully-qualified references in dot notation and by simple name alone. **NB:** This is not changing the semantics of the message since the local scope its always the one with max priority, so if you get rid of the fully-qualified reference protoc will resolve the reference with the one specified in the package scope.
0faa236
to
d229909
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM 👍
This is not complete normalization feature but rather more a fix for the fully qualified paths vs simple name references. This fix not manage the various way a type can be expressed with a partial path.
Long explanation below:
If we accept the specifications from buf as correct, we can look at how a type reference is defined here.
The main problem with the previous implementation is that the current parser can't tell if a fully-qualified type reference in dot notation is the same as one identified by name alone aka simple name notation (fully qualified reference).
The fix do not consider all the different ways users can define a relative reference, schemas
with different way of expressing a relative reference even if normalized,
for now will keep being considered different.
Right now, our logic removes the
.package_name
(+ the Message scope) partbefore comparing field modifications (in fact it re-writes the schema using the simple name notation).
Even though the TypeTree (trie data structure) could help resolve relative references, we don't want to add this feature in the python implementation now because it might cause new bugs due to the non-trivial behaviour of the protoc compiler.
We plan to properly normalize the protobuf schemas later.
We'll use protobuf descriptors (after the compilation and linking step) to gather type references already resolved, and we will threaten all the protobuf using always the fully qualified names.
Properly handling all path variations means reimplementing the protoc compiler behavior, we prefer relying on the already processed proto descriptor.
So, for now, we'll only implement a normalization for the fully-qualified references in dot notation and by simple name alone.
NB: This is not changing the semantics of the message since the local scope its always the one
with max priority, so if you get rid of the fully-qualified reference protoc will resolve
the reference with the one specified in the package scope.
About this change - What it does
References: #xxxxx
Why this way