Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: lookup attribute instead of performing a deepcopy #226

7 changes: 2 additions & 5 deletions proto/marshal/collections/repeated.py
Original file line number Diff line number Diff line change
Expand Up @@ -104,11 +104,8 @@ def _pb_type(self):
if len(self.pb) > 0:
return type(self.pb[0])

# We have no members in the list.
# In order to get the type, we create a throw-away copy and add a
# blank member to it.
canary = copy.deepcopy(self.pb).add()
return type(canary)
# We have no members in the list, so we get the type from the attributes.
return self.pb._message_descriptor._concrete_class

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I ported this change to googleapis/python-datastore and discovered that these attributes are not universally available.

We might need to do something like this, but then assuming an equivalent change, this would LGTM.

if hasattr(self.pb, '_message_descriptor') and hasattr(self.pb._message_descriptor, '_concrete_class'):
    return self.pb._message_descriptor._concrete_class

canary = copy.deepcopy(self.pb).add()
return type(canary)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

these attributes are not universally available.

This is the difference between the cpp and python protobuf runtimes. The concrete type of self.pb is different depending on which runtime is being used; it is not dependent on the API or the client library itself.

The _message_descriptor attribute is apparently considered an implementation detail of the python based protobuf runtime.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I partially followed that (due to my ongoing ignorance of the cpp layer), but to clarify, are you suggesting that this line of code will work everywhere?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The change will work for all client libraries IFF the application process is using the python protobuf runtime.

This is the relevant chunk of the tech stack:

  • User application
  • Client library manual layer (optional. Firestore has a manual layer, Texttospeech does not)
  • Generated client library, aka GAPIC
  • Proto plus runtime, i.e. this repo
  • General protobuf runtime, either written in pure python or in cpp as a python extension, which is determined at runtime. Proto plus should be agnostic about which which is being used.

The lowest layer in the above is library is preventing the general fix from merging. The two different implementations provide different unofficial APIs.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@software-dov can you expand on the pure python vs cpp protobuf runtime? How would I get to each of these? Which one is used by cloud libraries?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Which one is used by cloud libraries?

It is chosen dynamically at runtime based on the platform (linux, macos X amd64, aarch64), version of protobuf installed, and environment variables. To be strictly general and not break cloud, any solution must be compatible with both. I would imagine, based on the environment they're running in, that most user applications tend to use the cpp runtime.

The protobuf runtime is responsible for memory layout, serialization and deserialization, and message introspection. It is the code that allocates memory and performs host-to-network and network-to-host bit conversion.


def __eq__(self, other):
if super().__eq__(other):
Expand Down