
fix: lookup attribute instead of performing a deepcopy #226

Conversation

GabrieleMazzola
Contributor

Closes #224.

@GabrieleMazzola GabrieleMazzola requested a review from a team as a code owner April 13, 2021 06:34
@google-cla

google-cla bot commented Apr 13, 2021

Thanks for your pull request. It looks like this may be your first contribution to a Google open source project (if not, look below for help). Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

📝 Please visit https://cla.developers.google.com/ to sign.

Once you've signed (or fixed any issues), please reply here with @googlebot I signed it! and we'll verify it.


What to do if you already signed the CLA

Individual signers
Corporate signers

ℹ️ Googlers: Go here for more info.

@google-cla google-cla bot added the cla: no This human has *not* signed the Contributor License Agreement. label Apr 13, 2021
@codecov

codecov bot commented Apr 13, 2021

Codecov Report

❗ No coverage uploaded for pull request base (main@3148a1c). Click here to learn what that means.
The diff coverage is n/a.

❗ Current head dec4d6f differs from pull request most recent head 9cc0df0. Consider uploading reports for the commit 9cc0df0 to get more accurate results

@@           Coverage Diff            @@
##             main      #226   +/-   ##
========================================
  Coverage        ?   100.00%           
========================================
  Files           ?        22           
  Lines           ?      1004           
  Branches        ?       227           
========================================
  Hits            ?      1004           
  Misses          ?         0           
  Partials        ?         0           

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 3148a1c...9cc0df0. Read the comment docs.

@GabrieleMazzola
Contributor Author

@googlebot I signed it!

@google-cla google-cla bot added cla: yes This human has signed the Contributor License Agreement. and removed cla: no This human has *not* signed the Contributor License Agreement. labels Apr 13, 2021
@busunkim96 busunkim96 requested a review from software-dov April 13, 2021 14:43
@software-dov
Contributor

It looks like the cpp runtime is causing problems because the type it exposes doesn't have the same python-visible fields. There are a couple of ways we can try to solve this problem:

  1. Complain to the protobuf folks and get a generic, runtime-independent mechanism for determining the type of the elements of a repeated field.
  2. Have a fast path (the proposed change) for the python runtime and fall back via try/except to the deepcopy-and-canary approach.
  3. Continue to dig for a field/mechanism that's faster than deepcopy and works on the cpp runtime types.
  4. Something like this (also passes the tests, but I'm not entirely sure it's 100% safe):
--- a/proto/marshal/marshal.py
+++ b/proto/marshal/marshal.py
@@ -155,12 +155,12 @@ class BaseMarshal:
         # Return a view around it that implements MutableSequence.
         value_type = type(value)  # Minor performance boost over isinstance
         if value_type in compat.repeated_composite_types:
-            return RepeatedComposite(value, marshal=self)
+            return RepeatedComposite(value, marshal=self, proto_type=proto_type)
         if value_type in compat.repeated_scalar_types:
             if isinstance(proto_type, type):
                 return RepeatedComposite(value, marshal=self, proto_type=proto_type)
             else:
-                return Repeated(value, marshal=self)
+                return Repeated(value, marshal=self, proto_type=proto_type)
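Option 2 above, sketched generically (a hypothetical helper, not the PR's actual code; the private `_message_descriptor._concrete_class` attribute is an implementation detail of the pure-python runtime discussed later in this thread, not a public API):

```python
import copy


def element_type(repeated_pb):
    """Return the concrete type of elements in a repeated composite field.

    Fast path: the pure-python protobuf runtime exposes the element class
    via a private descriptor attribute (an assumption from this thread,
    not a public API). Fallback: deepcopy the container and add a canary
    element, which works on both runtimes but is slower.
    """
    try:
        # Fast path: pure-python runtime only.
        return repeated_pb._message_descriptor._concrete_class
    except AttributeError:
        # Fallback for the cpp runtime: deepcopy first so the canary
        # never mutates the real message, then inspect the new element.
        canary = copy.deepcopy(repeated_pb).add()
        return type(canary)
```

The try/except ordering means the common (pure-python) case pays no `hasattr` cost, while the cpp runtime still gets a correct, if slower, answer.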

@busunkim96
Contributor

How about we go with 2 to unblock folks, and follow up with the protobuf folks to figure out which of 1, 3, or 4 is the best option? Is there a person we can tag in to this PR?

I am somewhat uncomfortable with relying on a private attribute from protobuf, but it seems OK for the short term with a fallback (option 2).

@crwilcox
Contributor

Also +1 to option 2.

@craiglabenz

Do we have a sense of what percentage of libraries would benefit from this change, versus trip the try/catch and actually run slower? I certainly don't, but I do know that python-datastore raises an AttributeError and would be further slowed.

As I'm thinking about this further, I suppose the answer is to only bump proto-plus versions in libraries that have been confirmed to work smoothly with this change?

I also don't know anything about the cpp layer and would love to learn more.

@busunkim96
Contributor

busunkim96 commented Apr 14, 2021

@craiglabenz The cpp implementation is opt-in according to the documentation, although I saw some comments on issues suggesting it is the default.

https://developers.google.com/protocol-buffers/docs/reference/python-generated#cpp_impl

I think it would be helpful to clarify with a protobuf person who knows for sure.

@crwilcox
Contributor

We could also check for the attr without a try/catch if we are concerned about the cost of that.

@craiglabenz

I am increasingly suspicious that this deepcopy is not the guilty party. While testing the implications of this change and the snippet proposed above, I discovered that python-datastore's slowness must be coming from elsewhere: my scratch code that reproduces the slowness never reaches this deepcopy.

@software-dov
Contributor

Responding to a bunch of things all at once.

Do we have a sense of what percentage of libraries would benefit from this change, versus trip the try/catch and actually run slower?

There are a couple of misconceptions in this question. The python and cpp protobuf runtimes are agnostic of the client libraries themselves: which one is used is determined by the platform and by environment variables. The cpp runtime is much faster in certain real-world workloads, but I don't have a good picture of the breakdown between cpp and py runtimes across user environments.
Also, in most cases try/except is fast: when the straightforward path is the common one, it is often faster to handle the common case in the try block and the uncommon case in except than to use if/else chains.
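The common-case claim can be illustrated with a stdlib-only sketch (hypothetical names; absolute timings vary by interpreter and workload):

```python
import timeit


class Message:
    field = 1


obj = Message()


def via_try():
    # Zero-cost when the attribute exists; only pays when it doesn't.
    try:
        return obj.field
    except AttributeError:
        return None


def via_check():
    # Always pays for the hasattr lookup, even in the common case.
    if hasattr(obj, "field"):
        return obj.field
    return None


t_try = timeit.timeit(via_try, number=100_000)
t_check = timeit.timeit(via_check, number=100_000)
```

When the attribute is almost always present, `via_try` does one lookup where `via_check` does two; the relationship inverts if the exception fires frequently, since raising and catching is comparatively expensive.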

I saw some comments on issues that seemed to suggest [cpp] was default.

This is a little bit complicated. If a cpp runtime for a particular platform exists and is installed, it is now the default version. We can see this via code archaeology in api_implementation.py:40.
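For reference, the documented `PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION` environment variable overrides that default selection; it must be set before `google.protobuf` is first imported in the process:

```python
import os

# Force the pure-python runtime for this process. Must be set before
# google.protobuf is imported anywhere; accepted values include
# "python" and "cpp".
os.environ["PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION"] = "python"
```

Setting it after protobuf has been imported has no effect, which is why it is usually exported in the shell or set at the very top of the entry point.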

I am increasingly suspicious that this deepcopy is not the guilty party.

That's entirely possible. Designing a good, helpful benchmark that mimics real-world use is tricky; I gave up after about two hours of trying to recreate the benchmark linked in the issue for this PR. On the other hand, RepeatedField serialization is a known real-world hotspot and has not been highly optimized. Trying to resolve this issue seems like a good idea, especially if it can be done without too much effort or stress.

@software-dov
Contributor

What are people's required timelines for dealing with this performance regression? Clearly sooner is better, but is anything being blocked, or has conversion to the new client libraries been halted, or anything?

canary = copy.deepcopy(self.pb).add()
return type(canary)
# We have no members in the list, so we get the type from the attributes.
return self.pb._message_descriptor._concrete_class


I ported this change to googleapis/python-datastore and discovered that these attributes are not universally available.

We might need to do something like this, but then assuming an equivalent change, this would LGTM.

if hasattr(self.pb, '_message_descriptor') and hasattr(self.pb._message_descriptor, '_concrete_class'):
    return self.pb._message_descriptor._concrete_class

canary = copy.deepcopy(self.pb).add()
return type(canary)

Contributor


these attributes are not universally available.

This is the difference between the cpp and python protobuf runtimes. The concrete type of self.pb is different depending on which runtime is being used; it is not dependent on the API or the client library itself.

The _message_descriptor attribute is apparently considered an implementation detail of the python based protobuf runtime.


I partially followed that (due to my ongoing ignorance of the cpp layer), but to clarify, are you suggesting that this line of code will work everywhere?

Contributor


The change will work for all client libraries IFF the application process is using the python protobuf runtime.

This is the relevant chunk of the tech stack:

  • User application
  • Client library manual layer (optional. Firestore has a manual layer, Texttospeech does not)
  • Generated client library, aka GAPIC
  • Proto plus runtime, i.e. this repo
  • General protobuf runtime, either written in pure python or in cpp as a python extension, chosen dynamically at runtime. Proto plus should be agnostic about which one is being used.

The lowest layer in the stack above is what prevents a general fix from merging: the two different implementations expose different unofficial APIs.

Contributor


@software-dov can you expand on the pure python vs cpp protobuf runtime? How would I get to each of these? Which one is used by cloud libraries?

Contributor


Which one is used by cloud libraries?

It is chosen dynamically at runtime based on the platform (Linux, macOS; amd64, aarch64), the version of protobuf installed, and environment variables. To be strictly general and not break cloud, any solution must be compatible with both. Based on the environments they run in, I would imagine most user applications end up on the cpp runtime.

The protobuf runtime is responsible for memory layout, serialization and deserialization, and message introspection. It is the code that allocates memory and performs host-to-network and network-to-host bit conversion.
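As a toy illustration of the wire-format work that layer does (not code from this PR), protobuf's base-128 varint encoding, per the official encoding documentation, can be sketched as:

```python
def encode_varint(n: int) -> bytes:
    """Minimal protobuf base-128 varint encoder (illustration only).

    Each output byte carries 7 payload bits, least-significant group
    first; the high bit of a byte is set when more bytes follow.
    """
    out = bytearray()
    while True:
        low_bits = n & 0x7F
        n >>= 7
        if n:
            out.append(low_bits | 0x80)  # continuation bit set
        else:
            out.append(low_bits)         # final byte
            return bytes(out)


encode_varint(300)  # b'\xac\x02', the example from the protobuf docs
```

The real runtimes implement this (and the rest of serialization) either in pure python or in the cpp extension, which is exactly the split driving the compatibility concerns in this thread.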

@crwilcox
Contributor

@software-dov I know of at least one instance where a user has reverted to the previous major version due to this regression relative to the monolith generator.

@busunkim96
Contributor

I opened https://groups.google.com/g/protobuf/c/pYcq-UBixqU to get clarification from the Protobuf folks on the specific situations in which protobuf defaults to the C++ implementation (as well as when that change was made).

If you're curious, you can figure out which implementation protobuf is using with this snippet. I get 'cpp' on a fresh install on my linux workstation.

from google.protobuf.internal import api_implementation
print(api_implementation.Type())

Note that protobuf discourages using api_implementation.Type() (comment). That leads me to think they'd be open to adding a way to fetch the type of elements in a repeated field that is agnostic of the implementation. I'll open another issue to start that discussion.

Are folks alright with the code in this PR coupled with a fallback to deepcopy to handle the CPP case (try/except, if/else, attribute checking)? It sounds like it made a significant difference for the PR author in a real world scenario. I imagine it will be a net positive for some folks and wouldn't make things worse for anyone else.

@stefanbluegrid

Hello, what is happening with this PR? If it can be fixed with a simple try/except or if statement, I don't see a reason for it not to be merged.

@parthea parthea assigned parthea and unassigned busunkim96 Apr 29, 2022
@GabrieleMazzola GabrieleMazzola requested a review from a team as a code owner April 29, 2022 22:51
@parthea parthea removed the request for review from software-dov April 29, 2022 22:52
@parthea
Contributor

parthea commented Apr 29, 2022

Waiting for #312

@parthea parthea added the owlbot:run Add this label to trigger the Owlbot post processor. label May 2, 2022
@gcf-owl-bot gcf-owl-bot bot removed the owlbot:run Add this label to trigger the Owlbot post processor. label May 2, 2022
@parthea parthea merged commit e469059 into googleapis:main May 2, 2022
Labels
cla: yes This human has signed the Contributor License Agreement.

Successfully merging this pull request may close these issues.

Very poor performance in Firestore reads due to proto-plus sequence unmarshaling