Bigtable Python Client Max Row size is 4mb #2880
Comments
A few gRPC users have had this kind of problem and we've started a discussion in For now you can cross your fingers and see what happens when you pass a channel options value like If it doesn't work, then you've hit some limit inside gRPC that isn't overridable and you'll have to break up the large message into a stream of smaller ones. |
@nathanielmanistaatgoogle The "you" in "when you pass a channel options value" here is the library maintainers, yes? As a user, @brandon-white isn't directly creating any gRPC object, he is dealing with our Bigtable classes (i.e. the nouns) and calling the API via their instance methods (i.e. the verbs). |
Thanks @nathanielmanistaatgoogle I'll try this but, ideally, I would prefer a solution where I do not edit the Bigtable Python client. For the Google Bigtable Client folks, have you encountered any use cases where the rows and cells are large? If so, how do you break up these large rows and cells into streams? Are large rows and cells supported through the Python Client? |
@mbrukman Have you run into this in other clients? |
While there's a per-response limit in gRPC, it is possible to retrieve larger cell values via streaming responses and reconstructing them client-side. |
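As a rough illustration of the client-side reconstruction described above, the sketch below rebuilds complete cell values from a stream of smaller pieces. The `Chunk` class, its `more_to_come` flag, and the `split_value` helper are hypothetical stand-ins invented for this example; they are not the Bigtable client's or the service's actual types.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, List


@dataclass
class Chunk:
    """Hypothetical stand-in for one streamed piece of a cell value."""
    value: bytes
    more_to_come: bool  # True while further pieces of the same cell follow


def reassemble_cells(chunks: Iterable[Chunk]) -> Iterator[bytes]:
    """Yield complete cell values rebuilt from a stream of chunks."""
    parts: List[bytes] = []
    for chunk in chunks:
        parts.append(chunk.value)
        if not chunk.more_to_come:
            # Last piece of this cell: join everything buffered so far.
            yield b"".join(parts)
            parts = []
    if parts:
        raise ValueError("stream ended in the middle of a cell")


def split_value(value: bytes, limit: int) -> List[Chunk]:
    """Split one value into chunks of at most `limit` bytes (test helper)."""
    pieces = [value[i:i + limit] for i in range(0, len(value), limit)] or [b""]
    return [Chunk(p, i < len(pieces) - 1) for i, p in enumerate(pieces)]
```

Because each piece stays under the per-message limit, the full value only ever exists on the client after it has been joined.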
Thank you @mbrukman! I don't see these calls in the Bigtable Python Client. Does this mean I need to write my own client if I want to consume these large cell values? Are there any plans to incorporate this into existing clients? |
The Python client already deals with ReadRowResponse under the covers. This seems like something we need to change in the service. |
I'll take that back. This seems like a client setting. In the Java client, we set the value on the netty channel (maxMessageSize) to be 256MB. Is that a setting that the Python client controls? |
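For comparison, a minimal sketch of what raising the limit could look like on a Python gRPC channel. It assumes the channel arguments `grpc.max_receive_message_length` and `grpc.max_send_message_length`; which names a given gRPC release recognizes should be verified against that release. `build_channel_options` and `make_channel` are hypothetical helpers, not part of the Bigtable client.

```python
from typing import List, Tuple

MB = 1024 * 1024


def build_channel_options(max_message_mb: int) -> List[Tuple[str, int]]:
    """Return gRPC channel options that raise the per-message size limits."""
    limit = max_message_mb * MB
    return [
        # Assumed option names; check them against the installed gRPC version.
        ("grpc.max_receive_message_length", limit),
        ("grpc.max_send_message_length", limit),
    ]


def make_channel(target: str, max_message_mb: int = 256):
    """Create an insecure channel with the larger limits (illustration only)."""
    import grpc  # imported lazily so the helper above works without grpcio

    return grpc.insecure_channel(
        target, options=build_channel_options(max_message_mb)
    )
```

The options have to be supplied where the channel is constructed, so a user of a higher-level client can only change them if that client exposes or sets them itself.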
@sduskis I looked for that but didn't see any option to set that in the Python client. So it seems like we need to add this max_message_size config to the Bigtable Python Client? |
@dhermes, do you have any idea where grpc settings are set, and if max_message_size is a settable property? |
I do not, though @nathanielmanistaatgoogle would be a good person to ask. I'm not 100% clear on if "grpc settings" are a global thing or a per-call setting? The |
It looks like the channel options get set in I think We might ultimately want to set the max send message length for Bigtable also. It looks like it is currently unlimited, but there is a TODO there to change that. |
So do I need to make the PR to this or do any of the committers have bandwidth to take a look at this setting? What is the process for getting fixes like this out? |
Hello @brandon-white! |
@daspecster Thank you! I do not really have any custom code, I simply use the Bigtable Python API to query large cells.
|
It appears that passing the
@mbrukman it appears to me that there is some handling of chunks on the client side. But in this case it appears to not get the chunks back yet.
row_data = table.read_rows('large-row')
print(row_data.consume_next())
Traceback (most recent call last):
  File "/Documents/test_bigtable.py", line 14, in <module>
    print(row_data.consume_next())
  File "/Documents/test/.tox/py27/lib/python2.7/site-packages/google_cloud_bigtable-0.22.0-py2.7.egg/google/cloud/bigtable/row_data.py", line 261, in consume_next
    response = six.next(self._response_iterator)
  File "python_build/bdist.macosx-10.12-intel/egg/grpc/_channel.py", line 344, in next
  File "python_build/bdist.macosx-10.12-intel/egg/grpc/_channel.py", line 335, in _next
grpc._channel._Rendezvous: <_Rendezvous of RPC that terminated with (StatusCode.INTERNAL, Max message size exceeded)>
I'm missing how to get |
I think we need some help from someone on the grpc team to solve this. |
I wasn't sure based on the previous conversation. Thanks for clearing that up for me @sduskis! Let me know if there's anything I can do to help. I'm not entirely sure who to ask about this? |
I'll ping some folks who might know how to help. |
Thanks all! So for now, it looks like this is a feature request which needs an owner? |
There's an answer in this thread to the question by the grpc python lead @nathanielmanistaatgoogle. @gamorris's answer points to the code that needs to be changed. @dhermes, are you the right person to change this? |
@dhermes: the "you" in "when you pass a channel options value" is "the caller of the channel construction functions |
Just for clarification, I tried adding the |
If no one is actively working on a patch, would anyone mind if I took a stab at it? |
@daspecster: what leads you to believe that |
@atdt By all means, take a stab. @daspecster is digging around but I'm not sure how actively. |
Thanks @nathanielmanistaatgoogle! I'll give that a try right now. |
@nathanielmanistaatgoogle @dhermes @atdt, using the correct option header seems to have worked. I'll update the tests for this and make a PR. Note: I'll add a system test so that when grpc gets updated we won't forget (at least not for long) to switch it to |
Do unrecognized options get ignored? I wonder if you might be able to simply set both today and leave a note to remove the old one in the future. |
It appears they are ignored. |
@dhermes, seems @daspecster is on it -- I'll find another issue :) |
Thank you very much @dhermes and @daspecster !! Once this is merged, can you please let me know how I can use it through pip? |
@brandon-white it will be in the next release, but I'm not entirely sure when that will be. Until then, you could point pip to this commit hash, unless @dhermes or @tseaver have other ideas? |
@daspecster Thanks! The changes work but I cannot pull them with |
@brandon-white that's because the root |
@dhermes Thanks for your help here! Do you or anybody else have any idea when the next Bigtable client release might be? |
I may cut a release this week. However, I'm happy to help you get a local install working from source, feel free to ping me on Hangouts (email on GitHub profile) |
@dhermes Appreciate it Danny! I am willing to wait 1-2 weeks for the official release on pip. Thanks for your help! |
The Bigtable documentation says the max cell size is 100 MB. However, when I try to read a row with a 10 MB cell using the Bigtable Python client, I get the following error:
This max size seems to be hard coded in the grpc library. Has anybody been able to read large rows using the Bigtable Python client? Any idea for workarounds or how I can set the max size?