to_pyarrow_table() on a table in S3 kept getting "Generic S3 error: error decoding response body" #2595
Comments
@k-ye can you try deltalake 0.19.1 please and report back if you see improvements |
I've done a few tests:
|
Hmm, that's a C++ error. Regarding VPNs, I would defer to what Tustvold said: apache/arrow-rs#5882 (comment) |
I have the same issue, |
Hi @shriram-louisa, sorry I didn't follow up on this. We were just testing the waters when I filed this issue, and have since moved on to other solutions... |
Thank you for the response 🙌 @k-ye |
What delta-rs version are you using? |
|
I am getting the same error on delta-rs 0.19.1. Do you know what causes it? |
@shriram-louisa @sim-san how stable is your connection, are you running this behind a VPN, what's the throughput, how big is your table? I need more info guys |
Have any of you tried increasing the timeout to 60s or 120s? |
The delta table has a file size of around 700 MB. Increasing the timeout solved the problem. |
You will have to check the object_store crate documentation and pick your storage backend (AWS S3 here); it lists the available config keys. |
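For anyone landing here later, a minimal sketch of how the timeout could be passed from Python. It assumes that `storage_options` is forwarded to the object_store client and that the key is named `timeout` with a duration string such as `120s`; check the object_store docs for the exact keys your version accepts. The helper names and the table URI are hypothetical, and credentials/region settings are omitted.

```python
def build_storage_options(timeout_seconds: int = 120) -> dict[str, str]:
    """Build storage options with a longer request timeout.

    Assumption: the object_store client accepts a "timeout" key whose
    value is a duration string such as "120s".
    """
    return {"timeout": f"{timeout_seconds}s"}


def load_table(table_uri: str, timeout_seconds: int = 120):
    """Open a Delta table and read it into a pyarrow Table."""
    # Imported lazily so the helper above works without deltalake installed.
    from deltalake import DeltaTable

    dt = DeltaTable(
        table_uri,
        storage_options=build_storage_options(timeout_seconds),
    )
    return dt.to_pyarrow_table()


# Hypothetical usage; replace the URI with your own table location:
# table = load_table("s3://my-bucket/path/to/table", timeout_seconds=120)
```

With a table of several hundred MB on a slow link, a larger value gives each request more time to finish before the client gives up. |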
This also seems to happen when querying a longer data column, probably for the same reason: the network connection is interrupted during the operation.
|
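If the cause really is an interrupted connection, retrying the read may be a workaround. Below is a rough sketch, not something delta-rs provides: `retry_on_oserror` is a hypothetical helper that retries a callable with exponential backoff when it raises `OSError` (the exception type shown in the original report).

```python
import time


def retry_on_oserror(fn, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call fn(), retrying with exponential backoff when it raises OSError.

    attempts:   total number of calls to make before giving up
    base_delay: seconds to wait after the first failure; doubles each retry
    sleep:      injectable for testing; defaults to time.sleep
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(attempts):
        try:
            return fn()
        except OSError as exc:
            last_error = exc
            # Do not sleep after the final failed attempt.
            if attempt < attempts - 1:
                sleep(base_delay * 2 ** attempt)
    raise last_error


# Hypothetical usage, assuming `dt` is an already opened DeltaTable:
# table = retry_on_oserror(dt.to_pyarrow_table, attempts=3, base_delay=2.0)
```

This only hides transient failures; if every attempt times out, raising the timeout is still the first thing to try. |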
We've been having some similar-ish problems recently on 0.19.0 with Azure. I thought it might be interesting to mention because in our case we only use the object store based filesystem for parsing the delta transaction log and we use the native Azure filesystem built into
It looks like the original issue report was having issues when
I'm going to try upgrading us to the latest delta-rs and hopefully #2789 will solve it. |
Environment
Delta-rs version:
deltalake==0.18.1
Binding: Python
Environment:
pyarrow==16.1.0
pyarrow-hotfix==0.6
Bug
What happened:
Trying to do a simple table loading from S3, but kept getting this
OSError: Generic S3 error: error decoding response body
Stack shows that this is actually in pyarrow. Not sure if it is possible to tweak pyarrow's behavior with S3 from deltalake.
What you expected to happen:
I can get the pyarrow table.
How to reproduce it:
More details:
I have verified the integrity of this table with these methods:
to_pyarrow_table() runs fine.
duckdb (and its delta extension). Worked fine, too.