Compression errors with feather/parquet export (lz4/zstd) #2018
Comments
Could you open this issue upstream in arrow2? The parquet and IPC implementations are from arrow2.
Oh, sorry, I missed this one. Looking into it.
Regarding parquet, I concluded that the likely cause is in pyarrow itself, as parquet files written by pyarrow with LZ4 and ZSTD compression are unreadable by (py)spark. Filed a bug upstream (https://issues.apache.org/jira/browse/ARROW-15073). Will now look into the feather one.
Many thanks all - I'll try all other possible permutations and, for now, hide compression options that aren't bidirectionally successful on our end, and report any other interesting results (upstream, as requested ;).
Versions
Python 3.9 / Polars 0.10.27 / Windows 10
Describe your bug.
Exports to compressed feather/parquet cannot be read back if use_pyarrow=True (succeed only if use_pyarrow=False). Errors include:
What are the steps to reproduce the behavior?
What is the actual behavior?
What is the expected behavior?
Successful load into DataFrame for round-trip import/export to compressed feather/parquet from Polars with default settings.
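A minimal sketch of the permutation check discussed in the comments: try every writer/reader/compression combination and keep only the compression options that round-trip everywhere. This is an illustration under stated assumptions, not the Polars or pyarrow code path. The engine names, `write_any`, `read_any`, `read_gzip_only`, `round_trip_matrix` and `bidirectional` are all hypothetical, and stdlib codecs (gzip, bz2, lzma) stand in for lz4/zstd so the snippet runs without extra dependencies.

```python
import bz2
import gzip
import lzma
from itertools import product

# Stand-in codecs (assumption: stdlib only; the issue itself concerns lz4/zstd).
CODECS = {
    "gzip": (gzip.compress, gzip.decompress),
    "bz2": (bz2.compress, bz2.decompress),
    "lzma": (lzma.compress, lzma.decompress),
}


def write_any(payload, compression):
    """Hypothetical writer that supports every codec in CODECS."""
    return CODECS[compression][0](payload)


def read_any(blob, compression):
    """Hypothetical reader that supports every codec in CODECS."""
    return CODECS[compression][1](blob)


def read_gzip_only(blob, compression):
    """Hypothetical reader that rejects everything except gzip."""
    if compression != "gzip":
        raise ValueError(f"unsupported compression: {compression}")
    return gzip.decompress(blob)


def round_trip_matrix(writers, readers, compressions, payload):
    """Map (writer, reader, compression) to True if the payload survives."""
    results = {}
    for (w_name, write), (r_name, read), comp in product(
        writers.items(), readers.items(), compressions
    ):
        try:
            ok = read(write(payload, comp), comp) == payload
        except Exception:
            # Any read/write failure counts as an unsuccessful round trip.
            ok = False
        results[(w_name, r_name, comp)] = ok
    return results


def bidirectional(results, compressions):
    """Compression options that succeed for every writer/reader pairing."""
    return [
        comp
        for comp in compressions
        if all(ok for (_, _, c), ok in results.items() if c == comp)
    ]


compressions = ["gzip", "bz2", "lzma"]
results = round_trip_matrix(
    writers={"engine_a": write_any},
    readers={"engine_a": read_any, "engine_b": read_gzip_only},
    compressions=compressions,
    payload=b"round-trip payload " * 100,
)
safe_options = bidirectional(results, compressions)
```

In a real check, the stand-in callables would be replaced by the actual export and import calls for each use_pyarrow setting, and `safe_options` would be the list of compression choices left visible to users.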