[CHORE] Ensure compatibility with deltalake version v0.19 #2827
CodSpeed Performance Report
Merging #2827 will degrade performances by 56.51%
Codecov Report
Attention: Patch coverage is
@@            Coverage Diff             @@
##             main    #2827      +/-   ##
==========================================
+ Coverage   63.37%   64.04%   +0.66%
==========================================
  Files        1010     1007       -3
  Lines      114165   112948    -1217
==========================================
- Hits        72355    72334      -21
+ Misses      41810    40614    -1196
I face this with deltalake 0.19.2
Hi @fecet, would you mind creating a GitHub issue for this? Also, could you check whether this worked in the current release version of Daft with deltalake 0.18.2? It would also help to see more details about the query that you are trying to run.
Deltalake v0.19 changes their `_convert_pa_schema_to_delta` function to take in a `schema_conversion_mode` instead of `large_dtypes`. Pyarrow also needed to be upgraded to v16.0.0 to be compatible with the new version of deltalake.

The difference between this PR and #2754 is that it still maintains compatibility with older deltalake versions.
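One way to stay compatible with both keyword styles is sketched below. It is an illustration only and not necessarily how this PR does it: the helper `convert_schema_compat` and the two stand-in converters are hypothetical names, and the sketch assumes the converter is a plain Python callable whose signature can be inspected. The value passed as `conversion_mode` is left to the caller.

```python
import inspect
from typing import Any, Callable


def convert_schema_compat(
    convert: Callable[..., Any],
    schema: Any,
    *,
    large_dtypes: bool = False,
    conversion_mode: Any = None,
) -> Any:
    """Call a schema converter with whichever keyword its signature accepts.

    Hypothetical helper: `convert` stands in for a function such as
    deltalake's `_convert_pa_schema_to_delta`.
    """
    params = inspect.signature(convert).parameters
    if "schema_conversion_mode" in params:
        # Newer keyword (described above for deltalake v0.19).
        return convert(schema, schema_conversion_mode=conversion_mode)
    # Older keyword.
    return convert(schema, large_dtypes=large_dtypes)


# Stand-in converters used only to exercise the helper (not deltalake code).
def old_style_converter(schema, large_dtypes=False):
    return ("old", schema, large_dtypes)


def new_style_converter(schema, schema_conversion_mode=None):
    return ("new", schema, schema_conversion_mode)
```

Inspecting the signature avoids parsing version strings, at the cost of relying on the keyword names staying stable; an explicit version check would be an equally reasonable choice.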
This PR also includes a change to arrow2 because, starting in version 0.19, deltalake uses arrow-rs by default instead of pyarrow to write files when calling `deltalake.write_deltalake`. We do not actually use this functionality but our tests do, and arrow-rs writes map arrays in a way that does not conform to the parquet spec. I figured it would be good to add that compatibility in case some user is using `arrow-rs` to write their parquet files.

However, there are also other issues with deltalake's rust writer, including improper encoding of partitioned binary columns, so we will use their pyarrow writer for testing.
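A minimal sketch of pinning the pyarrow writer in test helpers follows. It assumes `write_deltalake` accepts an `engine` keyword with the value `"pyarrow"` on the 0.19 line; the helper names `parse_version`, `writer_kwargs_for_tests`, and `write_test_table` are hypothetical and do not come from this PR.

```python
from typing import Any, Dict, Tuple


def parse_version(version: str) -> Tuple[int, ...]:
    """Parse the leading numeric parts of a version string such as '0.19.2'."""
    parts = []
    for piece in version.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits or 0))
    return tuple(parts)


def writer_kwargs_for_tests(deltalake_version: str) -> Dict[str, Any]:
    """Extra keyword arguments for write_deltalake in tests.

    From 0.19 on, the default writer is arrow-rs, so the pyarrow writer
    is requested explicitly. Older versions keep their default.
    """
    if parse_version(deltalake_version) >= (0, 19):
        return {"engine": "pyarrow"}  # assumed keyword and value
    return {}


def write_test_table(path: str, data: Any, deltalake_version: str) -> None:
    """Write test data using the pinned writer (requires deltalake installed)."""
    from deltalake import write_deltalake  # imported lazily on purpose

    write_deltalake(path, data, **writer_kwargs_for_tests(deltalake_version))
```

In a test suite, `deltalake_version` would typically come from `deltalake.__version__`.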