Add decimal128 support to Parquet reader and writer #9765
Codecov Report
@@              Coverage Diff               @@
##        branch-22.02    #9765     +/-    ##
================================================
- Coverage      10.49%   10.43%   -0.06%
================================================
  Files            119      119
  Lines          20305    20449    +144
================================================
+ Hits            2130     2134      +4
- Misses         18175    18315    +140
Thank you for accommodating the changes. LGTM!
Looks great. I felt the need to add one change, just to be annoying.
Co-authored-by: nvdbaranec <[email protected]>
This PR adds `decimal128` type validation to the Parquet reader. The validation is in place to unblock the libcudf changes in #9765 and will be removed once the Python-side `decimal128` changes are merged (currently blocked by a libcudf `from_arrow` bug).
Authors:
- GALI PREM SAGAR (https://github.com/galipremsagar)
Approvers:
- Vukasin Milovanovic (https://github.com/vuule)
- Vyas Ramasubramani (https://github.com/vyasr)
URL: #9804
@gpucibot merge
Closes #9566
Depends on #9804
- Read decimal columns as 128-bit when the input width requires it.
- Write decimal128 columns as `FIXED_LEN_BYTE_ARRAY`.
- Use the smallest viable decimal size to read `FIXED_LEN_BYTE_ARRAY` (this previously defaulted to decimal64, even when 32 bits are sufficient).
- Remove the `strict_decimal_types` option from the Parquet reader; decimal columns can now always be read using the exact decimal type.
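
For readers unfamiliar with the encoding, here is a minimal Python sketch of the two ideas in the list above: picking the smallest integer width that can hold a given decimal precision, and storing an unscaled decimal value as a fixed-length, big-endian two's complement byte string (the layout the Parquet format uses for decimals in `FIXED_LEN_BYTE_ARRAY`). This is not the libcudf implementation; the names `smallest_decimal_width`, `encode_flba`, and `decode_flba` are hypothetical, and the 9/18/38 digit limits are simply the largest precisions that always fit in signed 32-, 64-, and 128-bit integers.

```python
from decimal import Decimal

# Largest number of base-10 digits that always fits in a signed integer
# of the given byte width (4 -> int32, 8 -> int64, 16 -> int128).
MAX_PRECISION = {4: 9, 8: 18, 16: 38}


def smallest_decimal_width(precision: int) -> int:
    """Return the smallest storage width in bytes for `precision` digits.

    Hypothetical helper: illustrates "use the smallest viable decimal size".
    """
    for width in sorted(MAX_PRECISION):
        if precision <= MAX_PRECISION[width]:
            return width
    raise ValueError(f"precision {precision} exceeds 38 digits")


def encode_flba(unscaled: int, length: int = 16) -> bytes:
    """Encode an unscaled decimal value as big-endian two's complement."""
    return unscaled.to_bytes(length, byteorder="big", signed=True)


def decode_flba(data: bytes, scale: int) -> Decimal:
    """Decode big-endian two's complement bytes into an exact Decimal."""
    unscaled = int.from_bytes(data, byteorder="big", signed=True)
    sign = 1 if unscaled < 0 else 0
    digits = tuple(int(ch) for ch in str(abs(unscaled)))
    # Building from a tuple avoids rounding to the default 28-digit context.
    return Decimal((sign, digits, -scale))


if __name__ == "__main__":
    print(smallest_decimal_width(7))    # fits in 4 bytes
    print(smallest_decimal_width(25))   # needs 16 bytes
    raw = encode_flba(-12345)           # unscaled value for -123.45, scale 2
    print(raw.hex(), decode_flba(raw, scale=2))
```

The 16-byte default mirrors a 128-bit value; a writer could pass a shorter `length` when the column precision allows it, which is the same trade-off the reader-side width selection makes.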