Fix cudf::hash_partition for decimal32 and decimal64 #7006

Merged: 3 commits into rapidsai:branch-0.18 on Dec 15, 2020

Conversation

codereport (Contributor) commented Dec 14, 2020

This resolves #6996

codereport added the bug, libcudf, and non-breaking labels Dec 14, 2020
codereport self-assigned this Dec 14, 2020

revans2 (Contributor) commented Dec 14, 2020

I ran with this patch and it fixed the issues I was seeing in Spark.

codereport (Contributor, Author) commented

> I ran with this patch and it fixed the issues I was seeing in Spark.

👍 I am just putting together another unit test for cudf::partition and then I will mark the PR as ready for review.

codereport marked this pull request as ready for review December 15, 2020 00:14
codereport requested a review from a team as a code owner December 15, 2020 00:14
codereport added the "3 - Ready for Review" and "4 - Needs Review" labels Dec 15, 2020

codecov (bot) commented Dec 15, 2020

Codecov Report

Merging #7006 (eaa49a3) into branch-0.18 (929c3f4) will increase coverage by 0.00%.
The diff coverage is n/a.

@@             Coverage Diff              @@
##           branch-0.18    #7006   +/-   ##
============================================
  Coverage        82.01%   82.01%           
============================================
  Files               96       96           
  Lines            16338    16340    +2     
============================================
+ Hits             13400    13402    +2     
  Misses            2938     2938           

| Impacted Files | Coverage Δ |
| --- | --- |
| python/cudf/cudf/io/dlpack.py | 95.23% <0.00%> (ø) |
| python/cudf/cudf/core/column/numerical.py | 94.56% <0.00%> (+0.02%) ⬆️ |

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 929c3f4...eaa49a3.

hyperbolic2346 (Contributor) left a comment

I wonder how much more of this is around. This is a subtle difference and easy to miss.

codereport (Contributor, Author) commented

> I wonder how much more of this is around. This is a subtle difference and easy to miss.

The fix is subtle, but without it the API (cudf::hash_partition) breaks completely when used with fixed_point columns. Most APIs already had fixed_width tests that the original fixed_point column support could piggyback on to confirm that it worked. APIs like this one had no such tests, so they may be broken, and we will only find out once they are used.

It is on my to-do list to go through every API at some point and make sure there are tests.
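
For readers unfamiliar with this kind of test, here is a minimal, self-contained sketch of a round-trip check: partition a decimal column by hash, then confirm that no row was lost or altered and that the scale survived. It is not libcudf code and does not reproduce the actual bug or fix. `DecimalColumn`, `hash_partition_sketch`, and `same_rows` are hypothetical names, the column is modeled as plain integers plus a shared scale, and `num_partitions` is assumed to be greater than zero.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <vector>

// Hypothetical stand-in for a decimal32 column (NOT libcudf's type):
// each value is rep * 10^scale, with one scale shared by the whole column.
struct DecimalColumn {
  std::vector<std::int32_t> rep;
  int scale = 0;
};

// Partitioned data plus the starting offset of each partition.
struct PartitionResult {
  DecimalColumn data;
  std::vector<std::size_t> offsets;
};

// Illustrative hash partition: rows are grouped by hash(rep) % num_partitions.
// Assumes num_partitions > 0.
PartitionResult hash_partition_sketch(DecimalColumn const& input, std::size_t num_partitions)
{
  std::vector<std::vector<std::int32_t>> buckets(num_partitions);
  std::hash<std::int32_t> hasher;
  for (auto r : input.rep) { buckets[hasher(r) % num_partitions].push_back(r); }

  PartitionResult out;
  out.data.scale = input.scale;  // the scale must be carried over unchanged
  for (auto const& bucket : buckets) {
    out.offsets.push_back(out.data.rep.size());
    out.data.rep.insert(out.data.rep.end(), bucket.begin(), bucket.end());
  }
  return out;
}

// True when both columns hold the same multiset of values with the same scale.
// Arguments are taken by value so they can be sorted locally.
bool same_rows(DecimalColumn a, DecimalColumn b)
{
  std::sort(a.rep.begin(), a.rep.end());
  std::sort(b.rep.begin(), b.rep.end());
  return a.scale == b.scale && a.rep == b.rep;
}
```

A test built on this sketch would partition a small column and check `same_rows(input, result.data)`. The same shape of check, written against the real API with fixed_point column wrappers, is the kind of coverage that would catch silent data corruption.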

harrism added the "6 - Okay to Auto-Merge" label and removed the "3 - Ready for Review" and "4 - Needs Review" labels Dec 15, 2020
rapids-bot (bot) merged commit b370963 into rapidsai:branch-0.18 Dec 15, 2020
Successfully merging this pull request may close these issues.

[BUG] hash_partition silently corrupts fixed_point data