[REVIEW] Handle nan values correctly in Series.one_hot_encoding #7059

Merged (1 commit) on Jan 7, 2021

galipremsagar (Contributor)

Fixes: #7056

This PR handles nan values separately from <NA> values in Series.one_hot_encoding when a given input category is None. Previously, nan and <NA> values were treated as the same value when the category was None, so a None category also matched nan entries.
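
To make the distinction concrete, here is a minimal pure-Python sketch of the intended behavior. It is not the cuDF implementation: `one_hot_encode` and `_is_nan` are hypothetical helpers, Python's `None` stands in for `<NA>`, and the separate NaN branch is an assumption added for illustration.

```python
import math


def _is_nan(x):
    # True only for a floating-point NaN; None (stand-in for <NA>) is not NaN.
    return isinstance(x, float) and math.isnan(x)


def one_hot_encode(values, cats):
    """Return one indicator column (a list of 0/1) per category in `cats`."""
    columns = []
    for cat in cats:
        if cat is None:
            # Null category: match only missing entries, never NaN.
            col = [int(v is None) for v in values]
        elif _is_nan(cat):
            # NaN never compares equal to itself, so use an explicit check.
            col = [int(_is_nan(v)) for v in values]
        else:
            # Ordinary category: plain equality, skipping missing entries.
            col = [int(v is not None and v == cat) for v in values]
        columns.append(col)
    return columns


values = [1.0, float("nan"), None, 2.0]
encoded = one_hot_encode(values, cats=[1.0, None, float("nan")])
for cat, col in zip(["1.0", "None", "nan"], encoded):
    print(cat, col)
```

In this sketch the `None` column flags only the missing entry at position 2, while the nan entry at position 1 gets its own column. Merging the two would put a 1 in both positions of the `None` column, which is the behavior the PR description says was wrong.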

galipremsagar added the bug (Something isn't working), 3 - Ready for Review (Ready for review by team), Python (Affects Python cuDF API.), and 4 - Needs cuDF (Python) Reviewer labels on Dec 31, 2020
galipremsagar requested a review from a team as a code owner on December 31, 2020 04:58
galipremsagar self-assigned this on Dec 31, 2020
galipremsagar added the non-breaking (Non-breaking change) label on Dec 31, 2020

codecov bot commented Dec 31, 2020

Codecov Report

Merging #7059 (484074a) into branch-0.18 (28d18d6) will increase coverage by 0.01%.
The diff coverage is 66.66%.

@@               Coverage Diff               @@
##           branch-0.18    #7059      +/-   ##
===============================================
+ Coverage        82.09%   82.11%   +0.01%     
===============================================
  Files               97       97              
  Lines            16477    16482       +5     
===============================================
+ Hits             13527    13534       +7     
+ Misses            2950     2948       -2     

| Impacted Files | Coverage Δ |
|---|---|
| python/cudf/cudf/core/series.py | 91.08% <66.66%> (-0.07%) ⬇️ |
| python/cudf/cudf/_fuzz_testing/fuzzer.py | 0.00% <0.00%> (ø) |
| python/cudf/cudf/utils/hash_vocab_utils.py | 100.00% <0.00%> (ø) |
| python/cudf/cudf/core/abc.py | 91.48% <0.00%> (+4.25%) ⬆️ |
| python/cudf/cudf/utils/gpu_utils.py | 58.53% <0.00%> (+4.87%) ⬆️ |

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data

kkraus14 added the 5 - Ready to Merge (Testing and reviews complete, ready to merge) and 6 - Okay to Auto-Merge labels and removed the 3 - Ready for Review (Ready for review by team) and 4 - Needs cuDF (Python) Reviewer labels on Jan 7, 2021
rapids-bot merged commit 9439ed8 into rapidsai:branch-0.18 on Jan 7, 2021

Linked issue that may be closed by this pull request: [BUG] one_hot_encoding is not handling correctly for cases when data has nan values