String-to-boolean conversion is different from Pandas #8549
FYI, please see https://nvidia.slack.com/archives/CDTANRCTT/p1616688610161600
Codecov Report
@@ Coverage Diff @@
## branch-21.08 #8549 +/- ##
===============================================
Coverage ? 10.61%
===============================================
Files ? 109
Lines ? 18645
Branches ? 0
===============================================
Hits ? 1980
Misses ? 16665
Partials ? 0
@gpucibot re-run tests
@gpucibot merge
Fixes: #7875
Previously: Pandas treats all non-empty strings as true values when it converts strings to booleans, whereas cuDF accepted only strings that match the true string (`True` by default).
This PR resolves the mismatch by introducing the `str_to_boolean` method, which checks each entry of a string column for a length greater than 0 and replaces NaN values with `False` to mimic Pandas behavior.
Example:
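As a rough illustration of the two rules contrasted above, here is a pure-Python sketch. It is not the cuDF implementation: `str_to_boolean_sketch` and `match_true_string_sketch` are hypothetical helpers written only for this comparison, and `None` stands in for a missing (NaN/null) entry.

```python
from typing import List, Optional


def str_to_boolean_sketch(values: List[Optional[str]]) -> List[bool]:
    """Hypothetical stand-in for the rule this PR describes.

    Non-empty strings map to True, empty strings to False, and missing
    values (None here) are replaced with False.
    """
    return [value is not None and len(value) > 0 for value in values]


def match_true_string_sketch(
    values: List[Optional[str]], true_string: str = "True"
) -> List[bool]:
    """Hypothetical stand-in for the earlier rule: only an exact match
    with the true string counts as True."""
    return [value == true_string for value in values]


sample = ["True", "False", "abc", "", None]

new_rule = str_to_boolean_sketch(sample)
old_rule = match_true_string_sketch(sample)

print(new_rule)  # [True, True, True, False, False]
print(old_rule)  # [True, False, False, False, False]
```

The two outputs differ for `"False"` and `"abc"`: both are non-empty, so the length-based rule maps them to true, while the exact-match rule does not.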