Remove make strings children with null mask #8830
Codecov Report
@@ Coverage Diff @@
## branch-21.10 #8830 +/- ##
================================================
- Coverage 10.67% 10.59% -0.09%
================================================
Files 110 116 +6
Lines 18271 19037 +766
================================================
+ Hits 1951 2017 +66
- Misses 16320 17020 +700
lgtm 👍
@gpucibot merge
Closes #8580

The cudf::strings::detail::make_strings_children_with_null_mask utility was created temporarily to help build the output column validity bitmask for join_list_elements (for strings) and lists::interleave_columns (for strings). However, it used a temporary int8_t device vector to hold single-bit values, and then converted that int8_t column into a bitmask with a separate kernel call. This PR removes the utility in favor of executing a kernel using the cudf::detail::valid_if utility to build the bitmask directly, without requiring a temporary buffer. Removing the temporary buffer from the join_list_elements strings API was not difficult. The temporary buffer is still used in lists::interleave_columns for now.

A follow-on PR should change lists::interleave_columns to use the output bitmask and set the bits directly, rather than using a temporary int8_t buffer that gets converted to a bitmask. This approach could also be used in join_list_elements to ultimately avoid the valid_if call.

Removing this utility simplifies the code a bit and should speed up compiling any source file that includes cudf/strings/detail/utilities.cuh (~160 files right now).
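
For readers unfamiliar with the two patterns, here is a minimal host-side sketch of the difference. It is a CPU analogy only, not the libcudf implementation: the names mask_via_int8_buffer, mask_via_predicate, bitmask_word, and num_words are hypothetical, and the loops stand in for what would be device kernels. It assumes 32-bit mask words with least-significant-bit-first ordering.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical names throughout; a host-side analogy, not libcudf code.
using bitmask_word = std::uint32_t;
constexpr std::size_t bits_per_word = 32;

// Number of 32-bit words needed to hold n validity bits.
inline std::size_t num_words(std::size_t n)
{
  return (n + bits_per_word - 1) / bits_per_word;
}

// Old pattern: write one int8_t per row into a temporary buffer,
// then run a second pass that packs those bytes into a bitmask.
template <typename Pred>
std::vector<bitmask_word> mask_via_int8_buffer(std::size_t n, Pred is_valid)
{
  std::vector<std::int8_t> flags(n);  // temporary buffer: one byte per row
  for (std::size_t i = 0; i < n; ++i) {
    flags[i] = is_valid(i) ? 1 : 0;
  }

  std::vector<bitmask_word> mask(num_words(n), 0);
  for (std::size_t i = 0; i < n; ++i) {  // second pass: bytes -> bits
    if (flags[i]) { mask[i / bits_per_word] |= bitmask_word{1} << (i % bits_per_word); }
  }
  return mask;
}

// valid_if-style pattern: evaluate the predicate and set the bit in a
// single pass, with no intermediate per-row buffer.
template <typename Pred>
std::vector<bitmask_word> mask_via_predicate(std::size_t n, Pred is_valid)
{
  std::vector<bitmask_word> mask(num_words(n), 0);
  for (std::size_t i = 0; i < n; ++i) {
    if (is_valid(i)) { mask[i / bits_per_word] |= bitmask_word{1} << (i % bits_per_word); }
  }
  return mask;
}
```

Both functions produce the same bitmask for the same predicate; the second simply skips the n-byte temporary and the extra pass over it, which is the saving described above.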