Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove make strings children with null mask #8830

Conversation

davidwendt
Copy link
Contributor

Closes #8580

The cudf::strings::detail::make_strings_children_with_null_mask utility was created temporarily to help build the output column validities bitmask for join_lists_elements (for strings) and lists::interleave_columns (for strings). But it used a temporary int8_t device vector to hold single-bit values. It would then convert the int8 column into a bitmask with a kernel call. This PR removes the utility in favor of executing a kernel using the cudf::detail::valid_if utility to build the bitmask directly without requiring a temporary buffer. Removing the temporary buffer from the join_list_elements strings API was not difficult. The temporary buffer is still used in the lists::interleave_columns for now.

A follow on PR should change this to utilize the output bitmask and directly set the bits rather than using a temporary int8 buffer that gets converted to a bitmask. This approach could also be used in the join_lists_element to ultimately avoid the valid_if call.

Removing this utility simplifies the code a bit and should speed up compiling any source file that includes cudf/strings/detail/utilities.cuh (~160 files right now).

@davidwendt davidwendt added 2 - In Progress Currently a work in progress libcudf Affects libcudf (C++/CUDA) code. improvement Improvement / enhancement to an existing function non-breaking Non-breaking change labels Jul 22, 2021
@davidwendt davidwendt self-assigned this Jul 22, 2021
@codecov
Copy link

codecov bot commented Jul 22, 2021

Codecov Report

Merging #8830 (df97ca4) into branch-21.10 (18f7c01) will decrease coverage by 0.08%.
The diff coverage is n/a.

❗ Current head df97ca4 differs from pull request most recent head 9403ae8. Consider uploading reports for the commit 9403ae8 to get more accurate results
Impacted file tree graph

@@               Coverage Diff                @@
##           branch-21.10    #8830      +/-   ##
================================================
- Coverage         10.67%   10.59%   -0.09%     
================================================
  Files               110      116       +6     
  Lines             18271    19037     +766     
================================================
+ Hits               1951     2017      +66     
- Misses            16320    17020     +700     
Impacted Files Coverage Δ
python/cudf/cudf/__init__.py 0.00% <ø> (ø)
python/cudf/cudf/core/__init__.py 0.00% <ø> (ø)
python/cudf/cudf/core/column/categorical.py 0.00% <ø> (ø)
python/cudf/cudf/core/column/column.py 0.00% <ø> (ø)
python/cudf/cudf/core/column/lists.py 0.00% <ø> (ø)
python/cudf/cudf/core/column/numerical.py 0.00% <ø> (ø)
python/cudf/cudf/core/column/string.py 0.00% <ø> (ø)
python/cudf/cudf/core/column/struct.py 0.00% <ø> (ø)
python/cudf/cudf/core/dataframe.py 0.00% <ø> (ø)
python/cudf/cudf/core/frame.py 0.00% <ø> (ø)
... and 73 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 904222b...9403ae8. Read the comment docs.

@davidwendt davidwendt added 3 - Ready for Review Ready for review by team and removed 2 - In Progress Currently a work in progress labels Jul 27, 2021
@davidwendt davidwendt marked this pull request as ready for review July 27, 2021 18:25
@davidwendt davidwendt requested a review from a team as a code owner July 27, 2021 18:25
Copy link
Contributor

@codereport codereport left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm 👍

@davidwendt
Copy link
Contributor Author

@gpucibot merge

@rapids-bot rapids-bot bot merged commit 2c4d984 into rapidsai:branch-21.10 Aug 4, 2021
@davidwendt davidwendt deleted the remove-make-strings-children-with-null-mask branch August 4, 2021 20:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
3 - Ready for Review Ready for review by team improvement Improvement / enhancement to an existing function libcudf Affects libcudf (C++/CUDA) code. non-breaking Non-breaking change
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[FEA] Remove make_strings_children_with_null_mask utility
3 participants