Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Change strings copy_if_else to use optional-iterator instead of pair-iterator #9266

Merged

Conversation

davidwendt
Copy link
Contributor

This part of bigger change happening for copy.cu and specifically copy_if_else.cuh changing those to use make_optional_iterator instead of make_pair_iterator. This is a piece of that overall change that could be separated into a smaller PR. The main effect here is to reduce the compile size and time of segmented_shift.cu.

No behavior has changed and the benchmark results for cudf::shift has also not changed.

@davidwendt davidwendt added 2 - In Progress Currently a work in progress libcudf Affects libcudf (C++/CUDA) code. improvement Improvement / enhancement to an existing function non-breaking Non-breaking change labels Sep 21, 2021
@davidwendt davidwendt self-assigned this Sep 21, 2021
@davidwendt
Copy link
Contributor Author

rerun tests

@codecov
Copy link

codecov bot commented Sep 22, 2021

Codecov Report

Merging #9266 (85286ae) into branch-21.12 (ab4bfaa) will decrease coverage by 0.00%.
The diff coverage is 2.00%.

Impacted file tree graph

@@               Coverage Diff                @@
##           branch-21.12    #9266      +/-   ##
================================================
- Coverage         10.79%   10.78%   -0.01%     
================================================
  Files               116      116              
  Lines             18869    19438     +569     
================================================
+ Hits               2036     2096      +60     
- Misses            16833    17342     +509     
Impacted Files Coverage Δ
python/cudf/cudf/core/_base_index.py 0.00% <0.00%> (ø)
python/cudf/cudf/core/column/categorical.py 0.00% <0.00%> (ø)
python/cudf/cudf/core/column/column.py 0.00% <0.00%> (ø)
python/cudf/cudf/core/column/datetime.py 0.00% <0.00%> (ø)
python/cudf/cudf/core/column/numerical.py 0.00% <0.00%> (ø)
python/cudf/cudf/core/column/string.py 0.00% <0.00%> (ø)
python/cudf/cudf/core/column/timedelta.py 0.00% <0.00%> (ø)
python/cudf/cudf/core/dataframe.py 0.00% <0.00%> (ø)
python/cudf/cudf/core/frame.py 0.00% <0.00%> (ø)
python/cudf/cudf/core/index.py 0.00% <0.00%> (ø)
... and 73 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update d68e626...85286ae. Read the comment docs.

@davidwendt davidwendt added 3 - Ready for Review Ready for review by team and removed 2 - In Progress Currently a work in progress labels Sep 22, 2021
@davidwendt davidwendt marked this pull request as ready for review September 22, 2021 13:34
@davidwendt davidwendt requested a review from a team as a code owner September 22, 2021 13:34
cpp/src/copying/copy.cu Show resolved Hide resolved
@@ -26,6 +26,7 @@
#include <cudf/utilities/type_dispatcher.hpp>

#include <rmm/cuda_stream_view.hpp>
#include <rmm/exec_policy.hpp>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm curious why this change was necessary.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was previously included in the include file copy_if_else.cuh which did not use it. Cleaning up the headers there required moving it here where it is actually used.

@davidwendt
Copy link
Contributor Author

@gpucibot merge

@rapids-bot rapids-bot bot merged commit a743ce8 into rapidsai:branch-21.12 Oct 6, 2021
@davidwendt davidwendt deleted the strings-pair-to-optional-iter branch October 6, 2021 18:17
rapids-bot bot pushed a commit that referenced this pull request Oct 13, 2021
This PR changes the `cudf::detail::copy_if_else` utility functions to accept an optional-iterator instead of a pair-iterator. This improves the compile time of source files by generating 4x less kernels since the two input data arrays can each have nulls requiring 4 different pair-iterators to be created to call it. The optional-iterator allows the nulls check to occur at runtime instead of compile time.

The changes in this PR are for the callers of `detail::copy_if_else` to provide optional-iterators instead of pair-iterators. This PR is dependent on the changes in #9306 

The benchmarks for the effected calling functions showed no significant change in runtime performance using the single optional-iterator over 4 unique pair-iterators. Two additional benchmarks are included cover non-null measurement which this PR impacts the most.

Also, the `copy_tests.cu` was renamed `copy_tests.cpp` and the test that launched to the internal  `cudf::detail::copy_if_else_kernel` was replaced with one with a data-set large enough to require multiple blocks.

Related changes for the strings specific `cudf::strings::detail::copy_if_else` are in #9266

Authors:
  - David Wendt (https://github.com/davidwendt)

Approvers:
  - Bradley Dice (https://github.com/bdice)
  - Robert Maynard (https://github.com/robertmaynard)
  - Jake Hemstad (https://github.com/jrhemstad)

URL: #9324
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
3 - Ready for Review Ready for review by team improvement Improvement / enhancement to an existing function libcudf Affects libcudf (C++/CUDA) code. non-breaking Non-breaking change
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants