Change strings copy_if_else to use optional-iterator instead of pair-iterator #9266
rerun tests
Codecov Report
@@               Coverage Diff                @@
##           branch-21.12    #9266      +/-   ##
================================================
- Coverage         10.79%   10.78%   -0.01%
================================================
  Files               116      116
  Lines             18869    19438     +569
================================================
+ Hits               2036     2096      +60
- Misses            16833    17342     +509
@@ -26,6 +26,7 @@
 #include <cudf/utilities/type_dispatcher.hpp>

 #include <rmm/cuda_stream_view.hpp>
+#include <rmm/exec_policy.hpp>
I'm curious why this change was necessary.
This was previously included in `copy_if_else.cuh`, which did not use it. Cleaning up the headers there required moving it here, where it is actually used.
@gpucibot merge
This PR changes the `cudf::detail::copy_if_else` utility functions to accept an optional-iterator instead of a pair-iterator. This improves the compile time of source files by generating 4x fewer kernels: each of the two input data arrays may or may not have nulls, so 4 different pair-iterator combinations had to be created to call it. The optional-iterator allows the nulls check to occur at runtime instead of compile time.

The changes in this PR are for the callers of `detail::copy_if_else` to provide optional-iterators instead of pair-iterators. This PR is dependent on the changes in #9306.

The benchmarks for the affected calling functions showed no significant change in runtime performance using the single optional-iterator over 4 unique pair-iterators. Two additional benchmarks are included to cover the non-null measurement, which this PR impacts the most.

Also, `copy_tests.cu` was renamed to `copy_tests.cpp`, and the test that launched the internal `cudf::detail::copy_if_else_kernel` was replaced with one that uses a data set large enough to require multiple blocks.

Related changes for the strings-specific `cudf::strings::detail::copy_if_else` are in #9266.

Authors:
- David Wendt (https://github.com/davidwendt)

Approvers:
- Bradley Dice (https://github.com/bdice)
- Robert Maynard (https://github.com/robertmaynard)
- Jake Hemstad (https://github.com/jrhemstad)

URL: #9324
This is part of a bigger change happening for `copy.cu`, and specifically `copy_if_else.cuh`, changing those to use `make_optional_iterator` instead of `make_pair_iterator`. This is a piece of that overall change that could be separated into a smaller PR. The main effect here is to reduce the compile size and time of `segmented_shift.cu`. No behavior has changed, and the benchmark results for `cudf::shift` have also not changed.
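
For readers unfamiliar with the pair-iterator vs. optional-iterator distinction, here is a minimal host-only C++17 sketch of the idea. It is not the libcudf implementation: `column_sketch`, `optional_accessor`, `make_optional_accessor`, `copy_if_else_sketch`, and `select_even_rows` are hypothetical names made up for illustration, and the real code runs as CUDA kernels over device iterators. The point it tries to show is that when nullability is a runtime flag on the accessor, a templated `copy_if_else` is instantiated once for a given filter, rather than once per combination of nullable and non-nullable inputs.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <vector>

// Hypothetical stand-in for a nullable column (not a cudf type).
// An empty `valid` vector means the column has no nulls.
struct column_sketch {
  std::vector<int> data;
  std::vector<bool> valid;
  bool has_nulls() const { return !valid.empty(); }
};

// Optional-style accessor: nullability is checked at runtime, so a single
// accessor type serves both nullable and non-nullable columns.
// With a compile-time flag instead (e.g. a bool template parameter), two
// inputs would give up to 2 x 2 = 4 accessor type combinations.
struct optional_accessor {
  column_sketch const* col;
  bool has_nulls;
  std::optional<int> operator()(std::size_t i) const {
    if (has_nulls && !col->valid[i]) { return std::nullopt; }
    return col->data[i];
  }
};

inline optional_accessor make_optional_accessor(column_sketch const& c) {
  return {&c, c.has_nulls()};
}

// Instantiated once per distinct (Lhs, Rhs, Filter) type triple.
// Both inputs use optional_accessor here, so there is one instantiation
// per filter type regardless of which inputs contain nulls.
template <typename Lhs, typename Rhs, typename Filter>
std::vector<std::optional<int>> copy_if_else_sketch(std::size_t size,
                                                    Lhs lhs,
                                                    Rhs rhs,
                                                    Filter filter) {
  std::vector<std::optional<int>> out;
  out.reserve(size);
  for (std::size_t i = 0; i < size; ++i) {
    out.push_back(filter(i) ? lhs(i) : rhs(i));
  }
  return out;
}

// Example caller: take even rows from lhs and odd rows from rhs.
inline std::vector<std::optional<int>> select_even_rows(column_sketch const& lhs,
                                                        column_sketch const& rhs) {
  assert(lhs.data.size() == rhs.data.size());
  return copy_if_else_sketch(lhs.data.size(),
                             make_optional_accessor(lhs),
                             make_optional_accessor(rhs),
                             [](std::size_t i) { return i % 2 == 0; });
}
```

Calling `select_even_rows` with one nullable and one non-nullable column goes through the same `copy_if_else_sketch` instantiation as a call with two non-nullable columns. The cost is a per-element branch on `has_nulls`, which is the kind of runtime check the benchmarks mentioned above were meant to measure.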