-
Notifications
You must be signed in to change notification settings - Fork 908
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Convert cudf::concatenate APIs to use spans and device_uvector #7621
Convert cudf::concatenate APIs to use spans and device_uvector #7621
Conversation
Codecov Report
@@ Coverage Diff @@
## branch-0.19 #7621 +/- ##
===============================================
+ Coverage 81.86% 82.57% +0.70%
===============================================
Files 101 101
Lines 16884 17551 +667
===============================================
+ Hits 13822 14492 +670
+ Misses 3062 3059 -3
Continue to review full report at Codecov.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few very minor suggestions.
This change helps more than I expected. Concatenate is significantly faster now, except for concatenating strings columns, where the change seems to be noise.
|
@harrism Does the perf chart show relative difference? |
@vuule yes they are relative. The computation is With my latest commit I have also replaced
|
… them with vectors
# these don't exist in the C++, but construction from vector is possible since | ||
# a host_span can be implicitly constructed from a vector, so we add these for convenience | ||
cdef device_buffer concatenate_masks "cudf::concatenate_masks"( | ||
vector[column_view] views | ||
) except + | ||
cdef unique_ptr[column] concatenate_columns "cudf::concatenate"( | ||
const vector[column_view] columns | ||
vector[column_view] columns | ||
) except + | ||
cdef unique_ptr[table] concatenate_tables "cudf::concatenate"( | ||
const vector[table_view] tables | ||
vector[table_view] tables | ||
) except + |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@shwina these are the same as the original Cython code for concatenate, and the only ones currently used. So one idea is that I could revert my Cython changes in this PR. We could then wait and add host_span
in Cython only if we need it... What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Cython definitions are nice to have. I suspect at some point we're going to need them, so let's not delete. Agree about keeping the concat
definitions the same though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cython LGTM!
@gpucibot merge |
Contributes to #7287
This PR replaces
std::vector
withhost_span
in public and detailcudf::contatenate
functions, and replacesrmm::device_vector
withrmm::device_uvector
in the concatenate implementations.It also strengthens the SFINAE restrictions on
cudf::host_span
andcudf::device_span
so that they cannot be constructed from containers unless the container's value_type is the same as the span's value_type.This PR also