Implement `interleave_columns` for lists with arbitrary nested type #9130
Codecov Report
@@ Coverage Diff @@
## branch-21.10 #9130 +/- ##
===============================================
Coverage ? 10.83%
===============================================
Files ? 114
Lines ? 19101
Branches ? 0
===============================================
Hits ? 2070
Misses ? 17031
Partials ? 0
Clever implementation to make use of the fact that
Learnt some advanced skills on how to use cuDF developer API (like gather) from this PR :)
Looks great.
CUDF_EXPECTS(entry_type == child_col.type(),
             "The types of entries in the input columns must be the same.");
}

if (input.num_rows() == 0) { return cudf::empty_like(input.column(0)); }
if (input.num_columns() == 1) { return std::make_unique<column>(*(input.begin()), stream, mr); }

// For nested types, we rely on the `concatenate_and_gather` method, which costs more memory due
This would be another good candidate for a hypothetical multi_gather function if we had it. Each entry in the gather map would be a pair {column_index, row_index} selecting rows from different columns in an input table.
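A minimal host-side sketch of that idea, using `std::vector` in place of cuDF columns. Everything here is an assumption for illustration: `table_t`, `gather_entry`, `multi_gather`, and `interleave_with_multi_gather` are hypothetical names, not existing cuDF APIs, and a real version would run on the GPU.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical host-side stand-in for a table: a vector of columns of one type.
template <typename T>
using table_t = std::vector<std::vector<T>>;

// Each gather-map entry is {column_index, row_index}.
using gather_entry = std::pair<std::size_t, std::size_t>;

// Hypothetical multi_gather: output[i] = input[map[i].first][map[i].second].
template <typename T>
std::vector<T> multi_gather(table_t<T> const& input, std::vector<gather_entry> const& map)
{
  std::vector<T> output;
  output.reserve(map.size());
  for (auto const& entry : map) {
    assert(entry.first < input.size());                // valid column index
    assert(entry.second < input[entry.first].size());  // valid row index
    output.push_back(input[entry.first][entry.second]);
  }
  return output;
}

// Interleaving expressed as a multi_gather: visit rows in order and,
// within each row, visit the columns in order.
template <typename T>
std::vector<T> interleave_with_multi_gather(table_t<T> const& input)
{
  if (input.empty()) { return {}; }
  std::size_t const num_rows = input.front().size();
  std::vector<gather_entry> map;
  map.reserve(num_rows * input.size());
  for (std::size_t row = 0; row < num_rows; ++row) {
    for (std::size_t col = 0; col < input.size(); ++col) {
      map.emplace_back(col, row);
    }
  }
  return multi_gather(input, map);
}
```

With such a function, the interleaved result could be built directly from the input columns, without first concatenating them into a temporary column.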
Thanks Dave. The idea is great. I have filed an issue so we can track it here: #9175
@gpucibot merge
This PR extends lists column support in the `interleave_columns` API by adding support for nested types in the list entries. As a result, `interleave_columns` can now be called on lists columns of any type, such as lists of structs, lists of lists, lists of lists of lists, and so on. In addition, this PR refactors the existing overloads to use a new style of SFINAE implementation.

Closes #9106.
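To illustrate the semantics for nested entries, here is a host-side sketch of a concatenate-then-gather approach. It is not the cuDF implementation: `concatenate` and `interleave_columns_sketch` are hypothetical helpers, `std::vector` stands in for columns, and the gather-map formula is an assumption about one plausible way to do it. Because the element type `T` is arbitrary, the same code handles lists, lists of lists, and so on.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Step 1 (hypothetical helper): concatenate all columns end to end,
// i.e. all rows of column 0, then all rows of column 1, and so on.
template <typename T>
std::vector<T> concatenate(std::vector<std::vector<T>> const& columns)
{
  std::vector<T> out;
  for (auto const& col : columns) { out.insert(out.end(), col.begin(), col.end()); }
  return out;
}

// Step 2 (hypothetical helper): gather from the concatenated column so that
// output row i comes from input column (i % num_cols), row (i / num_cols).
template <typename T>
std::vector<T> interleave_columns_sketch(std::vector<std::vector<T>> const& columns)
{
  if (columns.empty()) { return {}; }
  std::size_t const num_cols = columns.size();
  std::size_t const num_rows = columns.front().size();
  for (auto const& col : columns) { assert(col.size() == num_rows); }

  auto const concatenated = concatenate(columns);  // extra temporary copy
  std::vector<T> out;
  out.reserve(concatenated.size());
  for (std::size_t i = 0; i < num_rows * num_cols; ++i) {
    std::size_t const src_col = i % num_cols;
    std::size_t const src_row = i / num_cols;
    out.push_back(concatenated[src_col * num_rows + src_row]);
  }
  return out;
}
```

For two columns `A` and `B` whose rows are themselves lists of lists, the result is `A[0], B[0], A[1], B[1], ...`, with each nested row copied as a unit. The temporary concatenated column is the extra memory cost of this approach.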