Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement lists::concatenate_list_elements #8231

Merged

Conversation

ttnghia
Copy link
Contributor

@ttnghia ttnghia commented May 12, 2021

This PR implements lists::concatenate_list_elements for list type. Given a lists column in which each row is a list of lists, the output column is generated by concatenating all lists in the same row into a single list.

Example:

l = [ [{1, 2}, {3, 4}, {5}], [{6}, {}, {7, 8, 9}] ]
r = lists::concatenate_list_elements(l);
r is [ {1, 2, 3, 4, 5}, {6, 7, 8, 9} ]

This closes #8164. In addition, lists::concatenate_rows is rewritten using lists::interleave_columns following by lists::concatenate_list_elements, which is significantly shorter.

@ttnghia ttnghia added feature request New feature or request 3 - Ready for Review Ready for review by team libcudf Affects libcudf (C++/CUDA) code. non-breaking Non-breaking change labels May 12, 2021
@ttnghia ttnghia self-assigned this May 12, 2021
@ttnghia ttnghia requested review from a team as code owners May 12, 2021 22:56
@ttnghia ttnghia requested a review from a team as a code owner May 12, 2021 22:56
@github-actions github-actions bot added CMake CMake build issue conda labels May 12, 2021
@ttnghia
Copy link
Contributor Author

ttnghia commented May 12, 2021

Reference: #8006

@codecov
Copy link

codecov bot commented May 13, 2021

Codecov Report

❗ No coverage uploaded for pull request base (branch-21.06@414e9bb). Click here to learn what that means.
The diff coverage is n/a.

Impacted file tree graph

@@               Coverage Diff               @@
##             branch-21.06    #8231   +/-   ##
===============================================
  Coverage                ?   82.84%           
===============================================
  Files                   ?      105           
  Lines                   ?    17865           
  Branches                ?        0           
===============================================
  Hits                    ?    14800           
  Misses                  ?     3065           
  Partials                ?        0           

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 414e9bb...0bb3be8. Read the comment docs.

@github-actions github-actions bot added Java Affects Java cuDF API. Python Affects Python cuDF API. labels May 13, 2021
@ttnghia ttnghia force-pushed the lists_concatenate_list_elements branch from bac62a1 to afce24e Compare May 13, 2021 02:47
@ttnghia ttnghia requested a review from a team as a code owner May 13, 2021 02:47
@ttnghia ttnghia requested a review from a team as a code owner May 13, 2021 02:47
@ttnghia ttnghia requested a review from shwina May 13, 2021 02:47
Copy link
Contributor

@robertmaynard robertmaynard left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approving CMake changes.

Copy link
Member

@jlowe jlowe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Java approval

Copy link
Member

@ajschmidt8 ajschmidt8 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approving ops-codeowner file changes

Copy link
Contributor

@devavret devavret left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code is giving me major déjà vu. Can you try to consolidate this with lists::concatinate_rows? There seems to be lot of shared logic. Even better, it looks like the entirety of lists::concatinate_rows can be reimplemented as an interleave_columns followed by a concatenate_list_elements.

cpp/src/lists/combine/concatenate_list_elements.cu Outdated Show resolved Hide resolved
@ttnghia ttnghia requested a review from devavret May 18, 2021 22:39
@@ -5,7 +5,7 @@ from libcpp.memory cimport unique_ptr
from cudf._lib.cpp.column.column cimport column
from cudf._lib.cpp.table.table_view cimport table_view

cdef extern from "cudf/lists/concatenate_rows.hpp" namespace \
cdef extern from "cudf/lists/combine.hpp" namespace \
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This file would likely have to be renamed too to combine.pxd

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. Done.

Copy link
Contributor

@devavret devavret left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This much better

@ttnghia
Copy link
Contributor Author

ttnghia commented May 20, 2021

Rerun tests.

@ttnghia
Copy link
Contributor Author

ttnghia commented May 20, 2021

@gpucibot merge

@rapids-bot rapids-bot bot merged commit 944e932 into rapidsai:branch-21.06 May 20, 2021
@ttnghia ttnghia deleted the lists_concatenate_list_elements branch May 21, 2021 22:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
3 - Ready for Review Ready for review by team CMake CMake build issue feature request New feature or request Java Affects Java cuDF API. libcudf Affects libcudf (C++/CUDA) code. non-breaking Non-breaking change Python Affects Python cuDF API.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[FEA] Support concatenate_list_elements for list type
6 participants