[FEA] Support exploding nested type columns #2975
Labels
- feature request: New feature or request
- libcudf: Affects libcudf (C++/CUDA) code.
- Python: Affects Python cuDF API.
- Spark: Functionality that helps Spark RAPIDS
Comments
beckernick added the feature request and libcudf labels on Oct 4, 2019
beckernick changed the title from "[FEA] Support exploding nested type column" to "[FEA] Support exploding nested type columns" on Oct 4, 2019
This was referenced Aug 18, 2020
kkraus14 added the Python and Spark labels on Feb 25, 2021
Relevant libcudf PR that implemented the functionality: https://github.com/rapidsai/cudf/pull/7140/files
Looks like Marlene is already working on this issue:
rapids-bot bot pushed a commit that referenced this issue on Mar 18, 2021
Closes #2975

This PR introduces `explode` API, which flattens list columns and turns list elements into rows.

Example:
```python
>>> s = cudf.Series([[1, 2, 3], [], None, [4, 5]])
>>> s
0    [1, 2, 3]
1           []
2         None
3       [4, 5]
dtype: list
>>> s.explode()
0       1
0       2
0       3
1    <NA>
2    <NA>
3       4
3       5
dtype: int64
```

Supersedes #7538

Authors:
- Michael Wang (@isVoid)

Approvers:
- Keith Kraus (@kkraus14)
- GALI PREM SAGAR (@galipremsagar)

URL: #7607
When processing a nested type column, I'd like to be able to `explode` the column into a non-nested type column, like in Spark SQL or pandas. Spark API doc.
PySpark:
Pandas:
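For reference, here is a minimal sketch of the pandas behavior the request refers to, assuming pandas 0.25 or later (where `Series.explode` is available). The data is illustrative and not taken from the original issue.

```python
import pandas as pd

# Illustrative data (an assumption, not from the original issue):
# a column of lists that includes an empty list and a missing value.
s = pd.Series([[1, 2, 3], [], None, [4, 5]])

# Each list element becomes its own row, and the original index label
# is repeated for every element that came from the same list.
# An empty list and a missing value each produce a single null row.
exploded = s.explode()
print(exploded)
```

The repeated index labels make it possible to trace each output row back to the list it came from, which is the same row-per-element semantics requested here for cuDF list columns.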