[FEA] Struct accessor from dask-cudf #8658
beckernick added the feature request, Python, and dask labels on Jul 6, 2021
I'm beginning to work on this issue.
Any update here @NV-jpt? (No worries if not)
I've created a draft pull request for this issue: #8874. One of the tests I wrote is failing, so I expect we still need to do a bit more digging, but I think we are just about done!
rapids-bot bot pushed a commit that referenced this issue on Aug 24, 2021
This PR implements the 'Struct Accessor' feature requested for dask-cudf (Issue #8658). A StructMethod class is implemented to expose the 'field(key)' method in dask-cudf.

Examples
--------
>>> s = cudf.Series([{'a': 1, 'b': 2}, {'a': 3, 'b': 4}])
>>> ds = dask_cudf.from_cudf(s, 2)
>>> ds.struct.field(0).compute()
0    1
1    3
dtype: int64
>>> ds.struct.field('a').compute()
0    1
1    3
dtype: int64

Authors:
- https://github.com/NV-jpt
- https://github.com/shaneding

Approvers:
- Richard (Rick) Zamora (https://github.com/rjzamora)
- Ashwin Srinath (https://github.com/shwina)

URL: #8874
This was implemented in #8874. Closing.

import cudf
import dask_cudf

df = cudf.DataFrame(
    {"col": [
        {"a": 5, "b": 10},
        {"a": 3, "b": 7},
        {"a": -3, "b": 11}
    ]}
)
ddf = dask_cudf.from_cudf(df, 2)
ddf.col.struct.field("a").compute()

0    5
1    3
2   -3
Name: col, dtype: int64
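For readers without a GPU at hand, the behaviour of field(key) in the example above can be approximated in plain Python. This is only an illustrative sketch, not the dask-cudf implementation: the helper names `field` and `partitioned_field` are hypothetical, struct rows are modelled as ordinary dicts, and it assumes the accessor simply applies the lookup to each partition independently before the results are concatenated.

```python
from typing import Any, Dict, List, Union

Row = Dict[str, Any]


def field(rows: List[Row], key: Union[int, str]) -> List[Any]:
    """Hypothetical stand-in for struct.field(key) on a single partition."""
    out = []
    for row in rows:
        if isinstance(key, int):
            # Positional lookup: dicts keep insertion order (Python 3.7+),
            # so index 0 is the first declared field.
            name = list(row)[key]
        else:
            name = key
        out.append(row[name])
    return out


def partitioned_field(partitions: List[List[Row]], key: Union[int, str]) -> List[Any]:
    """Apply the lookup per partition, then concatenate (assumed behaviour)."""
    result: List[Any] = []
    for part in partitions:
        result.extend(field(part, key))
    return result


rows = [{"a": 5, "b": 10}, {"a": 3, "b": 7}, {"a": -3, "b": 11}]
partitions = [rows[:2], rows[2:]]  # two partitions, mirroring from_cudf(df, 2)

by_name = partitioned_field(partitions, "a")
by_pos = partitioned_field(partitions, 0)
print(by_name)  # [5, 3, -3]
print(by_pos)   # [5, 3, -3]
```

Both lookups return the same values because "a" is the first field of every struct, which matches the field(0) and field('a') calls in the commit message.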
Today, in cuDF Python I can extract individual fields from a struct column with the struct.field(key) API. Because manipulating struct columns is common in big data processing frameworks, we should support this accessor in Dask-cuDF (like we do with the list accessor, shown below).

This is likely blocked by #8657, based on the traceback.
List accessor