Add methods for the Arrow PyCapsule Protocol to DataFrame/Column interchange protocol objects #342
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -350,6 +350,41 @@ def get_buffers(self) -> ColumnBuffers: | |
""" | ||
pass | ||
|
||
def __arrow_c_schema__(self) -> object: | ||
""" | ||
Export the data type of the Column to a Arrow C schema PyCapsule. | ||
|
||
Returns | ||
------- | ||
PyCapsule | ||
""" | ||
pass | ||
|
||
def __arrow_c_array__( | ||
self, requested_schema: Optional[object] = None | ||
) -> Tuple[object, object]: | ||
""" | ||
Export the Column as an Arrow C array and schema PyCapsule. | ||
|
||
If the Column consists of multiple chunks, this method should raise | ||
an error. | ||
|
||
Parameters | ||
---------- | ||
requested_schema : PyCapsule, default None | ||
The schema to which the dataframe should be casted, passed as a | ||
Review comment: nit: "casted" -> "cast" (also further down) |
||
PyCapsule containing a C ArrowSchema representation of the | ||
requested schema. | ||
If None, the array will be returned as-is, with a type matching the | ||
one returned by ``__arrow_c_schema__()``. | ||
|
||
Returns | ||
------- | ||
Tuple[PyCapsule, PyCapsule] | ||
A pair of PyCapsules containing a C ArrowSchema and ArrowArray, | ||
respectively. | ||
""" | ||
pass | ||
|
||
# def get_children(self) -> Iterable[Column]: | ||
# """ | ||
|
@@ -490,3 +525,32 @@ def get_chunks(self, n_chunks: Optional[int] = None) -> Iterable["DataFrame"]: | |
same way. | ||
""" | ||
pass | ||
|
||
def __arrow_c_schema__(self) -> object: | ||
""" | ||
Export the schema of the DataFrae to a Arrow C schema PyCapsule. | ||
Review comment: typo in DataFrame |
||
|
||
Returns | ||
------- | ||
PyCapsule | ||
""" | ||
pass | ||
|
||
def __arrow_c_stream__(self, requested_schema: Optional[object] = None) -> object: | ||
Review comment: Yet more conflicting terminology? Is "stream" supposed to mean "dataframe" here, rather than CUDA stream? If so, won't that conflict with device support later, and/or be confused with DLPack stream support? Review comment: How does one differentiate between a DataFrame and a struct column here (assuming those will be supported in the future)? |
||
""" | ||
Export the DataFrame as an Arrow C stream PyCapsule. | ||
|
||
Parameters | ||
---------- | ||
requested_schema : PyCapsule, default None | ||
The schema to which the dataframe should be casted, passed as a | ||
PyCapsule containing a C ArrowSchema representation of the | ||
requested schema. | ||
If None, the array will be returned as-is, with a type matching the | ||
one returned by ``__arrow_c_schema__()``. | ||
|
||
Returns | ||
------- | ||
PyCapsule | ||
""" | ||
pass |
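The single-chunk rule in the `Column.__arrow_c_array__` docstring above can be read as follows. This is only a minimal sketch of one possible interpretation, not the PR's implementation: `SketchColumn` and `export_chunks` are hypothetical names, the chunks are plain Python lists, and the returned pair uses placeholder objects where a real implementation would return two PyCapsules. `num_chunks()` and `get_chunks()` are assumed to behave as they do on the interchange protocol's Column.

```python
from typing import Iterable, List, Optional, Tuple


class SketchColumn:
    """Hypothetical stand-in for an interchange-protocol Column."""

    def __init__(self, chunks: List[List[int]]) -> None:
        self._chunks = chunks

    def num_chunks(self) -> int:
        return len(self._chunks)

    def get_chunks(self) -> Iterable["SketchColumn"]:
        # Each yielded column holds exactly one chunk.
        for chunk in self._chunks:
            yield SketchColumn([chunk])

    def __arrow_c_array__(
        self, requested_schema: Optional[object] = None
    ) -> Tuple[object, object]:
        # The docstring asks for an error when the column is not a
        # single contiguous chunk.
        if self.num_chunks() != 1:
            raise ValueError(
                "cannot export a multi-chunk Column as one Arrow C array"
            )
        # Placeholders only: a real implementation would return two
        # PyCapsules wrapping a C ArrowSchema and a C ArrowArray.
        return ("schema:int64", tuple(self._chunks[0]))


def export_chunks(column: SketchColumn) -> List[Tuple[object, object]]:
    """Export chunk by chunk so the single-chunk check never fails."""
    return [chunk.__arrow_c_array__() for chunk in column.get_chunks()]
```

Under these assumptions, a consumer that receives a multi-chunk column would iterate over `get_chunks()` and export each chunk separately rather than calling `__arrow_c_array__` on the whole column.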
Review comment: Is this not supported at all? Or if this is supported only at the dataframe level (since the same restriction isn't mentioned there), should this say why and/or refer to that support at dataframe level?
Review comment: Would be useful to add a section: