move additional_metadata member.
zeroshade committed Nov 5, 2024
1 parent 0e3574b commit 5aa29a8
Showing 2 changed files with 23 additions and 17 deletions.
19 changes: 9 additions & 10 deletions cpp/src/arrow/c/abi.h
@@ -319,6 +319,11 @@ struct ArrowAsyncProducer {
// on_error callback on the async stream handler.
void (*cancel)(struct ArrowAsyncProducer* self);

+// Any additional metadata tied to a specific stream of data. This must either be NULL
+// or a valid pointer to metadata which is encoded in the same way schema metadata
+// would be. Non-null metadata must be valid for the lifetime of this object.
+const char* additional_metadata;
+
// producer-specific opaque data.
void* private_data;
};
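A minimal sketch of how a producer might build the buffer that `additional_metadata` points to, assuming the `ArrowSchema.metadata` layout from the C Data Interface: a native-endian int32 pair count, then for each pair an int32 key length, the key bytes, an int32 value length, and the value bytes, with no NUL terminators. The helper name `encode_single_pair` is hypothetical and not part of `abi.h`.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

// Hypothetical helper (not part of abi.h): encode one key/value pair in the
// ArrowSchema.metadata layout. The returned buffer is heap-allocated; a
// producer would keep it alive for as long as its ArrowAsyncProducer exists
// and free it from its own cleanup path.
static char* encode_single_pair(const char* key, const char* value,
                                size_t* out_size) {
  assert(key != NULL && value != NULL);
  int32_t n = 1;  // number of key/value pairs
  int32_t klen = (int32_t)strlen(key);
  int32_t vlen = (int32_t)strlen(value);
  size_t size = 3 * sizeof(int32_t) + (size_t)klen + (size_t)vlen;

  char* buf = (char*)malloc(size);
  if (buf == NULL) return NULL;

  // memcpy is used for the integers so that alignment does not matter.
  char* w = buf;
  memcpy(w, &n, sizeof(n));
  w += sizeof(n);
  memcpy(w, &klen, sizeof(klen));
  w += sizeof(klen);
  memcpy(w, key, (size_t)klen);
  w += klen;
  memcpy(w, &vlen, sizeof(vlen));
  w += sizeof(vlen);
  memcpy(w, value, (size_t)vlen);

  if (out_size != NULL) *out_size = size;
  return buf;
}
```

A producer could assign the result to `additional_metadata` before invoking `on_schema`, and free it only once the producer object itself is torn down, which matches the lifetime rule in the comment above.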
@@ -341,22 +346,16 @@ struct ArrowAsyncDeviceStreamHandler {
// function and thus the producer is responsible for cleaning it up when calling
// the release callback of this handler.
//
-// The addl_metadata argument can be null or can be used by a producer
-// to pass arbitrary extra information to the consumer beyond the metadata in the schema
-// itself (such as total number of rows, context info, or otherwise). The data should
-// be passed using the same encoding as the metadata within the ArrowSchema struct
-// itself (defined in the spec at
-// https://arrow.apache.org/docs/format/CDataInterface.html#c.ArrowSchema.metadata)
-//
-// If addl_metadata is non-null then it only needs to exist for the lifetime of this
-// call, a consumer who wants it to live after that must copy it to ensure lifetime.
+// If there is any additional metadata tied to this stream, it will be provided as
+// a non-null value for the `additional_metadata` field of the ArrowAsyncProducer
+// which will be valid at least until the release callback is called.
//
// Return value: 0 if successful, `errno`-compatible error otherwise
//
// A producer that receives a non-zero return here should stop producing and eventually
// call release instead.
int (*on_schema)(struct ArrowAsyncDeviceStreamHandler* self,
-struct ArrowSchema* stream_schema, const char* addl_metadata);
+struct ArrowSchema* stream_schema);

// Handler for receiving data. This is called when data is available providing an
// ArrowAsyncTask struct to signify it. The producer indicates the end of the stream
21 changes: 14 additions & 7 deletions docs/source/format/CDeviceDataInterface.rst
@@ -699,13 +699,14 @@ The C device async stream interface consists of three ``struct`` definitions:
void (*cancel)(struct ArrowAsyncProducer* self);
void (*release)(struct ArrowAsyncProducer* self);
+const char* additional_metadata;
void* private_data;
};
struct ArrowAsyncDeviceStreamHandler {
// consumer-specific handlers
int (*on_schema)(struct ArrowAsyncDeviceStreamHandler* self,
-struct ArrowSchema* stream_schema, const char* addl_metadata);
+struct ArrowSchema* stream_schema);
int (*on_next_task)(struct ArrowAsyncDeviceStreamHandler* self,
struct ArrowAsyncTask* task, const char* metadata);
void (*on_error)(struct ArrowAsyncDeviceStreamHandler* self,
@@ -735,17 +736,16 @@ The ArrowAsyncDeviceStreamHandler structure

The structure has the following fields:

-.. c:member:: int (*ArrowAsyncDeviceStreamHandler.on_schema)(struct ArrowAsyncDeviceStreamHandler*, struct ArrowSchema*, const char*)
+.. c:member:: int (*ArrowAsyncDeviceStreamHandler.on_schema)(struct ArrowAsyncDeviceStreamHandler*, struct ArrowSchema*)
*Mandatory.* Handler for receiving the schema of the stream. All incoming records should
match the provided schema. If successful, the function should return 0, otherwise
it should return an ``errno``-compatible error code.

-The ``const char*`` parameter exists for producers to provide any extra contextual information
-they want, such as the total number of rows in the stream, statistics, or otherwise. This is
-encoded in the same format as :c:member:`ArrowSchema.metadata`. If not ``NULL``,
-the lifetime is only the scope of the call to this function. A consumer who wants to maintain
-the additional metadata beyond the lifetime of this call *MUST* copy the value themselves.
+If there is any extra contextual information that the producer wants to provide, it can set
+:c:member:`ArrowAsyncProducer.additional_metadata` to a non-NULL value. This is encoded in the
+same format as :c:member:`ArrowSchema.metadata`. The lifetime of this metadata, if not ``NULL``,
+should be tied to the lifetime of the ``ArrowAsyncProducer`` object.

Unless the ``on_error`` handler is called, this will always get called exactly once and will be
the first method called on this object. As such the producer *MUST* populate the ``ArrowAsyncProducer``
@@ -909,6 +909,13 @@ This producer-provided and managed object has the following fields:
Any error encountered during handling a call to cancel must be reported via the ``on_error``
callback on the async stream handler.

+.. c:member:: const char* ArrowAsyncProducer.additional_metadata
+*Optional.* An additional metadata string to provide any extra context to the consumer. This *MUST*
+either be ``NULL`` or a valid string that is encoded in the same way as :c:member:`ArrowSchema.metadata`.
+
+If not ``NULL`` it *MUST* be valid for at least the lifetime of this object.
+
.. c:member:: void* ArrowAsyncProducer.private_data
*Optional.* An opaque pointer to producer-provided specific data.
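
On the consumer side, reading `additional_metadata` might look like the sketch below. It assumes the same layout as `ArrowSchema.metadata` (native-endian int32 pair count, then length-prefixed key and value bytes without NUL terminators). The function names `read_i32` and `find_metadata_value` are hypothetical, and the code trusts the producer's buffer since the encoding carries no total length to validate against.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

// Read a native-endian int32 from a possibly unaligned position and
// advance the cursor past it.
static int32_t read_i32(const char** p) {
  int32_t v;
  memcpy(&v, *p, sizeof(v));
  *p += sizeof(v);
  return v;
}

// Hypothetical helper: look up `key` in a metadata buffer such as
// ArrowAsyncProducer.additional_metadata. Returns a pointer to the value
// bytes (NOT NUL-terminated) and stores their length in *out_len, or
// returns NULL if the metadata is NULL or the key is absent.
static const char* find_metadata_value(const char* metadata, const char* key,
                                       int32_t* out_len) {
  assert(key != NULL && out_len != NULL);
  if (metadata == NULL) return NULL;  // the field is optional

  const char* p = metadata;
  int32_t n = read_i32(&p);
  size_t want = strlen(key);
  for (int32_t i = 0; i < n; i++) {
    int32_t klen = read_i32(&p);
    const char* k = p;
    p += klen;
    int32_t vlen = read_i32(&p);
    const char* v = p;
    p += vlen;
    if ((size_t)klen == want && memcmp(k, key, want) == 0) {
      *out_len = vlen;
      return v;
    }
  }
  return NULL;
}
```

Because the returned pointer aliases the producer's buffer, it stays usable only while the producer object is alive; a consumer that needs the value afterwards would copy the bytes out.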