Codecov Report
@@ Coverage Diff @@
## main #878 +/- ##
==========================================
- Coverage 71.68% 71.64% -0.04%
==========================================
Files 335 337 +2
Lines 18205 18452 +247
==========================================
+ Hits 13050 13220 +170
- Misses 5155 5232 +77
@jorgecarleitao along the same lines as the Parquet integration, the Potentially something to the effect of:
// Wrapper type
pub struct Projection {
    pub chunk: Chunk<Arc<dyn Array>>,
    pub fields: Option<Vec<IpcField>>,
}
impl From<Chunk<Arc<dyn Array>>> for Projection {
    fn from(chunk: Chunk<Arc<dyn Array>>) -> Self {
        Self { chunk, fields: None }
    }
}
// Usage
let mut sink = FileSink::new(..);
let chunks: Vec<Chunk<Arc<dyn Array>>> = vec![];
for chunk in chunks {
    sink.feed(chunk.into()).await?;
}
@jorgecarleitao I've added support for passing Per the example above, I've added a compatibility struct which allows calling the sinks as either
Looks great! Thank you so much, @dexterduck - learned a lot in this PR!
/// An array [`Chunk`] with optional accompanying IPC fields.
#[derive(Debug, Clone, PartialEq)]
pub struct Record<'a> {
    columns: Cow<'a, Chunk<Arc<dyn Array>>>,
Curious: what is the purpose of Cow?
The idea is to allow passing the arguments either by value or by reference. Obviously, better to avoid unnecessary clones if possible, but also wanted to avoid requiring a reference since then if you want to pass owned values you have to use the somewhat non-ergonomic sink.feed((&chunk).into()) or sink.feed((&chunk, &fields[..]).into()). In contrast, with the chosen implementation you can use sink.feed(chunk.into()) or sink.feed((chunk, fields).into()) in all cases.
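To make the by-value / by-reference point concrete, here is a minimal sketch of the Cow approach using only the standard library. `Chunk` and `Field` below are simplified stand-ins for `Chunk<Arc<dyn Array>>` and `IpcField`, and `feed`, `is_borrowed`, and the `fields` member are hypothetical names for illustration; this is not the code from the PR.

```rust
use std::borrow::Cow;

// Hypothetical stand-in for arrow2's `Chunk<Arc<dyn Array>>`.
#[derive(Debug, Clone, PartialEq)]
pub struct Chunk(pub Vec<i32>);

// Hypothetical stand-in for `IpcField`.
#[derive(Debug, Clone, PartialEq)]
pub struct Field(pub String);

/// A chunk with optional fields, held either by value or by reference.
#[derive(Debug, Clone, PartialEq)]
pub struct Record<'a> {
    columns: Cow<'a, Chunk>,
    fields: Option<Cow<'a, [Field]>>,
}

impl<'a> Record<'a> {
    pub fn columns(&self) -> &Chunk {
        &self.columns
    }
    pub fn fields(&self) -> Option<&[Field]> {
        self.fields.as_deref()
    }
    /// True when the record borrows its chunk instead of owning it.
    pub fn is_borrowed(&self) -> bool {
        matches!(self.columns, Cow::Borrowed(_))
    }
}

// Owned inputs are moved into the record without cloning.
impl<'a> From<Chunk> for Record<'a> {
    fn from(chunk: Chunk) -> Self {
        Record { columns: Cow::Owned(chunk), fields: None }
    }
}

impl<'a> From<(Chunk, Vec<Field>)> for Record<'a> {
    fn from((chunk, fields): (Chunk, Vec<Field>)) -> Self {
        Record { columns: Cow::Owned(chunk), fields: Some(Cow::Owned(fields)) }
    }
}

// Borrowed inputs are stored as references, also without cloning.
impl<'a> From<&'a Chunk> for Record<'a> {
    fn from(chunk: &'a Chunk) -> Self {
        Record { columns: Cow::Borrowed(chunk), fields: None }
    }
}

impl<'a> From<(&'a Chunk, &'a [Field])> for Record<'a> {
    fn from((chunk, fields): (&'a Chunk, &'a [Field])) -> Self {
        Record { columns: Cow::Borrowed(chunk), fields: Some(Cow::Borrowed(fields)) }
    }
}

// Hypothetical stand-in for `sink.feed(..)`: returns the number of values seen.
pub fn feed(record: Record<'_>) -> usize {
    record.columns().0.len()
}

fn main() {
    let chunk = Chunk(vec![1, 2, 3]);
    let fields = vec![Field("a".to_string())];

    // By reference: the caller keeps ownership of `chunk` and `fields`.
    feed((&chunk).into());
    feed((&chunk, &fields[..]).into());

    // By value: the same call shape, with ownership handed over.
    feed((chunk, fields).into());
}
```

Because every input shape converts into the same `Record<'a>` type, the sink only needs one implementation, and the caller decides whether to hand over ownership or lend a reference.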
I originally used a generic implementation that parameterized Record to accept a type implementing Borrow<_>, but it turns out that having a single type that implements Sink for multiple other types (e.g. FileSink implementing both Sink<Record<Chunk<_>>> and Sink<Record<&Chunk<_>>>) doesn't work because the compiler can't infer the correct generic type when you call sink.flush() or sink.close().
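The inference problem can be reproduced without arrow2 or futures. The sketch below uses a simplified, hypothetical `Sink<Item>` trait (synchronous, unlike `futures::Sink`) and a toy `FileSink`; the names `buffered`, `written`, and `flush_all` are made up for illustration. The point is only that `flush` does not mention `Item`, so a plain method call gives the compiler nothing to choose between the two implementations.

```rust
// Simplified stand-in for `futures::Sink<Item>`: `flush` does not mention `Item`.
trait Sink<Item> {
    fn feed(&mut self, item: Item);
    fn flush(&mut self) -> usize;
}

// Toy sink that buffers items and counts how many were flushed.
#[derive(Default)]
struct FileSink {
    buffered: Vec<String>,
    written: usize,
}

// Shared flush logic used by both implementations.
fn flush_all(sink: &mut FileSink) -> usize {
    let n = sink.buffered.len();
    sink.written += n;
    sink.buffered.clear();
    n
}

// Implementation for owned items.
impl Sink<String> for FileSink {
    fn feed(&mut self, item: String) {
        self.buffered.push(item);
    }
    fn flush(&mut self) -> usize {
        flush_all(self)
    }
}

// Implementation for borrowed items.
impl<'a> Sink<&'a str> for FileSink {
    fn feed(&mut self, item: &'a str) {
        self.buffered.push(item.to_owned());
    }
    fn flush(&mut self) -> usize {
        flush_all(self)
    }
}

fn main() {
    let mut sink = FileSink::default();

    // `feed` is fine: the argument type selects the implementation.
    sink.feed(String::from("owned"));
    sink.feed("borrowed");

    // This line would not compile, because nothing tells the compiler
    // which `Item` is meant:
    // sink.flush();

    // Naming the implementation explicitly works, but is noisy at every call site.
    let n = <FileSink as Sink<String>>::flush(&mut sink);
    println!("flushed {n} items, {} written in total", sink.written);
}
```

A single concrete item type, such as the Cow-based Record, sidesteps this: with only one `Sink` implementation per sink type, `flush` and `close` resolve without annotations.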
Thank you for the explanation. Cool tip!
IPC component of this PR: #876
Adds the following new types:
io::ipc::read::file_async::FileStream - implements futures::Stream for IPC files.
io::ipc::write::file_async::FileSink - implements futures::Sink for IPC files.
io::ipc::write::file_async::StreamSink - implements futures::Sink for IPC streams.