Primitive Iterator API #689
Conversation
We need to get rid of the per-batch vector allocation. Otherwise, there are some bounds that are redundant.
match self.dtype() {
    DType::Primitive(PType::I64, _) => {
        let accessor = Arc::new(self.clone());
        Some(accessor)
    }
    _ => None,
}
this repeated unwrapping can probably be shoved into a macro, but doesn't have to be
Introduced the primitive_accessor_ref macro.
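As a rough illustration (not the actual definition from the PR), a macro along these lines could collapse the repeated dtype match into a single invocation per PType:

```rust
// Illustrative sketch only; the real primitive_accessor_ref macro in the PR
// may differ. It wraps the repeated dtype-match/clone pattern shown above,
// using vortex's DType/PType and std::sync::Arc.
macro_rules! primitive_accessor_ref {
    ($self:expr, $ptype:ident) => {
        match $self.dtype() {
            DType::Primitive(PType::$ptype, _) => Some(Arc::new($self.clone())),
            _ => None,
        }
    };
}

// Hypothetical usage, replacing the hand-written match for each primitive type:
// let accessor = primitive_accessor_ref!(self, I64);
```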
@@ -25,6 +25,7 @@ vortex-scalar = { workspace = true }

[dev-dependencies]
divan = { workspace = true }
arrow = { workspace = true }
you can probably disable the default-features for this import since you don't use ipc/json/csv encoding. probably saves a bit of compile time 🤷
clippy wants me to disable it at the top-level and then have every crate in the workspace pull its own members, which is probably a good idea
@@ -0,0 +1,100 @@
use criterion::{criterion_group, criterion_main, BatchSize, Criterion};
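For context, this is roughly the shape of a Criterion benchmark like the ones producing the numbers below; the data setup and the actual benchmark bodies in the PR will differ:

```rust
use criterion::{black_box, criterion_group, criterion_main, Criterion};

// Illustrative shape only: iterate the same data with and without Option
// wrapping. The PR's real benchmarks also cover the vortex and arrow iterators.
fn bench_iter(c: &mut Criterion) {
    let data: Vec<i64> = (0..100_000).collect();

    // Plain slice iteration, no Option involved.
    c.bench_function("std_iter_no_option", |b| {
        b.iter(|| black_box(&data).iter().copied().sum::<i64>())
    });

    // The same values wrapped in Option, mimicking a nullable iterator.
    c.bench_function("std_iter", |b| {
        b.iter(|| black_box(&data).iter().copied().map(Some).flatten().sum::<i64>())
    });
}

criterion_group!(benches, bench_iter);
criterion_main!(benches);
```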
These are the results on my laptop:
std_iter_no_option time: [46.625 µs 47.232 µs 47.982 µs]
Found 9 outliers among 100 measurements (9.00%)
5 (5.00%) high mild
4 (4.00%) high severe
std_iter time: [394.24 µs 398.92 µs 403.92 µs]
Found 11 outliers among 100 measurements (11.00%)
2 (2.00%) low severe
1 (1.00%) low mild
2 (2.00%) high mild
6 (6.00%) high severe
vortex_iter time: [746.95 µs 750.73 µs 754.90 µs]
vortex_iter_flat time: [2.6330 ms 2.6477 ms 2.6635 ms]
Found 8 outliers among 100 measurements (8.00%)
7 (7.00%) high mild
1 (1.00%) high severe
arrow_iter time: [44.055 µs 44.165 µs 44.300 µs]
Found 5 outliers among 100 measurements (5.00%)
5 (5.00%) high mild
I'm surprised the Arrow iter is so much faster (and faster than std::iter). My guess is the assertions and alignment checking in our PrimitiveArray make up the difference?
Arrow doesn't actually iterate over Option, just over T, and in this case there are no nulls, so Arrow is the same as iterating a Vec. To get to that level you need monomorphisation of every function call.
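A small sketch of that distinction, assuming the arrow crate's Int64Array: the nullable iterator yields Option<i64>, while values() exposes the raw buffer, so with no nulls the latter is just slice iteration:

```rust
use arrow::array::Int64Array;

// Illustrative: the two iteration paths being compared. With no nulls, the
// values() path is just iterating &[i64], like a Vec<i64>.
fn sum_both(array: &Int64Array) -> (i64, i64) {
    // Raw buffer: plain slice iteration, no Option involved.
    let raw: i64 = array.values().iter().sum();
    // Nullable iterator: each element is an Option<i64>.
    let nullable: i64 = array.iter().flatten().sum();
    (raw, nullable)
}
```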
What @robert3005 said. I spent a lot of time trying to understand everything they do, and while I can't say I have full clarity, Arrow's overall design does make it easier to have fast iterators: both having fully typed arrays and not having to support compression.
I am hopeful that eventually we'll find a more performant way of iterating arrays, but just having a much smaller memory footprint should be useful IMO.
Looks good!
This PR includes a new batched iterator API that yields a Vec<T> and validity data for batches of items from arrays. It can dynamically dispatch over the type of the underlying arrays, so recursive compression should hold. The PR includes implementations for primitive arrays, as well as ALP as a test case for more complex encodings, and constant arrays, since they are fairly trivial.
I also added a bunch of benchmarks to the various pieces it touches, to get some initial performance numbers and test basic correctness.
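For a concrete picture, here is a hypothetical shape such a batched API could take; the names and signatures are illustrative, not the ones actually introduced by the PR:

```rust
// Hypothetical sketch: a batched iterator that yields owned values plus
// per-element validity for each batch. Names are illustrative only.
pub struct Batch<T> {
    pub values: Vec<T>,
    pub validity: Vec<bool>,
}

pub trait BatchedIterator {
    type Item;

    /// Returns the next batch of values and their validity, or None when
    /// the underlying array is exhausted.
    fn next_batch(&mut self) -> Option<Batch<Self::Item>>;
}
```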