append_nulls and append_trusted_len_iter for PrimitiveBuilder #725

Closed
bjchambers opened this issue Aug 28, 2021 · 1 comment · Fixed by #728
Labels
enhancement Any new improvement worthy of a entry in the changelog

Comments

@bjchambers
Contributor

Is your feature request related to a problem or challenge? Please describe what you are trying to do.

I have cases where I've identified a range of numbers to add to a builder, or a number of nulls to add. My only options with the existing PrimitiveBuilder interfaces are to either (a) add each value/null individually or (b) create a vector containing the numbers/nulls I wish to add, and then use append_values or append_slice.

Both of these do more work than is strictly necessary: in one case, capacity checks after every value, and in the other, materializing an intermediate vector.

Describe the solution you'd like

  • pub fn append_nulls(&mut self, num_nulls: usize) -> Result<(), ArrowError>. Same as calling append_null num_nulls times.
  • pub fn append_trusted_len_iter_values(&mut self, iter: impl IntoIterator<Item = T::Native>) -> Result<(), ArrowError>. Behaves similarly to collecting the items into a vector and then calling append_slice(&vector).
  • Could also add a version for Option<T::Native>.
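
To make the intended behavior concrete, here is a minimal sketch using only the standard library. ToyBuilder, its fields, and append_iter_values are hypothetical names made up for illustration; this is not the arrow-rs implementation and it does not use arrow's buffers. It also assumes an ExactSizeIterator bound as a stable stand-in for a "trusted length" iterator, so that capacity can be reserved once up front.

```rust
/// Hypothetical stand-in for a primitive builder: one values buffer plus one
/// validity flag per slot. This is NOT the arrow-rs PrimitiveBuilder.
#[derive(Debug, Default)]
pub struct ToyBuilder {
    values: Vec<i32>,
    validity: Vec<bool>,
}

impl ToyBuilder {
    pub fn new() -> Self {
        Self::default()
    }

    /// Per-slot call: each invocation may trigger its own capacity check.
    pub fn append_null(&mut self) {
        self.values.push(0);
        self.validity.push(false);
    }

    /// Bulk call: grow both buffers once to add `num_nulls` null slots.
    pub fn append_nulls(&mut self, num_nulls: usize) {
        let new_len = self.values.len() + num_nulls;
        self.values.resize(new_len, 0);
        self.validity.resize(new_len, false);
    }

    /// Iterator call: reserve once from the reported length, then write the
    /// values directly, without collecting into an intermediate Vec.
    /// (Assumption: ExactSizeIterator stands in for a trusted-length iterator.)
    pub fn append_iter_values<I>(&mut self, iter: I)
    where
        I: IntoIterator<Item = i32>,
        I::IntoIter: ExactSizeIterator,
    {
        let iter = iter.into_iter();
        self.values.reserve(iter.len());
        self.values.extend(iter);
        // Every newly written slot is valid (non-null).
        self.validity.resize(self.values.len(), true);
    }

    pub fn len(&self) -> usize {
        self.values.len()
    }

    /// Returns `None` for null slots, `Some(value)` otherwise.
    pub fn get(&self, i: usize) -> Option<i32> {
        if self.validity[i] {
            Some(self.values[i])
        } else {
            None
        }
    }
}
```

With this sketch, a caller could write builder.append_iter_values(10..13) followed by builder.append_nulls(2) to add three values and two nulls with one buffer growth per call, rather than five separate appends.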

Describe alternatives you've considered

As noted in the description: multiple calls to append_value/append_null (not ideal due to checking the capacity after each call) or creating a vector and then calling append_values or append_slice (not ideal due to the need to collect/materialize values first).

@bjchambers bjchambers added the enhancement Any new improvement worthy of a entry in the changelog label Aug 28, 2021
@bjchambers
Contributor Author

bjchambers commented Aug 28, 2021

I have a prototype of these methods that improved the performance of a 2-way merge kernel by 20-30% in some 100k-element microbenchmarks.

https://github.com/bjchambers/arrow-rs/tree/725-append_nulls_iter
