object_store's alignment of get_ranges; ability to get BytesMut #6647

Closed
feefladder opened this issue Oct 29, 2024 · 2 comments
Labels
question Further information is requested

Comments

@feefladder

feefladder commented Oct 29, 2024

Which part is this question about
Alignment of returned Bytes from object_store.get_range() or object_store.get_ranges

Describe your question

Are there any alignment guarantees for get_range, or would that be a feature that could be added?

Additional context

I want to use object_store as a possible io for a crate I'm working on. Now, ideally, I'd be directly using the provided Bytes from the get_range(s) function. That needs two things:

  1. Alignment of the result is >= alignment of the underlying type (8-byte at most)
  2. We can convert the result to a BytesMut (to fix endianness on the Bytes directly), i.e. does object_store hold any other references to the byte struct?
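
To make the two requirements concrete, here is a minimal std-only sketch. It works on plain byte slices (a `Bytes` dereferences to `[u8]`, so the alignment check applies to it unchanged). The helper names `is_aligned_for` and `be_u64_to_native_in_place` are made up for illustration, and the sketch assumes the payload is a run of big-endian `u64` words. It does not show how to obtain mutable access to a `Bytes`; that is exactly the open question above.

```rust
use std::mem::align_of;

/// Hypothetical helper: true if `buf` starts at an address that is
/// suitably aligned for values of type `T`.
fn is_aligned_for<T>(buf: &[u8]) -> bool {
    (buf.as_ptr() as usize) % align_of::<T>() == 0
}

/// Hypothetical helper: treats `buf` as big-endian u64 words and rewrites
/// each word in place in native byte order. Trailing bytes that do not
/// fill a whole 8-byte word are left untouched.
fn be_u64_to_native_in_place(buf: &mut [u8]) {
    for chunk in buf.chunks_exact_mut(8) {
        let mut raw = [0u8; 8];
        raw.copy_from_slice(chunk);
        let word = u64::from_be_bytes(raw);
        chunk.copy_from_slice(&word.to_ne_bytes());
    }
}
```

If the check fails, or mutable access is not available, the fallback is to copy into a buffer you own and fix it up there.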

This seems to be the relevant code here. Bytes tests for misalignments.

I know the upstream issues (tokio-rs/bytes/#437 tokio-rs/bytes/#343) and that zero-copy works for arrow-rs, which makes me think that alignment is at least somewhat worked out?

To me, a solution I'm thinking of is to get a large allocation, check the alignment of that and then hand out aligned chunks of that allocation. Since that would be upstream of the object_store crate (it produces the Bytes for the result), I'm asking here if there are any alignment guarantees of the get_range(s) function. If so, I'm happy and would only need to implement alignment-ness for other types.
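
The over-allocate-and-offset idea can be sketched with std only. `AlignedBuf` is a hypothetical type, not part of object_store or bytes, and it hands out a single aligned window rather than several chunks; splitting that window into pieces whose length is a multiple of the alignment would keep every piece aligned too.

```rust
/// Hypothetical helper: owns an over-allocated byte buffer and exposes a
/// window of `len` bytes whose start address is a multiple of `align`.
struct AlignedBuf {
    storage: Vec<u8>,
    offset: usize,
    len: usize,
}

impl AlignedBuf {
    fn new(len: usize, align: usize) -> Self {
        assert!(align.is_power_of_two(), "alignment must be a power of two");
        // Over-allocate by `align - 1` bytes so an aligned start always fits.
        let storage = vec![0u8; len + align - 1];
        let addr = storage.as_ptr() as usize;
        // Distance from the allocation start to the next aligned address.
        let offset = (align - addr % align) % align;
        Self { storage, offset, len }
    }

    /// The aligned window, read-only.
    fn as_slice(&self) -> &[u8] {
        &self.storage[self.offset..self.offset + self.len]
    }

    /// The aligned window, writable (e.g. as the target of a read).
    fn as_mut_slice(&mut self) -> &mut [u8] {
        &mut self.storage[self.offset..self.offset + self.len]
    }
}
```

The offset stays valid when the struct is moved, because moving a `Vec` does not move its heap allocation.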

@feefladder feefladder added the question Further information is requested label Oct 29, 2024
@tustvold
Contributor

tustvold commented Oct 29, 2024

> Are there any alignment guarantees for get_range, or would that be a feature that could be added?

We don't provide any guarantees; we simply return the Bytes returned by reqwest, and I don't know what alignment guarantees this provides, if any.

As for the arrow-rs zero-copy, that builds off the fact that we have code to realign buffers by copying them if necessary, so it is really only best-effort zero-copy, although it works well in practice.
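
The general shape of "borrow if aligned, copy otherwise" can be sketched with std only. This is not arrow-rs's actual code; `as_u64_words` is a made-up name, and the sketch assumes the bytes are native-endian `u64` words.

```rust
use std::borrow::Cow;

/// Hypothetical helper: views `bytes` as native-endian u64 words.
/// Borrows when the slice is already 8-byte aligned, otherwise copies.
/// Trailing bytes that do not fill a whole word are ignored.
fn as_u64_words(bytes: &[u8]) -> Cow<'_, [u64]> {
    // SAFETY: every bit pattern is a valid u64, so reinterpreting the
    // aligned middle part of the slice is sound.
    let (prefix, words, _suffix) = unsafe { bytes.align_to::<u64>() };
    if prefix.is_empty() {
        // Already aligned: zero-copy view.
        Cow::Borrowed(words)
    } else {
        // Misaligned: fall back to copying word by word.
        let copied = bytes
            .chunks_exact(8)
            .map(|chunk| {
                let mut raw = [0u8; 8];
                raw.copy_from_slice(chunk);
                u64::from_ne_bytes(raw)
            })
            .collect();
        Cow::Owned(copied)
    }
}
```

Callers get a `[u64]` either way and only pay for the copy when the input happens to be misaligned.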

As for converting to BytesMut, this may be possible, but again I don't know enough about how reqwest uses buffers internally.

In general, though, I'd encourage you to get something working first and then reach for zero-copy shenanigans based on empirical benchmarks. Ultimately, large contiguous memory copies are pretty fast, especially when compared to IO.

@feefladder
Author

Thanks for the quick response! My question is answered, so closing :)
