Support for memoryview and PickleBuffer #99
Comments
Can leverage the buffer protocol to make a zero-copy view into numpy and then pass that; would this be suitable? (same goes for Prefer to steer clear of implementing support for an arbitrary number of types, especially if those types already implement the buffer protocol.
>>> import cramjam
>>> import numpy as np
>>> data = memoryview(b'data')
>>> buf = np.frombuffer(data, dtype=np.uint8)
>>> cramjam.lz4.compress(buf)
cramjam.Buffer(len=23) |
In retrospect, can likely add a generic |
I seriously doubt the slowdown will be noticeable as long as the compression itself deals with inputs of 10 KiB or more |
I agree that the individual types should not be explicitly implemented (numpy shouldn't either!) |
Will the suggested numpy.frombuffer work for you for now? |
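The zero-copy claim behind this workaround rests on the buffer protocol: a view exposes the producer's memory instead of duplicating it. A minimal sketch of that property using only the standard library follows; the variable names are illustrative, and `memoryview` stands in here for the `numpy.frombuffer` wrapper suggested above.

```python
# Zero-copy views over the buffer protocol, standard library only.
source = bytearray(b"data")
view = memoryview(source)  # wraps the existing memory; no bytes are copied

# The view and the source share memory, so a write through the source
# is visible through the view.
source[0] = ord("D")
first_byte = view[0]

# The view also reports the element layout a consumer would see.
layout = (view.format, view.itemsize, view.nbytes)

view.release()  # drop the export so the bytearray can be resized again
```

Because only a view is created, the cost of wrapping is independent of the payload size, which is why the extra step should not matter next to the compression itself.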
Was interested and prototyped it this morning:
In [1]: import cramjam
In [2]: data = memoryview(b'data')
In [3]: cramjam.lz4.compress(data)
Out[3]: cramjam.Buffer(len=23)
In [4]: import array
In [5]: out = array.array('B', list(range(23)))
In [6]: out
Out[6]: array('B', [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22])
In [7]: cramjam.lz4.compress_into(data, out)
Out[7]: 23
In [8]: out
Out[8]: array('B', [4, 34, 77, 24, 68, 64, 94, 4, 0, 0, 128, 100, 97, 116, 97, 0, 0, 0, 0, 53, 138, 34, 33])
Guess a proper implementation would take a day or two. May take some time for me to get around to in my free time; hopefully the |
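Like this?
use std::slice;
use pyo3::buffer::PyBuffer;
// value: &PyAny
let buf: PyBuffer<u8> = value.extract().unwrap(); // expose possible error
// Safety: assumes the buffer is contiguous and is neither freed nor mutated while `rustbuf` is in use.
let rustbuf: &[u8] = unsafe {
    slice::from_raw_parts(buf.buf_ptr() as *const u8, buf.len_bytes())
}; |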
Roughly, basically how the prototype goes. Probably would need to wrap it similar to PyBytes/PyByteArray since all variants in Think whatever wrapper it is, |
Yes, the hack works. Didn't performance-test it but it should be negligible. |
Can try out |
It doesn't seem to work:
>>> cramjam.__version__
'2.7.0-rc2'
>>> cramjam.lz4.compress_block(bytearray(b"123"))
cramjam.Buffer<len=8>
>>> cramjam.lz4.compress_block(memoryview(b"123"))
cramjam.Buffer<len=8>
>>> cramjam.lz4.compress_block(numpy.ones(10))
TypeError: argument 'data': failed to extract enum BytesType ('Buffer | File | pybuffer')
- variant RustyBuffer (Buffer): TypeError: failed to extract field BytesType::RustyBuffer.0, caused by TypeError: 'ndarray' object cannot be converted to 'Buffer'
- variant RustyFile (File): TypeError: failed to extract field BytesType::RustyFile.0, caused by TypeError: 'ndarray' object cannot be converted to 'File'
- variant PyBuffer (pybuffer): TypeError: failed to extract field BytesType::PyBuffer.0, caused by BufferError: buffer contents are not compatible with u8
>>> cramjam.lz4.compress_block(memoryview(numpy.ones(10)))
TypeError: argument 'data': failed to extract enum BytesType ('Buffer | File | pybuffer')
- variant RustyBuffer (Buffer): TypeError: failed to extract field BytesType::RustyBuffer.0, caused by TypeError: 'memoryview' object cannot be converted to 'Buffer'
- variant RustyFile (File): TypeError: failed to extract field BytesType::RustyFile.0, caused by TypeError: 'memoryview' object cannot be converted to 'File'
- variant PyBuffer (pybuffer): TypeError: failed to extract field BytesType::PyBuffer.0, caused by BufferError: buffer contents are not compatible with u8 |
Needs to be bytes:
>>> np.ones(10).dtype
dtype('float64')
>>> cramjam.lz4.compress_block(np.ones(10, dtype=np.uint8)) # or np.ones(10).tobytes()
cramjam.Buffer<len=15> |
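The `BufferError` above comes down to the element format a buffer advertises, not to its type. As a rough illustration, the sketch below inspects that format with the standard library; `describe` is a hypothetical helper, and `array.array("d", ...)` stands in for `numpy.ones(10)` so that the example needs no third-party packages.

```python
import array


def describe(obj):
    """Return (format, itemsize, nbytes) for any buffer-protocol object."""
    with memoryview(obj) as view:
        return view.format, view.itemsize, view.nbytes


floats = array.array("d", [1.0] * 10)  # ten float64 values, like numpy.ones(10)
raw = bytes(10)                        # ten unsigned bytes

float_layout = describe(floats)  # format "d": 8-byte elements, not u8
byte_layout = describe(raw)      # format "B": what a u8 consumer expects
```

A consumer that requests a `u8` buffer accepts the second layout and rejects the first, even though both objects are contiguous blocks of memory.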
Or better yet |
Or
m = memoryview(a).cast("B")
cramjam.lz4.compress_block(m)
All other compression libraries don't have this caveat and are happy to ingest any PickleBuffer; could you fix it? (no rush) |
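For illustration, the cast can be wrapped so that callers hand over any contiguous buffer and always get a flat byte view back. This is only a sketch: `as_byte_view` is a hypothetical helper and not part of cramjam, `array.array("d", ...)` stands in for a float64 numpy array, and `zlib` is used as one example of a compressor that accepts a `PickleBuffer` directly.

```python
import array
import pickle
import zlib


def as_byte_view(obj):
    """Hypothetical helper: flat unsigned-byte memoryview of obj, no copy."""
    if isinstance(obj, pickle.PickleBuffer):
        # raw() yields a 1-D, C-contiguous memoryview with format "B".
        return obj.raw()
    # cast("B") reinterprets the same memory as bytes; it requires a
    # C-contiguous buffer and raises TypeError otherwise.
    return memoryview(obj).cast("B")


floats = array.array("d", [1.0] * 10)

via_cast = as_byte_view(floats)
via_pickle_buffer = as_byte_view(pickle.PickleBuffer(floats))

# zlib takes the PickleBuffer as-is, whatever the element format.
compressed = zlib.compress(pickle.PickleBuffer(floats))
restored = zlib.decompress(compressed)
```

Both paths expose the same 80 bytes without copying them, so the result can be passed to an API that only accepts `u8` buffers.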
I agree, this is not great. I'd like to get the bytes view of buffers directly, which can be done, but the current implementation of |
@crusaderky v2.7.0rc3 ought to work for you. |
Yep works great! 👍 |
As of 2.7.0rc1, cramjam seems to be incompatible with memoryview and PickleBuffer objects.
This is a blocker to the adoption in dask/distributed.