Improve performance when converting python bytes/bytearray to Vec<u8>
#2888
Comments
I understand this might be unsatisfying, but I think without stable specialization, the preferred course of action is manual downcasting to the relevant types like The
I agree that adding That said, I've hit the
I'm not sure I understand this part correctly, is that a suggestion for pyo3 itself or the user code? At the moment the best solution I found for the user code is using a dedicated wrapper type that implements `FromPyObject`:

    struct BytesWrapper(Vec<u8>);

    impl<'source> FromPyObject<'source> for BytesWrapper {
        fn extract(object: &'source PyAny) -> PyResult<Self> {
            Ok(BytesWrapper(match object.extract::<&PyByteArray>() {
                Ok(x) => x.to_vec(),
                Err(_) => object.extract::<&PyBytes>()?.as_bytes().to_vec(),
            }))
        }
    }

    impl From<BytesWrapper> for Vec<u8> {
        fn from(data: BytesWrapper) -> Self {
            data.0
        }
    }

Then it can be used as such:

    #[pyfunction]
    fn checksum_fixed(data: BytesWrapper) -> PyResult<u8> {
        let data: Vec<u8> = data.into();
        let mut result = 0;
        for x in data {
            result ^= x;
        }
        Ok(result)
    }

It's not entirely satisfactory though because the unwrapping can be quite annoying in some cases, say if the argument is something like `Option<Vec<BytesWrapper>>`:

    #[pyfunction]
    fn some_function(data: Option<Vec<BytesWrapper>>) -> PyResult<u8> {
        let data: Option<Vec<Vec<u8>>> = data.map(|x| x.into_iter().map(|y| y.into()).collect());
        ...
    }

Any suggestion on how to address this issue?
This was a suggestion for user code, less about wrapping the result and more about manually writing the "extract chains", e.g.

    #[derive(FromPyObject)]
    enum BytesWrapper<'py> {
        Bytes(&'py PyBytes),
        ByteArray(&'py PyByteArray),
    }

    impl From<BytesWrapper<'_>> for Vec<u8> {
        fn from(wrapper: BytesWrapper) -> Self {
            match wrapper {
                BytesWrapper::Bytes(bytes) => bytes.as_bytes().to_vec(),
                BytesWrapper::ByteArray(byte_array) => byte_array.to_vec(),
            }
        }
    }

but of course this does not help with getting rid of the wrapper type. It might enable zero-copy usage though, e.g.

    impl<'py> From<BytesWrapper<'py>> for Cow<'py, [u8]> {
        fn from(wrapper: BytesWrapper<'py>) -> Self {
            match wrapper {
                BytesWrapper::Bytes(bytes) => Cow::Borrowed(bytes.as_bytes()),
                BytesWrapper::ByteArray(byte_array) => Cow::Owned(byte_array.to_vec()),
            }
        }
    }

In the approach you outlined above, I think implementing

As written above, the proper solution here would be specialization so that we could provide specialized impls for
For now, I see two hacks which would be palatable to me personally:
Ah, this one of course does not work as the
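The zero-copy point made for the enum-based wrapper can be seen with standard-library types alone. In this sketch `Payload` is a hypothetical stand-in for the two-variant wrapper: the immutable variant is lent out as `Cow::Borrowed`, while the mutable one is copied into `Cow::Owned`, mirroring how the comment treats `bytes` and `bytearray`. Nothing here is PyO3 API.

```rust
use std::borrow::Cow;

/// Hypothetical stand-in for the `BytesWrapper` enum (no PyO3 involved).
enum Payload<'a> {
    /// Models immutable `bytes`: safe to borrow without copying.
    Immutable(&'a [u8]),
    /// Models mutable `bytearray`: copied, since its contents could change.
    Mutable(&'a [u8]),
}

impl<'a> From<Payload<'a>> for Cow<'a, [u8]> {
    fn from(payload: Payload<'a>) -> Self {
        match payload {
            // No allocation: the Cow only holds a reference.
            Payload::Immutable(bytes) => Cow::Borrowed(bytes),
            // One allocation and one bulk copy.
            Payload::Mutable(bytes) => Cow::Owned(bytes.to_vec()),
        }
    }
}
```

Callers that only read the data can treat both cases uniformly through `Deref<Target = [u8]>`, and only the mutable case pays for a copy.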
Looks nice, I'll do that :)

I think I'll implement an `UnwrapBytesWrapper` trait:

    pub trait UnwrapBytesWrapper {
        type ResultType;
        fn unwrap_bytes(self) -> Self::ResultType;
    }

    impl UnwrapBytesWrapper for BytesWrapper<'_> {
        type ResultType = Vec<u8>;
        fn unwrap_bytes(self) -> Self::ResultType {
            self.into()
        }
    }

    impl<T> UnwrapBytesWrapper for Option<T>
    where
        T: UnwrapBytesWrapper,
    {
        type ResultType = Option<T::ResultType>;
        fn unwrap_bytes(self) -> Self::ResultType {
            self.map(|x| x.unwrap_bytes())
        }
    }

    impl<T> UnwrapBytesWrapper for Vec<T>
    where
        T: UnwrapBytesWrapper,
    {
        type ResultType = Vec<T::ResultType>;
        fn unwrap_bytes(self) -> Self::ResultType {
            self.into_iter().map(|x| x.unwrap_bytes()).collect()
        }
    }

This way I can use the same conversion code everywhere:

    #[pyfunction]
    fn some_function(data: Option<Vec<BytesWrapper>>) -> PyResult<u8> {
        let data = data.unwrap_bytes();
        ...
    }

I'm looking forward to that, thanks for your help :)
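The recursive-trait idea can be checked in isolation. The sketch below repeats the trait from the comment above but swaps the PyO3-backed `BytesWrapper` for a hypothetical plain newtype, `FakeBytesWrapper`, so that it compiles with the standard library only. It is an illustration of the pattern, not the code used in any project.

```rust
/// Hypothetical stand-in for the PyO3-backed `BytesWrapper`.
struct FakeBytesWrapper(Vec<u8>);

trait UnwrapBytesWrapper {
    type ResultType;
    fn unwrap_bytes(self) -> Self::ResultType;
}

// Base case: a single wrapper unwraps to its bytes.
impl UnwrapBytesWrapper for FakeBytesWrapper {
    type ResultType = Vec<u8>;
    fn unwrap_bytes(self) -> Self::ResultType {
        self.0
    }
}

// Recursive case: unwrap whatever is inside an Option.
impl<T: UnwrapBytesWrapper> UnwrapBytesWrapper for Option<T> {
    type ResultType = Option<T::ResultType>;
    fn unwrap_bytes(self) -> Self::ResultType {
        self.map(|x| x.unwrap_bytes())
    }
}

// Recursive case: unwrap every element of a Vec.
impl<T: UnwrapBytesWrapper> UnwrapBytesWrapper for Vec<T> {
    type ResultType = Vec<T::ResultType>;
    fn unwrap_bytes(self) -> Self::ResultType {
        self.into_iter().map(|x| x.unwrap_bytes()).collect()
    }
}
```

Because the associated type is computed per impl, a value of type `Option<Vec<FakeBytesWrapper>>` unwraps to `Option<Vec<Vec<u8>>>` with a single `unwrap_bytes()` call, whatever the nesting depth.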
2899: RFC: Provide a special purpose FromPyObject impl for byte slices r=davidhewitt a=adamreichold

This enables efficiently and safely getting a byte slice from either bytes or byte arrays.

The main issue I see here is discoverability, i.e. should this be mentioned in the docs of `PyBytes` and `PyByteArray` or in the guide?

It is also not completely clear whether this really _fixes_ the issue.

Closes #2888

Co-authored-by: Adam Reichold
Hi all, I think the bot went a bit too quick on closing this issue. PR #2899 is an interesting feature to PyO3 but I don't think it addresses the specific performance issue I reported.

In my opinion, either

If it is, the python iteration over the individual bytes should really be avoided as it makes programs much slower than their pure python alternative, which defeats an important purpose of PyO3. If it is not, it should not appear in the docs like it does today:

pyo3/guide/src/conversions/tables.md
Line 17 in 0a48859

I also quickly read through the comments in #2899 about banning

I am happy to reopen this, as I agree the current situation around

Agreed that either we need to make this clear in the documentation (presumably recommending the new
Hi all, I noticed a performance issue when extracting a `PyBytes` or a `PyByteArray` object into a `Vec<u8>`.

This is an issue one can easily run into without realizing it. Here's a scenario, let's say we'd like to expose a simple checksum function:
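The snippet that followed here did not survive in this copy of the issue. As a minimal sketch, assuming it computed the same XOR checksum as `checksum_fixed` earlier in the thread, the core logic over a byte slice might look like the following. The name `checksum` is an assumption, and the `#[pyfunction]` attribute and `PyResult` return type that a PyO3 binding would need are left out so that the block compiles with the standard library alone.

```rust
/// XOR all bytes together into a one-byte checksum.
/// Hypothetical reconstruction, not the original snippet: in a PyO3
/// module this would carry `#[pyfunction]` and return `PyResult<u8>`.
fn checksum(data: &[u8]) -> u8 {
    // Start from 0 and fold every byte in with XOR.
    data.iter().fold(0, |acc, &x| acc ^ x)
}
```

Taking `&[u8]` means the function reads the caller's buffer directly instead of building a new vector first.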
See how it performs against the equivalent python implementation, processing 1MB a hundred times:
Looks really fast! However, it won't accept a bytearray as an argument:
So we update our implementation to take a `Vec<u8>` instead:

And now the results:
It performs roughly the same as Python, which makes sense if we look at the `FromPyObject` implementation for `Vec<T>`:

pyo3/src/types/sequence.rs
Lines 314 to 318 in bed4f9d

The `bytes`/`bytearray` object is iterated and each item (i.e. a Python integer) is separately extracted into a `u8`.
This could be fixed by specializing the extract logic in the case of a `Vec<u8>` and using specific methods such as `PyBytes::as_bytes().to_vec()` and `PyByteArray::to_vec()`. Here's a possible patch:

https://gist.github.com/vxgmichel/367e01e8504cb9c9e700a22525e8b68d
With this patch applied, the performance is now similar to what we had with the `&[u8]` slice: