Skip to content

Commit

Permalink
Merge #2899
Browse files Browse the repository at this point in the history
2899: RFC: Provide a special purpose FromPyObject impl for byte slices r=davidhewitt a=adamreichold

This enables efficiently and safely getting a byte slice from either bytes or byte arrays.

The main issue I see here is discoverability, i.e. should this be mention in the docs of `PyBytes` and `PyByteArray` or in the guide?

It is also not completely clear whether this really _fixes_ the issue.

Closes #2888 

Co-authored-by: Adam Reichold <[email protected]>
  • Loading branch information
bors[bot] and adamreichold authored Feb 22, 2023
2 parents 12ce8d2 + fb3450b commit e426b78
Show file tree
Hide file tree
Showing 3 changed files with 67 additions and 6 deletions.
5 changes: 3 additions & 2 deletions guide/src/conversions/tables.md
Original file line number Diff line number Diff line change
Expand Up @@ -14,7 +14,7 @@ The table below contains the Python type and the corresponding function argument
| ------------- |:-------------------------------:|:--------------------:|
| `object` | - | `&PyAny` |
| `str` | `String`, `Cow<str>`, `&str`, `OsString`, `PathBuf` | `&PyUnicode` |
| `bytes` | `Vec<u8>`, `&[u8]` | `&PyBytes` |
| `bytes` | `Vec<u8>`, `&[u8]`, `Cow<[u8]>` | `&PyBytes` |
| `bool` | `bool` | `&PyBool` |
| `int` | Any integer type (`i32`, `u32`, `usize`, etc) | `&PyLong` |
| `float` | `f32`, `f64` | `&PyFloat` |
Expand All @@ -24,7 +24,7 @@ The table below contains the Python type and the corresponding function argument
| `tuple[T, U]` | `(T, U)`, `Vec<T>` | `&PyTuple` |
| `set[T]` | `HashSet<T>`, `BTreeSet<T>`, `hashbrown::HashSet<T>`[^2] | `&PySet` |
| `frozenset[T]` | `HashSet<T>`, `BTreeSet<T>`, `hashbrown::HashSet<T>`[^2] | `&PyFrozenSet` |
| `bytearray` | `Vec<u8>` | `&PyByteArray` |
| `bytearray` | `Vec<u8>`, `Cow<[u8]>` | `&PyByteArray` |
| `slice` | - | `&PySlice` |
| `type` | - | `&PyType` |
| `module` | - | `&PyModule` |
Expand Down Expand Up @@ -84,6 +84,7 @@ Finally, the following Rust types are also able to convert to Python as return v
| `Option<T>` | `Optional[T]` |
| `(T, U)` | `Tuple[T, U]` |
| `Vec<T>` | `List[T]` |
| `Cow<[u8]>` | `bytes` |
| `HashMap<K, V>` | `Dict[K, V]` |
| `BTreeMap<K, V>` | `Dict[K, V]` |
| `HashSet<T>` | `Set[T]` |
Expand Down
1 change: 1 addition & 0 deletions newsfragments/2899.added.md
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
Add special-purpose impl of `FromPyObject` for `Cow<[u8]>` to efficiently handle both `bytes` and `bytearray` objects
67 changes: 63 additions & 4 deletions src/types/bytes.rs
Original file line number Diff line number Diff line change
@@ -1,9 +1,12 @@
use crate::{ffi, AsPyPointer, Py, PyAny, PyResult, Python};
use crate::{ffi, AsPyPointer, FromPyObject, IntoPy, Py, PyAny, PyResult, Python, ToPyObject};
use std::borrow::Cow;
use std::ops::Index;
use std::os::raw::c_char;
use std::slice::SliceIndex;
use std::str;

use super::bytearray::PyByteArray;

/// Represents a Python `bytes` object.
///
/// This type is immutable.
Expand Down Expand Up @@ -121,11 +124,39 @@ impl<I: SliceIndex<[u8]>> Index<I> for PyBytes {
}
}

/// Special-purpose trait impl to efficiently handle both `bytes` and `bytearray`
///
/// If the source object is a `bytes` object, the `Cow` will be borrowed and
/// pointing into the source object, and no copying or heap allocations will happen.
/// If it is a `bytearray`, its contents will be copied to an owned `Cow`.
impl<'source> FromPyObject<'source> for Cow<'source, [u8]> {
fn extract(ob: &'source PyAny) -> PyResult<Self> {
if let Ok(bytes) = ob.downcast::<PyBytes>() {
return Ok(Cow::Borrowed(bytes.as_bytes()));
}

let byte_array = ob.downcast::<PyByteArray>()?;
Ok(Cow::Owned(byte_array.to_vec()))
}
}

impl ToPyObject for Cow<'_, [u8]> {
fn to_object(&self, py: Python<'_>) -> Py<PyAny> {
PyBytes::new(py, self.as_ref()).into()
}
}

impl IntoPy<Py<PyAny>> for Cow<'_, [u8]> {
fn into_py(self, py: Python<'_>) -> Py<PyAny> {
self.to_object(py)
}
}

#[cfg(test)]
mod tests {
use super::PyBytes;
use crate::FromPyObject;
use crate::Python;
use super::*;

use crate::types::IntoPyDict;

#[test]
fn test_bytes_index() {
Expand Down Expand Up @@ -172,4 +203,32 @@ mod tests {
.is_instance_of::<PyValueError>(py));
});
}

#[test]
fn test_cow_impl() {
Python::with_gil(|py| {
let bytes = py.eval(r#"b"foobar""#, None, None).unwrap();
let cow = bytes.extract::<Cow<'_, [u8]>>().unwrap();
assert_eq!(cow, Cow::<[u8]>::Borrowed(b"foobar"));

let byte_array = py.eval(r#"bytearray(b"foobar")"#, None, None).unwrap();
let cow = byte_array.extract::<Cow<'_, [u8]>>().unwrap();
assert_eq!(cow, Cow::<[u8]>::Owned(b"foobar".to_vec()));

let something_else_entirely = py.eval("42", None, None).unwrap();
something_else_entirely
.extract::<Cow<'_, [u8]>>()
.unwrap_err();

let cow = Cow::<[u8]>::Borrowed(b"foobar").into_py(py);
let locals = [("cow", cow)].into_py_dict(py);
py.run("assert isinstance(cow, bytes)", Some(locals), None)
.unwrap();

let cow = Cow::<[u8]>::Owned(b"foobar".to_vec()).into_py(py);
let locals = [("cow", cow)].into_py_dict(py);
py.run("assert isinstance(cow, bytes)", Some(locals), None)
.unwrap();
});
}
}

0 comments on commit e426b78

Please sign in to comment.