Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG] Keep pybind11 type casters alive recursively when needed #2527

Open
kenoslund-google opened this issue Sep 25, 2020 · 3 comments
Open

Comments

@kenoslund-google
Copy link

Issue description

When pybind11 casts from python->C++ for a function call, the casters for each argument are kept alive for the duration of the function call. This is important when passing a pointer or reference to memory owned by the caster (rather than owned by python). std::string is a common example of this- functions which take a std::string* or std::string& as one of their arguments (const or non-const) would fail without this behavior because the cast string would be freed before the function begins executing.

However, this behavior does not work recursively. For example, the list_caster constructs a caster for each element, moves the result out of it, and then discards the caster:

for (auto it : s) {
value_conv conv;
if (!conv.load(it, convert))
return false;
value.push_back(cast_op<Value &&>(std::move(conv)));
}

Thus, if you attempt to bind a function which takes a std::vector<std::string*> as one of its arguments, it will fail (and probably corrupt memory) because the std::string is owned by the element caster, which is destroyed before the function call begins.

This is particularly a problem when writing a caster for Spans (https://abseil.io/tips/93). Since the span does not confer ownership, the caster must retain ownership of a vector for the span to reference. This breaks with nested spans (Span<const Span<const T>>) exactly the same way std::vector<std::string*> breaks.

However, just keeping the casters alive recursively would be unnecessary and inefficient in many cases. For efficiency, I think the element casters should only be kept alive recursively if both of the following are true:

  1. The element caster owns the C++ representation.
  2. The element can't be moved out of the caster because moving the C++ type is not possible or does not also transfer ownership, such as with raw pointers.

The second condition is easy enough check- if the element type is a pointer, l-value reference, or the caster does not support the movable_cast_op_type, then it is satisfied. This can be checked by adding a r-value reference to the element type and feeding it through the element caster's cast_op_type:

using ElementCaster = make_caster<ElementType>;
static constexpr bool kCondition2 =
  !std::is_rvalue_reference_v<typename ElementCaster::template cast_op_type<
     typename std::add_rvalue_reference_t<ElementType>>>;

However, there does not appear to be a way to check the the first condition; type_casters would probably have to declare it, which would require an addition to the type_caster API.

Reproducible example code

C++

    // `std::vector<std::string*> does not work because the string is owned by
    // the element caster, which is destroyed before the function is called.
    m.def("load_vector_str_ptr", [](const std::vector<std::string*>& v) {
        // print more reliably triggers a memory error, making the use-after-free more obvious.
        py::print(*v.at(0), " && ", *v.at(1));
        return *v.at(0) == "test1" && *v.at(1) == "test2";
    });

python

assert m.load_vector_str_ptr(["test1", "test2"])

I'll upload a PR as soon as I figure out how...

@kenoslund-google
Copy link
Author

Pull request added: #2532

@YannickJadoul
Copy link
Collaborator

@kenoslund-google I think this might be the same problem as #2245? I have worked on this in #2303, but I couldn't fully figure out how to fit it into the existing casters-framework, and then forgot about the PR. Any ideas how to escape the problems mentioned there are welcome, though.

@EricCousineau-TRI
Copy link
Collaborator

However, there does not appear to be a way to check the the first condition [...]

I think this is already possible to check by seeing if the type requires conversion:
https://pybind11.readthedocs.io/en/stable/advanced/cast/index.html

For an example:

std::is_base_of<detail::type_caster_generic, detail::make_caster<T>>::value,

It may not capture all other cases; however, at that point I think it could be defined "extrinsically", by making it into py::detail::is_generic_type and have it be specializable.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants