Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

C++ ownership without shared_ptr #1150

Closed
neutralid opened this issue Oct 19, 2017 · 10 comments
Closed

C++ ownership without shared_ptr #1150

neutralid opened this issue Oct 19, 2017 · 10 comments

Comments

@neutralid
Copy link

neutralid commented Oct 19, 2017

I have a simple class that expects a pointer to another class, both of which are bound and instantiated in Python.

class a {
public:
a() {printf("construct!\n");}
~a() {printf("destruct!\n");}
}

class b {
public:
b() {}
~b() {}
a* ptrToA;
}
// (along with some very basic binding code)
py::class_<a>(m, "a")
.def(py::init<>());
py::class_<b>(m, "b")
.def(py::init<>())
.def_readwrite("ptrToA", &b::ptrToA);

Python:

instanceOfB=b()
instanceOfB.ptrToA=a()

Running the Python shows a destructor call immediately after the object is constructed, so it appears c++ is only generating a weakref to the instance of A. I don't necessarily have the luxury of converting the actual code to use smart pointers which seems to be what the docs mostly cover, so is there an easy way in the binding code to specify that Python should keep the constructed a() alive while b (or really, b.ptrToA) is alive? I thought keep_alive was the answer I was looking for, but that did not seem to fix the issue (though admittedly, I may have gotten the usage wrong).

I mention ownership in the title, but maybe that's not quite the right word. I want Python to recognize the reference exists until B is destroyed, not to require c++ to delete ptrToA when done with it.

@jagerman
Copy link
Member

You're going to have to use def_property rather than def_readwrite (the latter is just a convenience wrapper around the former for simple attribute access) to get the keep_alive working on the setter:

    py::class_<a>(m, "a")
        .def(py::init<>());
    py::class_<b>(m, "b")
        .def(py::init<>())
        .def_property("ptrToA",
            [](b &self) { return self.ptrToA; },
            py::cpp_function([](b &self, a *aptr) { self.ptrToA = aptr; }, py::keep_alive<1, 2>())
        );

@neutralid
Copy link
Author

Thanks for the information. Your fix works for the trivial example I posted (though I'm a little curious, reading through the header I thought all the "extra" arguments were applied to the generated getter/setter functions when using def_readwrite).

For some reason, the fix doesn't work for the code I'm actually binding and I'm trying to figure out what's different. My pointer is a base class pointer and the object is a derived class, maybe that's the problem? I will see if I can build up a simple example of the issue I'm facing later today.

@jagerman
Copy link
Member

For more complicated cases, you're really hurting yourself by trying to avoid shared_ptr: what you're trying to do (sharing ownership) is exactly what shared_ptr is designed for. You might be able to hack together the correct dependency chain to get it to work, but I think it'll be fragile.

If shared_ptr isn't an option for the code being wrapped, you could keep the shared_ptr in the pybind binding code by using an always-initialized alias wrapper that holds a shared_ptr and copies the held pointer into the base class like this:

class PyB : public B {
public:
    using B::B;
    std::shared_ptr<A> a;
};

py::class_<A, std::shared_ptr<A>>(m, "A")
    .def(py::init<>());
py::class_<B, PyB>(m, "B")
    .def(py::init_alias<>()) // default constructor
    //.def(py::init_alias<int, std::string>()) // some other constructor
    .def_property("a", [](PyB &self) { return self.a; },
            [](PyB &self, std::shared_ptr<A> new_a) {
                self.a = std::move(new_a);
                self.aptr = self.a.get();
            })
    ;

@jagerman
Copy link
Member

jagerman commented Oct 19, 2017

A third option is similar to the above, but instead of storing a std::shared_ptr<A> a; in the alias class, you'd store a py::object a; and change the setter to:

    [](PyB &self, py::object new_a) {
        self.aptr = new_a.cast<A *>();
        self.a = std::move(new_a);
    }

This basically does the same thing as the shared_ptr, but does it through Python's shared pointer mechanism (i.e. reference counting) rather than C++'s, but has the disadvantage of garbage collection possibly happening at some later point rather than during the destruction of B.

Note also that both of these approaches require B to be polymorphic (i.e. to have at least one virtual method, typically the destructor).

@neutralid
Copy link
Author

Excellent information, thank you.
Our issue with shared_ptr is two fold. #1 is just usual politics and issues managing a large shared project where changes are a concern (and several people feel shared_ptrs have caused more trouble than they're worth in the past)
The second is the fact much of our code is used in c++ in addition to the Python interface we're creating, and shared_ptr is implying ownership by the target class which it shouldn't, in a pure c++ world I would have implemented these raw pointers as std::weak_ptr, since we're expecting them to be owned elsewhere and just used here.

So, admittedly it's an ill-formed problem. We want Python to assume ownership but c++ to avoid it. My big concern is passing this code off to the next developer, where they try to create an a and a b as I did in my example, and end up with a hard to debug segfault for what seems to be a straightforward operation. In a c++ world it's somewhat easy to grasp the lifetime issue of creating a class, passing it to another class, and having to be sure to keep it alive until that class no longer needs it, but because Python obscures that, it's a burden to put that on future developers.

@neutralid
Copy link
Author

I'm still playing with code but for future struggling devs stumbling on this page, I will warn you that the keep_alive<>() solution does appear to have the pitfall of not being able to sever the link if the pointer is re-assigned.

Using a slightly modified version of the code above that added a c subclass of a:

import test
x=test.b()
x.ptrToA=test.c()
construct!
x.ptrToA=test.a()
construct!
x.ptrToA=None
del x
destruct!
destruct!

@jagerman
Copy link
Member

When testing destruction triggered from python code, you'd do well to add import gc and do a gc.collect() after each potential point of object destruction to force the garbage collector to run.

@EricCousineau-TRI
Copy link
Collaborator

... in a pure c++ world I would have implemented these raw pointers as std::weak_ptr ... We want Python to assume ownership but c++ to avoid it.

Just to add in on Jason's statements, and given these statements in your above post, it seems like simply using weak_ptr<> or shared_ptr<> in your interface is the most optimal solution; the alternative is to use raw pointers, and hand-craft your lifetime relationships in Python as you would in C++ (something we've considered for some of our stuff in Drake, but nothing we'd wish on our downstream users - to have their interpreter segfault because they forgot to keep a reference alive).

The caveat, as you hinted at before, is that you know you have to shoehorn your object lifetimes into shared_ptr<>, which breaks compatibility with non-shared_ptr<>-owned objects (which really sucks, but alas, such is the drawback of STL); however, that may be a price worth paying to for the future struggling devs. (Another option that someone had mentioned is to consider having two entry points of ownership for your C++ APIs: unique [or shared] ownership for Python-like memory management, and non-owned references for when you want to hand-craft lifetime.)

Wanting C++ to not strictly own the object (in the unique_ptr<> sense), but to keep the Python object alive (whether it's a relevant patient object, or actually the Python portion of a Python class subclassing a pybind11 class) is a matter of shared ownership, which as Jason mentioned is exactly what shared_ptr<> is meant to do.

AFAICT, there isn't going to be another solution that doesn't involve some sort of shared-ownership mechanism, whether you store a shared_ptr<> or a py::object reference; another caveat to py::object being that you can now have an opaque reference cycle that will defer garbage collection to the death of the interpreter (and may screw up destruction order), since Python doesn't know the connection between your instance and the py::object it contains.

(If you do think you have something robust, though, do let us know!)

@neutralid
Copy link
Author

I think the main takeaway of my post is that I wish there were an "ownership transfer" I could identify, where C++ used a weak_ptr interface and Python used a shared_ptr interface (that could ideally exist without modifying the original source). SIP bindings, which I used prior to pybind11, allowed notation to indicate whether you wanted Python or C++ to be responsible for an object, but I assume that was just a benefit to using a language that had its own parser versus trying to use built in language semantics.

For the time being, we hacked in a somewhat ugly workaround where we created a "PyPtrWrapper" class, which was returned via the property getter in place of binding the actual pointer. That wrapper class had a static vector of py::object, and we just copied the provided object on set into that array. On set, we also cast the old value back to a py::object and removed an instance of that object from the vector if found, so repeated assignments don't cause memory leaks (though it is not smart enough to clean up if the object is destroyed, fortunately that's not a normal use case for us).

@bstaletic
Copy link
Collaborator

This issue seems resolved.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants