Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use weakref to clean up captured function object in def_buffer #2634

Merged
merged 1 commit into from
Nov 2, 2020

Conversation

YannickJadoul
Copy link
Collaborator

@YannickJadoul YannickJadoul commented Nov 1, 2020

Description

Leak discovered independently by @bstaletic (see #2578 (comment)) and @rwgk. This is one solution that cleans up after itself.

A few things to be considered:

  • I would think this is safe, and the order of destruction with the class_ object shouldn't matter? (cfr. py::keep_alive: nurse destructor can run before patient's #856) But I wouldn't mind a few people sanity-checking this, please :-)

  • Another solution, suggested by @rwgk, would be to delete the capture as part of destructing detail::type_info. In order to do so, we would need some extra type info (since the class_<T, ...>::get_buffer<F>::capture * gets type-erased to void *); one solution would be to use a shared_ptr<void> (which remembers its deleter), or similar. Is this a better solution?

  • If we do consider the solution above as better: this does require increasing the ABI version counter. Do we want to do this now, or go with the previous solution until we have a few ABI-breaking changes together and do them all at once? (Meanwhile, we could create a list of things we'd like to do that would break the ABI.)

Suggested changelog entry:

Seal a leak in `def_buffer`, cleaning up the `capture` object after the `class_` object goes out of scope.

Copy link
Collaborator

@rwgk rwgk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I ran this through the Google-internal global testing and didn't find any issues.
As discussed on Slack, this PR fixes ASAN errors.
My vote is to merge this PR as-is as soon as possible, and include it in 2.6.1.

Copy link
Collaborator

@henryiii henryiii left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with @rwgk

@YannickJadoul
Copy link
Collaborator Author

Well, there's no reason to not include it in 2.6.1, so since this was a fix, that was already implicit, to me.

I would want to get those two other questions discussed/answered, though, before merging this.

@YannickJadoul YannickJadoul added this to the v2.6.1 milestone Nov 2, 2020
@henryiii
Copy link
Collaborator

henryiii commented Nov 2, 2020

We will not bump the ABI counter until at least 2.7, and maybe a future version (especially if 2.7 focuses on #2445). We should probably combine several changes then - I have seen at least one PR with "let's add this when we bump the ABI counter" (which should probably be a label).

@YannickJadoul
Copy link
Collaborator Author

We will not bump the ABI counter until at least 2.7

But would we want to, in this case? I.e., should we add a label to this PR/add this to a list somewhere?

@bstaletic
Copy link
Collaborator

I think this would be a case where C++ smart pointers would get in the way, because you're passing a void* to python. The weakref solution just seems more elegant.

@rwgk
Copy link
Collaborator

rwgk commented Nov 2, 2020

I think this would be a case where C++ smart pointers would get in the way, because you're passing a void* to python. The weakref solution just seems more elegant.

Just for completeness:

https://www.nextptr.com/tutorial/ta1227747841/the-stdshared_ptrvoid-as-arbitrary-userdata-pointer

seems like a perfect fit for this situation, except it means we'd need to bump ABI version counter.

My thinking: Yannick's solution is "fine forever". It solves the problem without the need to mess with the ABI version counter. IIUC the solution relies on more moving parts than the shared_ptr<void> alternative, but in the unlikely event that we have something like python/cpython#22670, we can just push for pre-release fixes in new Python versions. I wouldn't revisit this leak unless we really run out of other important things to do.

@YannickJadoul
Copy link
Collaborator Author

I think this would be a case where C++ smart pointers would get in the way

We wouldn't per se need shared_ptr. We could also store a void (*deleter)(void *); in detail::type_info and use it when cleaning up.

But yes, I agree it's probably OK to keep it like this.

but in the unlikely event that we have something like python/cpython#22670, we can just push for pre-release fixes in new Python versions.

For the record: this is not exactly the ordering I was referencing (but the thing that when multiple things get garbage-collected at the same time, the order of destruction is undefined; I guess we should be fine, though). But yes, agreed.

Comment on lines +1328 to +1331
weakref(m_ptr, cpp_function([ptr](handle wr) {
delete ptr;
wr.dec_ref();
})).release();
Copy link
Collaborator Author

@YannickJadoul YannickJadoul Nov 2, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm thinking we could still refactor this to something like

template <typename F>
void cleanup_through_weakref(handle parent, F &&f) {
    weakref(parent, cpp_function([f=std::forward<F>(f)](handle wr) {
        f();
        wr.dec_ref();
    })).release();
};

But we'd also need some workaround for C++11's inability to capture-by-move. So let's maybe leave this for a next PR, if this pattern keeps coming back.

@YannickJadoul
Copy link
Collaborator Author

Right, two approvals, a non-complaining @bstaletic, and a satisfied valgrind/ASAN/MSAN. Should be good enough to merge :-)

@YannickJadoul YannickJadoul merged commit 7d6713a into pybind:master Nov 2, 2020
@YannickJadoul
Copy link
Collaborator Author

Thanks, all!

@YannickJadoul YannickJadoul deleted the def_buffer-leak branch November 2, 2020 17:39
@henryiii henryiii added needs changelog Possibly needs a changelog entry and removed needs changelog Possibly needs a changelog entry labels Nov 5, 2020
@rwgk rwgk mentioned this pull request Dec 28, 2020
10 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants