-
Notifications
You must be signed in to change notification settings - Fork 2.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix #4459 (cpp atexit callbacks) without segfault (#4500) #4505
base: master
Are you sure you want to change the base?
Fix #4459 (cpp atexit callbacks) without segfault (#4500) #4505
Conversation
Hmm, looks like some compilers are being a bit too clever and delaying the initialization of the static variable or otherwise reordering it when optimizations are enabled. This does unfortunately complicate the fix a bit. Thoughts @rwgk? |
The failing test is I spent 10ish minutes looking around but still feel pretty clueless. Wild idea, potentially silly: What if we clear only before we initialize a new interpreter, e.g. near the top of |
Same issue, we can't call get_local_internals without calling get_internals which requires a GIL lock and thread local storage. The GIL can only be locked if the Python interpreter is initialized. |
@rwgk I don't think the error is necessarily specific to this test, I think it's just the first test which access local_internals and therefore initializes it. |
// avoid undefined behaviors when initializing another interpreter | ||
local_internals.registered_types_cpp.clear(); | ||
local_internals.registered_exception_translators.clear(); | ||
|
||
if (internals_ptr_ptr) { | ||
delete *internals_ptr_ptr; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@rwgk It just occurred to me, that the reason deleting the internals_ptr_ptr here works is that it effectively leaks the containers in internals since it just deallocates them without calling the dtors. This shouldn't be working here because if the dtors did run, then the capsule dtor for builtins object would be called which requires the CPython API which would segfault. What this is actually doing is just leaking the objects. If we also changed the behavior to improperly swap out pointers to the container, that would work, but it's not a longterm fix.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
effectively leaks the containers in internals since it just deallocates them without calling the dtors
Yes, I was aware of that. Just to confirm.
Some background that may or may not matter here, sharing JIC:
Until Python 3.9 or 3.10 (not sure anymore) Python itself was leaking (https://github.com/rwgk/stuff/blob/f6549ecadb46169feb5f3c2456dcd13852382f7a/noddy/embedded_noddy_main_run.c), but with one of those versions that was cleaned up.
Before the cleanup in Python I was thinking: well, there is no point in being clean in pybind11. Therefore I wasn't concerned about any internals leaks.
That thinking has changed now. But who has the free energy to cleanup pybind11? And is it even doable without an ABI break? I don't know.
To come back to the problem here: I wouldn't worry about leaks, still. We "just" need to satisfy the situations of #3744 and #4500 simultaneously, with or without leaks.
Then worry about the leaks later.
Concretely: could we simply replicate the ptr_ptr tricked used by (non-local) internals? I.e. leak the local internals, too?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we will need an ABI break at some point.
Concretely: could we simply replicate the ptr_ptr tricked used by (non-local) internals? I.e. leak the local internals, too?
Probably? wouldn't that be an ABI break in itself though?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably? wouldn't that be an ABI break in itself though?
I don't think so.
@lalaland for a 2nd opinion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can change local internals without an ABI break?
I think the bigger issue is that it might be an API break, since adding leaking behavior is explicitly changing the behavior of existing code. And in memory tight situations, that can be quite bad.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the bigger issue is that it might be an API break, since adding leaking behavior is explicitly changing the behavior of existing code. And in memory tight situations, that can be quite bad.
Absolutely, it's a rock-and-a-hard place situation.
Ideally someone rolls up their sleeves and wrestles it down properly with whatever it takes.
I cannot do that: embedding has no critical use at Google. The only reason I'm looking is general pybind11 health.
So this here boils down to: what do we want to do given very limited resources?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would vote for leaking now to fix the segfaults as a quick patch, but actually try to think of a way to eliminate the leak as a more long term project.
There has got to be some way of fixing this in the long term.
@Kalakaarboyz1 Sadly, this doesn't work on all compilers currently, so we need to fix that. |
Slightly random thoughts & at a random time (from the back of my mind):
My ideas for the community of users having a vested interest in multiple interpreter initialize/finalize cycles in a given process: 0a. Just don't do it. Initialize the interpreter only once in a given process, and only finalize it once, when the process terminates. 0b. If you really must have multiple interpreter initialize/finalize cycles, write a full rationale for why it's worth putting up with the trouble. Add to the pybind11 documentation.
|
I could be in favor of 0a, atexit seems like a more common use case than multiple interperters. Honestly, it's our slopping handle of get_internals_state, and the fact we can't control when to try to regenerate it. Ideally, we should have seperated the Python Internals State from the C++ one in some way. |
Description
Fix #4459 without introducing the segfault discovered in #4500. This is a nasty hotfix, but it should support both use cases without crashing.
This segfault is due to some nasty behavior with the side effects of get_local_internals which currently relies on UB behavior. Full writeup can be found here: #4500 (comment)
Suggested changelog entry: