Double-nested gil_scoped_release in a pythread causes fatal interpreter error #1276
Note: the fatal error only happens if you are running a DEBUG build of the Python interpreter (VERY important). |
Having pybind11 keep its own internal thread state can lead to an inconsistent situation where the Python interpreter has a thread state but pybind does not, and then when gil_scoped_acquire is called, pybind creates a new thread state instead of using the one created by the Python interpreter. This change gets rid of pybind's internal thread state and always uses the one created by the Python interpreter.
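To make the inconsistency described above concrete, here is a minimal toy model. It does not use CPython or pybind11 at all: `ToyThreadState`, `ToyInterpreter`, and `ToyBindingLayer` are hypothetical names invented for illustration, and the model covers a single thread only. It assumes the binding layer's private slot starts out empty even though the interpreter has already created a state for the thread, which is the situation the description refers to.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Toy stand-in for a CPython thread state. NOT the real PyThreadState.
struct ToyThreadState {
    int id;
};

// Toy stand-in for the interpreter: owns all thread states and records the
// one it associates with the (single) thread modelled here.
struct ToyInterpreter {
    std::vector<std::unique_ptr<ToyThreadState>> states;
    ToyThreadState* current = nullptr;

    ToyThreadState* new_state() {
        const int id = static_cast<int>(states.size());
        states.push_back(std::unique_ptr<ToyThreadState>(new ToyThreadState{id}));
        return states.back().get();
    }
};

// Toy stand-in for a binding layer with a private per-thread slot
// (hypothetical; only loosely modelled on the behaviour described above).
struct ToyBindingLayer {
    ToyThreadState* private_slot = nullptr;

    // Looks only at its own slot, so it cannot see the interpreter's state.
    ToyThreadState* acquire_private(ToyInterpreter& interp) {
        if (private_slot == nullptr) {
            private_slot = interp.new_state();
        }
        return private_slot;
    }

    // Proposed alternative: always reuse whatever the interpreter recorded.
    ToyThreadState* acquire_shared(ToyInterpreter& interp) {
        if (interp.current == nullptr) {
            interp.current = interp.new_state();
        }
        return interp.current;
    }
};

// Counts the thread states that exist for one thread after the interpreter
// has created a state and the binding layer then acquires.
inline std::size_t states_after_acquire(bool use_private_slot) {
    ToyInterpreter interp;
    interp.current = interp.new_state();  // e.g. a thread started from Python
    ToyBindingLayer layer;
    ToyThreadState* used = use_private_slot ? layer.acquire_private(interp)
                                            : layer.acquire_shared(interp);
    assert(used != nullptr);
    return interp.states.size();
}
```

In this model, the private-slot path leaves two states for one thread, while the shared path leaves one. That is only an analogy for the problem statement, not a description of pybind11's internals.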
Here's an even more concrete and tangible example that doesn't require a debug build of Python:

#include <pybind11/pybind11.h>
#include <pybind11/embed.h>
#include <iostream>
#include <thread>

static void thread()
{
    pybind11::gil_scoped_acquire acquire;
    {
        // Call some Python code
        // ...
        // Python code calls a 3rd party library which does this
        PyGILState_STATE state = PyGILState_Ensure();
        std::cout << "Calling 3rd party library" << std::endl;
        PyGILState_Release(state);
    }
}

static const char thread_code[] =
    "import threading\n"
    "import testmod\n"
    "t = threading.Thread(target=testmod.func)\n"
    "t.start()\n"
    "t.join()\n";

PYBIND11_EMBEDDED_MODULE(testmod, m)
{
    m.def("func", thread, pybind11::call_guard<pybind11::gil_scoped_release>());
}

int main()
{
    pybind11::scoped_interpreter interp;
    pybind11::gil_scoped_release release;
    {
        pybind11::gil_scoped_acquire acquire;
        pybind11::exec(thread_code);
    }
    return 0;
}

This one deadlocks as soon as it reaches the Normally, the The solution to this problem is to get rid of pybind's internal thread states and use the |
Calling pybind11's implementation "bastardized" is quite inappropriate. It enables several useful applications that are impossible to realize with Python's restricted GIL state API. Naturally, the two are not supposed to be combined in this way. |
There is really nothing forcing you to use |
My apologies, I wasn't trying to be inappropriate. In English, we sometimes use "bastardized" as another word for "hacked" or "modified" - perhaps one of those would have been more appropriate. In any case, I do understand what this is trying to accomplish, but the behavior was not what I was expecting. My biggest concern is that the |
@wjakob continuing our conversation, let me summarize the main technical points and respond to some issues here, and perhaps @jagerman can chime in with his thoughts. There are currently two options for interfacing with the GIL
These methods are incompatible and cannot be nested because they each maintain their own state and do not synchronize with each other. Unfortunately, as noted in pybind11.h, the standard In my opinion, because this incompatibility exists and most programs will not use the advanced features of the current interface, I believe that the pybind11 API should be changed to be compatible by default and expose a simple way to use the current interface that allows for advanced thread control. This way it would be the responsibility of the developer explicitly using the advanced features to ensure that nested calls were safe (or at least provide some warning or a note in the docs). However, as @wjakob points out
This is a valid point: breaking compatibility does require a major version change. I hope to convince you that it's worthwhile to do so. The main points of my argument are:
I very much understand this. Ideally we can come up with a good solution that makes upgrading to a new API simple. Of course this means that the new version must have an API to behave like the current version. (Alternatively we can petition for Python 4 to integrate the improved GIL API or we can wait for @larryhastings to finish his gilectomy). Hopefully, I've laid out a reasonable argument as to why changing the current behavior of pybind11 is desirable. As for discussing the specifics of those changes (maybe even with ones that don't require a major version bump) I'll make a post in #1276 |
@Erotemic's comments on this matter are spot on. Here are some of my additional thoughts. We could simply write our own In short, there are two issues here:
@Erotemic forwarded me this from @wjakob:
Can you explain your thoughts here? What I'm thinking is that, at least to the end user, there isn't a great deal of difference between my proposal and adding a One concern that @jagerman had is that having a global runtime option could change the value for other libraries (if library A uses the proposed "basic" @Erotemic also came up with the idea of having both a runtime and compile-time option, where the compile-time option sets the default for the runtime option, but the runtime option can still be changed.

Another option: instead of simply having an either/or between the basic and advanced APIs, what if we created a template function to allow the user to select any class they want? Example:

pybind11::options options;
options.set_gil_scoped_acquire<basic_gil_scoped_acquire>();

Internally, this would use a virtual class with the selected class as a templated member. We could also do something similar as a compile time option. Example:

#define PYBIND11_GIL_SCOPED_ACQUIRE basic_gil_scoped_acquire

And then inside pybind:

#ifndef PYBIND11_GIL_SCOPED_ACQUIRE
#define PYBIND11_GIL_SCOPED_ACQUIRE ::pybind11::gil_scoped_acquire
#endif

One thing is for sure: as pybind becomes more widespread, and larger and more complicated projects start using it, they too are going to encounter this issue. It is certainly having a major impact on KWIVER, so I will take it upon myself to implement any proposal we can come up with. I hope that we can find a solution that meets everyone's needs. |
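The two selection mechanisms proposed above (a compile-time macro default and a runtime template setter backed by a virtual base) could fit together roughly as follows. This is a self-contained sketch that does not touch pybind11: the `toy` namespace, the `TOY_GIL_SCOPED_ACQUIRE` macro, `acquire_base`, `acquire_holder`, and `make_acquire()` are all hypothetical names, and the guard classes only report their name instead of acquiring anything.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <string>

namespace toy {

// Hypothetical stand-ins for the two guard flavours discussed above. Real
// guards would acquire the GIL in their constructor and release it in their
// destructor; these only report which one was selected.
struct basic_gil_scoped_acquire {
    static const char* name() { return "basic"; }
};
struct gil_scoped_acquire {
    static const char* name() { return "advanced"; }
};

}  // namespace toy

// Compile-time default: a project may define TOY_GIL_SCOPED_ACQUIRE first.
#ifndef TOY_GIL_SCOPED_ACQUIRE
#define TOY_GIL_SCOPED_ACQUIRE ::toy::gil_scoped_acquire
#endif

namespace toy {

// Virtual base so the runtime option can hold any guard type.
struct acquire_base {
    virtual ~acquire_base() = default;
    virtual const char* name() const = 0;
};

template <class Guard>
struct acquire_holder : acquire_base {
    Guard guard;  // the selected class as a templated member
    const char* name() const override { return Guard::name(); }
};

class options {
public:
    // The compile-time macro sets the default for the runtime option.
    options() { set_gil_scoped_acquire<TOY_GIL_SCOPED_ACQUIRE>(); }

    // Runtime override: remember how to build the chosen guard type.
    template <class Guard>
    void set_gil_scoped_acquire() {
        factory_ = []() -> std::unique_ptr<acquire_base> {
            return std::unique_ptr<acquire_base>(new acquire_holder<Guard>());
        };
    }

    std::unique_ptr<acquire_base> make_acquire() const { return factory_(); }

private:
    std::function<std::unique_ptr<acquire_base>()> factory_;
};

}  // namespace toy
```

Note that this sketch keeps the choice inside an `options` object, so it sidesteps rather than answers the concern about one library changing a global setting for another.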
Some food for thought:
Yeah, I was thinking ahead a bit with that concern: It doesn't happen because, right now, the A much more natural place for this sort of global runtime control is in But back to the core issue. The biggest issue, as I see it, is that we are forcing the use of As to what that fix looks like: I don't think that either the I don't personally have much experience with the GIL intricacies, so maybe there's a reason this wouldn't work, but we ought to be able to detect whether a Python "simple" GIL or a pybind "advanced" GIL is already acquired, and if so, we could have a default We could then rename |
This certainly seems doable. It would just be a matter of calling If we're going to have something like Actually... as I'm thinking about this some more, maybe
I like this idea too. Unfortunately, it won't help for projects that have been compiled with a version of pybind prior to the implementation of this proposal, but it should at least help going forward. If we make some sort of new builtin object, like |
I have opened #1322. It's not finished, but it should be enough to get a discussion going. |
Hi Kyle,

I took a look at PR #1322. For me, it is too complicated. At its core, the GIL is something straightforward (a reference-counted mutex), and I'd be hesitant to introduce such complex machinery to deal with it in pybind11. (I am interested in resolving this issue though, so let's try to figure out what can be done.)

It's been a long time since I wrote the GIL handling in pybind11, and I just took a moment to go through it again. This made me realize that CPython's GILState and pybind11's "advanced GIL" are really supposed to be compatible. In fact, the reason why the code is a bit complicated is because pybind11 has to jump through some hoops to get access to the CPython internal variables, which it then tries to update in exactly the same way. You can take a look e.g. at https://github.com/python/cpython/blob/master/Python/pystate.c#L1040. The "tss key" corresponds exactly to

So rather than cooking up a completely new subsystem, I think the way to go is to ensure that whatever pybind11 does is compatible with CPython. It sounds to me like you have potentially run into a corner case where the handling slightly differs.

Now that said, I don't really "get" the code you posted before. Why can't you write the following? This works perfectly on my machine:

#include <pybind11/pybind11.h>
#include <pybind11/embed.h>
#include <iostream>
#include <thread>

static void thread()
{
    pybind11::gil_scoped_acquire acquire;
    {
        // Pretend this block is nested inside Python
        pybind11::gil_scoped_release release;
        std::cout << "Calling C++ code" << std::endl;
    }
}

static const char thread_code[] =
    "import threading\n"
    "import testmod\n"
    "t = threading.Thread(target=testmod.func)\n"
    "t.start()\n"
    "t.join()\n";

PYBIND11_EMBEDDED_MODULE(testmod, m)
{
    m.def("func", thread, pybind11::call_guard<pybind11::gil_scoped_release>());
}

int main()
{
    pybind11::scoped_interpreter interp;
    pybind11::exec(thread_code);
    return 0;
}

Best, |
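The "reference-counted mutex" view of the GIL mentioned in the comment above can be illustrated with a small toy. This is not CPython's implementation and not pybind11's: `ToyGil`, `ToyGilGuard`, and `nested_depth_demo` are hypothetical names, the lock is a plain `std::mutex`, and the sketch assumes a single process-wide `ToyGil` instance because the nesting counter is kept per thread rather than per lock.

```cpp
#include <cassert>
#include <mutex>

// Toy model of a reference-counted lock. It is NOT CPython's GIL; it assumes
// one process-wide ToyGil instance.
class ToyGil {
public:
    // Re-entrant acquire: only the outermost call on a thread takes the mutex.
    void ensure() {
        if (depth() == 0) {
            mutex_.lock();
        }
        ++depth();
    }

    // Matching release: only the outermost call on a thread frees the mutex.
    void release() {
        assert(depth() > 0);
        if (--depth() == 0) {
            mutex_.unlock();
        }
    }

    int nesting_depth() const { return depth(); }

private:
    // Per-thread nesting counter.
    static int& depth() {
        thread_local int d = 0;
        return d;
    }

    std::mutex mutex_;
};

// RAII guard in the spirit of a scoped acquire.
class ToyGilGuard {
public:
    explicit ToyGilGuard(ToyGil& gil) : gil_(gil) { gil_.ensure(); }
    ~ToyGilGuard() { gil_.release(); }
    ToyGilGuard(const ToyGilGuard&) = delete;
    ToyGilGuard& operator=(const ToyGilGuard&) = delete;

private:
    ToyGil& gil_;
};

// Nested guards on one thread: with a plain std::mutex the second lock would
// block forever; with the counter it just bumps the depth.
inline int nested_depth_demo(ToyGil& gil) {
    ToyGilGuard outer(gil);
    ToyGilGuard inner(gil);
    return gil.nesting_depth();
}
```

Nesting is only safe here because both guards consult the same counter. If two layers each kept their own counter for the same mutex, the inner acquire would block, which is the kind of mismatch this thread is about.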
The crux of this entire issue is that "tss key". In Python 2.7, it's called The first example I gave was when I was in the earlier stages of tracking down this issue. It gets triggered because Python internally uses the But back to my original point: none of this would be an issue if we had access to If we can find a way to get the advanced API to use whatever thread state was created with |
Any more ideas on what we can do here? |
See pybind#1276 for the rationale.
…gic to support nested gil access, see pybind#1276 and pytorch/pytorch#83101
* Add option to force the use of the PYPY GIL scoped acquire/release logic to support nested gil access, see #1276 and pytorch/pytorch#83101
* Apply suggestions from code review
* Update CMakeLists.txt
* docs: update upgrade guide
* Update docs/upgrade.rst
* All bells & whistles.
* Add Reminder to common.h, so that we will not forget to purge `!WITH_THREAD` branches when dropping Python 3.6
* New sentence instead of semicolon.
* Temporarily pull in snapshot of PR #4246
* Add `test_release_acquire`
* Add more unit tests for nested gil locking
* Add test_report_builtins_internals_keys
* Very minor enhancement: sort list only after filtering.
* Revert change in docs/upgrade.rst
* Add test_multi_acquire_release_cross_module, while also forcing unique PYBIND11_INTERNALS_VERSION for cross_module_gil_utils.cpp
* Hopefully fix apparently new ICC error.
  ```
  2022-10-28T07:57:54.5187728Z -- The CXX compiler identification is Intel 2021.7.0.20220726
  ...
  2022-10-28T07:58:53.6758994Z icpc: remark #10441: The Intel(R) C++ Compiler Classic (ICC) is deprecated and will be removed from product release in the second half of 2023. The Intel(R) oneAPI DPC++/C++ Compiler (ICX) is the recommended compiler moving forward. Please transition to use this compiler. Use '-diag-disable=10441' to disable this message.
  2022-10-28T07:58:54.5801597Z In file included from /home/runner/work/pybind11/pybind11/include/pybind11/detail/../detail/type_caster_base.h(15),
  2022-10-28T07:58:54.5803794Z from /home/runner/work/pybind11/pybind11/include/pybind11/detail/../cast.h(15),
  2022-10-28T07:58:54.5805740Z from /home/runner/work/pybind11/pybind11/include/pybind11/detail/../attr.h(14),
  2022-10-28T07:58:54.5809556Z from /home/runner/work/pybind11/pybind11/include/pybind11/detail/class.h(12),
  2022-10-28T07:58:54.5812154Z from /home/runner/work/pybind11/pybind11/include/pybind11/pybind11.h(13),
  2022-10-28T07:58:54.5948523Z from /home/runner/work/pybind11/pybind11/tests/cross_module_gil_utils.cpp(13):
  2022-10-28T07:58:54.5949009Z /home/runner/work/pybind11/pybind11/include/pybind11/detail/../detail/internals.h(177): error #2282: unrecognized GCC pragma
  2022-10-28T07:58:54.5949374Z PYBIND11_TLS_KEY_INIT(tstate)
  2022-10-28T07:58:54.5949579Z ^
  2022-10-28T07:58:54.5949695Z
  ```
* clang-tidy fixes
* Workaround for PYPY WIN exitcode None
* Revert "Temporarily pull in snapshot of PR #4246" This reverts commit 23ac16e.
* Another workaround for PYPY WIN exitcode None
* Clean up how the tests are run "run in process" Part 1: uniformity
* Clean up how the tests are run "run in process" Part 2: use `@pytest.mark.parametrize` and clean up the naming.
* Skip some tests `#if defined(THREAD_SANITIZER)` (tested with TSAN using the Google-internal toolchain).
* Run all tests again but ignore ThreadSanitizer exitcode 66 (this is less likely to mask unrelated ThreadSanitizer issues in the future).
* bug fix: missing common.h include before using `PYBIND11_SIMPLE_GIL_MANAGEMENT` For the tests in the github CI this does not matter, because `PYBIND11_SIMPLE_GIL_MANAGEMENT` is always defined from the command line, but when monkey-patching common.h locally, it matters.
* if process.exitcode is None: assert t_delta > 9.9
* More sophisiticated `_run_in_process()` implementation, clearly reporting `DEADLOCK`, additionally exercised via added `intentional_deadlock()`
* Wrap m.intentional_deadlock in a Python function, for `ForkingPickler` compatibility.
  ```
  > ForkingPickler(file, protocol).dump(obj)
  E TypeError: cannot pickle 'PyCapsule' object
  ```
  Observed with all Windows builds including mingw but not PyPy, and macos-latest with Python 3.9, 3.10, 3.11 but not 3.6.
* Add link to potential solution for WOULD-BE-NICE-TO-HAVE feature.
* Add `SKIP_IF_DEADLOCK = True` option, to not pollute the CI results with expected `DEADLOCK` failures while we figure out what to do about them.
* Add COPY-PASTE-THIS: gdb ... command (to be used for debugging the detected deadlock)
* style: pre-commit fixes
* Do better than automatic pre-commit fixes.
* Add `PYBIND11_SIMPLE_GIL_MANAGEMENT` to `pytest_report_header()` (so that we can easily know when harvesting deadlock information from the CI logs).

Co-authored-by: Arnim Balzer <[email protected]>
Co-authored-by: Henry Schreiner <[email protected]>
Co-authored-by: Ralf W. Grosse-Kunstleve <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
I think the work on |
I couldn't find evidence that
That's interesting. Do you have a reliable reproducer? Every time I took a closer look before, such observations turned out to be chance. |
Unfortunately I don't have an isolated test case to provide for this. We've observed crashes in a large application with an embedded Python interpreter, which already manages GIL state using the same Python C API primitives as |
Assume the following scenario:
Here is the minimum working example:
When the inner nested gil_scoped_release is destructed, it causes the Python interpreter to throw a fatal error: because there are two different thread states for the same thread, one created by the pythread and one created by pybind.