-
-
Notifications
You must be signed in to change notification settings - Fork 30.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
bpo-39877: Fix PyEval_RestoreThread() for daemon threads #18811
Conversation
Without this change, after Py_Finalize() has been called, daemon threads are not exited by exit_thread_if_finalizing(): exit_thread_if_finalizing() is not called. Instead, daemon threads loop on attempting to take the GIL :-( With this PR, daemon threads exit before attempting to take the GIL. Moreover, the PR better validates tstate which can help to detect misusage of modified functions of the C API like PyEval_RestoreThread(). |
@ericsnowcurrently @nanjekyejoannah @pablogsal @pitrou @methane @serhiy-storchaka: Would you mind to review this PR? In short, this PR fix the code to exit daemon threads after Py_Finalize() has been called. Currently, it's broken in multiple different ways. But it worked somehow before my latest refactoring (see https://bugs.python.org/issue39877 for the cause of the regression). Main changes:
I chose to include "cleanup" changes in the PR: harden checks on tstate. I prefer to change all code at once. It should help to detect future bugs. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good on the principle, some comments below.
Misc/NEWS.d/next/Core and Builtins/2020-03-06-18-30-00.bpo-39877.bzd1y0.rst
Outdated
Show resolved
Hide resolved
Misc/NEWS.d/next/Core and Builtins/2020-03-06-18-30-00.bpo-39877.bzd1y0.rst
Outdated
Show resolved
Hide resolved
{ | ||
assert(!_PyMem_IsPtrFreed(tstate)); | ||
assert(!_PyMem_IsPtrFreed(tstate->interp)); | ||
return 1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why does this function always return 1?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's to be able to use it with assert(...). I like to use assert(), since it's removed in release mode. I'm using this pattern with _PyObject_CheckConsistency() and similar _PyXXX_CheckConsistency() functions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If an assertion fails, the function displays the expression which is false with the expected line number. That's why the function does use assert() directly rather than returning a value.
Python/ceval.c
Outdated
tstate->interp->runtime to support calls from Python daemon threads. | ||
After Py_Finalize() has been called, tstate can be a dangling pointer: | ||
point to PyThreadState freed memory. */ | ||
_PyRuntimeState *runtime = &_PyRuntime; | ||
if (runtime->finalizing != NULL && !_Py_CURRENTLY_FINALIZING(runtime, tstate)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since you're now accessing this without the GIL, should this field be manipulated only with atomic ops?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think that it's worth it. If you consider that it should be done: can it be done in a separated PR after this one is merged? Does atomic operations provide any additional warranty compared to regular variables?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you're not using an atomic variable, you may be getting a partially written value (it's very unlikely for a pointer-sized value, but theoretically possible). In other words, the code here is probably incorrect (ThreadSanitizer may help detect this).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Misc/NEWS.d/next/Core and Builtins/2020-03-06-18-30-00.bpo-39877.bzd1y0.rst
Outdated
Show resolved
Hide resolved
When you're done making the requested changes, leave the comment: |
I have made the requested changes; please review again. |
Thanks for making the requested changes! @pitrou: please review the changes made to this pull request. |
* exit_thread_if_finalizing() does now access directly _PyRuntime variable, rather than using tstate->interp->runtime since tstate can be a dangling pointer after Py_Finalize() has been called. * exit_thread_if_finalizing() is now called *before* calling take_gil(). There is no need to protect the call with the GIL. * Add ensure_tstate_not_null() function to check that tstate is not NULL at runtime. Check tstate earlier. take_gil() is now responsible to check that tstate is not NULL. Cleanup: * PyEval_RestoreThread() no longer saves/restores errno: it's already done inside take_gil(). * PyEval_AcquireLock(), PyEval_AcquireThread(), PyEval_RestoreThread() and _PyEval_EvalFrameDefault() now check if tstate is valid with the new is_tstate_valid() function which uses _PyMem_IsPtrFreed().
@pitrou: _PyRuntime.finalizing is now atomic. Would you mind to review the updated PR? |
Thanks for the review @python/buildbot-owners Daemon threads should be a little bit more reliable during Python finalization with this change. |
Oops, I wanted to write: Thanks for the review @pitrou! (Firefox picked the wrong completion, sorry.) |
bpo-39877: Fix PyEval_RestoreThread() for daemon threads (pythonGH-18811)
variable, rather than using tstate->interp->runtime since tstate
can be a dangling pointer after Py_Finalize() has been called.
take_gil(). There is no need to protect the call with the GIL.
at runtime. Check tstate earlier. take_gil() is now responsible to
check that tstate is not NULL.
Cleanup:
done inside take_gil().
PyEval_RestoreThread() and _PyEval_EvalFrameDefault() now check if
tstate is valid with the new is_tstate_valid() function which uses
_PyMem_IsPtrFreed().
https://bugs.python.org/issue39877