Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PyThreadState_Swap() During Finalization Causes Immediate Exit (AKA Daemon Threads Are Still the Worst!) #109793

Closed
ericsnowcurrently opened this issue Sep 23, 2023 · 1 comment
Assignees
Labels
3.12 bugs and security fixes 3.13 bugs and security fixes interpreter-core (Objects, Python, Grammar, and Parser dirs) topic-subinterpreters type-bug An unexpected behavior, bug, or error

Comments

@ericsnowcurrently
Copy link
Member

ericsnowcurrently commented Sep 23, 2023

Bug report

tl;dr Switching between interpreters while finalizing causes the main thread to exit. The fix should be simple.

We use PyThreadState_Swap() to switch between interpreters. That function almost immediately calls _PyEval_AcquireLock(). During finalization, _PyEval_AcquireLock() immediately causes the thread to exit if the current thread state doesn't match the one that was active when Py_FinalizeEx() was called.

Thus, if we switch interpreters during finalization then the thread will exit. If we do this in the finalizing (main) thread then the process immediately exits with an exit code of 0.

One notable consequence is that a Python process with an unhandled exception will print the traceback like normal but can end up with an exit code of 0 instead of 1 (and some of the runtime finalization code never gets executed). 1

Reproducer

$ cat > script.py << EOF
import _xxsubinterpreters as _interpreters
interpid = _interpreters.create()
raise Exception
EOF
$ ./python script.py
Traceback (most recent call last):
  File ".../check-swapped-exitcode.py", line 3, in <module>
    raise Exception
Exception
$ echo $?
0

In this case, "interpid" is a PyInterpreterIDObject bound to the __main__ module (of the main interpreter). It is still bound there when the script ends and the executable starts finalizing the runtime by calling Py_FinalizeEx(). 2

Here's what happens in Py_FinalizeEx():

  1. wait for non-daemon threads to finish 3
  2. run any remaining pending calls belong to the main interpreter
  3. run at exit hooks
  4. mark the runtime as finalizing (storing the pointer to the current tstate, which belongs to the main interpreter)
  5. delete all other tstates belong to the main interpreter (i.e. all daemon threads)
  6. remove our custom signal handlers
  7. finalize the import state
  8. clean up sys.modules of the main interpreter (finalize_modules() in Python/pylifecycle.c)

At the point the following happens:

  1. the __main__ module is dealloc'ed
  2. "interpid" is dealloc'ed (PyInterpreterID_Type.tp_dealloc)
  3. _PyInterpreterState_IDDecref() is called, which finalizes the corresponding interpreter state
    4, before Py_EndInterpreter() is called, we call _PyThreadState_Swap() to switch to a tstate belonging to the subinterpreter
  4. that calls _PyEval_AcquireLock()
  5. that basically calls _PyThreadState_MustExit(), which sees that the current tstate pointer isn't the one we stored as "finalizing"
  6. it then calls PyThread_exit_thread(), which kills the main thread
  7. the process exits with an exitcode of 0

Notably, the rest of Py_FinalizeEx() (and Py_Main(), etc.) does not execute. main() never gets a chance to return an exitcode of 1.

Background

Runtime finalization happens in whichever thread called Py_FinalizeEx() and happens relative to whichever PyThreadState is active there. This is typically the main thread and the main interpreter.

Other threads may still be running when we start finalization, whether daemon threads or not, and each of those threads has a thread state corresponding to the interpreter that is active in that thread. 4 One of the first things we do during finalization is to wait for all non-daemon threads to finish running. Daemon threads are a different story. They must die!

Back in 2011 we identified that daemon threads were interfering with finalization, sometimes causing crashes or making the Python executable hang. 5 At the time, we applied a best-effort solution where we kill the current thread if it isn't the one where Py_FinalizeEx() was called.

However, that solution checked the tstate pointer rather than the thread ID, so swapping interpreters in the finalizing thread was broken, and here we are.

History:

Related: gh-87135 (PRs: gh-105805, gh-28525)

Linked PRs

Footnotes

  1. This may help explain why, when we re-run some tests in subprocesses, they aren't marked as failures even when they actually fail.

  2. Note that we did not create any extra threads; we stayed exclusively in the main thread. We also didn't even run any code in the subinterpreter.

  3. FYI, IIRC we used to abort right before this point if there were any subinterpreters around still.

  4. In any given OS thread, each interpreter has a distinct tstate. Each tstate (mostly) corresponds to exactly one OS thread.

  5. If a daemon thread keeps running and tries to access any objects or other runtime state then there's a decent chance of a crash.

@ericsnowcurrently ericsnowcurrently added type-bug An unexpected behavior, bug, or error interpreter-core (Objects, Python, Grammar, and Parser dirs) topic-subinterpreters 3.12 bugs and security fixes 3.13 bugs and security fixes labels Sep 23, 2023
@ericsnowcurrently ericsnowcurrently self-assigned this Sep 23, 2023
@ericsnowcurrently
Copy link
Member Author

CC @vstinner, @gpshead

ericsnowcurrently added a commit that referenced this issue Sep 27, 2023
Essentially, we should check the thread ID rather than the thread state pointer.
csm10495 pushed a commit to csm10495/cpython that referenced this issue Sep 28, 2023
…thongh-109794)

Essentially, we should check the thread ID rather than the thread state pointer.
ericsnowcurrently added a commit that referenced this issue Sep 29, 2023
@github-project-automation github-project-automation bot moved this from Todo to Done in Subinterpreters Oct 2, 2023
ericsnowcurrently added a commit to ericsnowcurrently/cpython that referenced this issue Oct 11, 2023
…thongh-109794)

Essentially, we should check the thread ID rather than the thread state pointer.
ericsnowcurrently added a commit that referenced this issue Nov 28, 2023
…h-109794) (gh-110705)

Essentially, we should check the thread ID rather than the thread state pointer.
Glyphack pushed a commit to Glyphack/cpython that referenced this issue Sep 2, 2024
…thongh-109794)

Essentially, we should check the thread ID rather than the thread state pointer.
Glyphack pushed a commit to Glyphack/cpython that referenced this issue Sep 2, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
3.12 bugs and security fixes 3.13 bugs and security fixes interpreter-core (Objects, Python, Grammar, and Parser dirs) topic-subinterpreters type-bug An unexpected behavior, bug, or error
Projects
Status: Done
Development

No branches or pull requests

1 participant