-
-
Notifications
You must be signed in to change notification settings - Fork 30.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
race condition in ThreadChildWatcher (default) and MultiLoopChildWatcher #86276
Comments
## Test program ## import asyncio
import time
import os
import signal
import sys
# This bug happens with the default, ThreadedChildWatcher
# It also happens with MultiLoopChildWatcher,
# but not the other three watcher types.
#asyncio.set_child_watcher(asyncio.MultiLoopChildWatcher())
# Patch os.kill to call sleep(1) first,
# opening up the window for a race condition
os_kill = os.kill
def kill(p, n):
time.sleep(1)
os_kill(p, n)
os.kill = kill
async def main():
p = await asyncio.create_subprocess_exec(sys.executable, '-c', 'import sys; sys.exit(0)')
p.send_signal(signal.SIGTERM)
# cleanup
await p.wait()
asyncio.run(main()) ## Test output ## Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/alan-sysop/src/cpython/Lib/asyncio/runners.py", line 44, in run
return loop.run_until_complete(main)
File "/home/alan-sysop/src/cpython/Lib/asyncio/base_events.py", line 642, in run_until_complete
return future.result()
File "<stdin>", line 3, in main
File "/home/alan-sysop/src/cpython/Lib/asyncio/subprocess.py", line 138, in send_signal
self._transport.send_signal(signal)
File "/home/alan-sysop/src/cpython/Lib/asyncio/base_subprocess.py", line 146, in send_signal
self._proc.send_signal(signal)
File "/home/alan-sysop/src/cpython/Lib/subprocess.py", line 2081, in send_signal
os.kill(self.pid, sig)
File "<stdin>", line 3, in kill
ProcessLookupError: [Errno 3] No such process ## Tested versions ##
## Race condition ## main thread vs ThreadedChildWatcher._do_waitpid() thread p=create_subprocess_exec(...)
waitpid(...) # wait for process exit
<Process p exits>
<waitpid() returns. Process p is reaped, and no longer exists>
p.send_signal(9) ## Result ## A signal is sent to p.pid, after p.pid has been reaped by waitpid(). It might raise an error because p.pid no longer exists. In the worst case the signal will be sent successfully - because an unrelated process has started with the same PID. ## How easy is it to reproduce? ## It turns out the window for this race condition has been kept short, due to mitigations in the subprocess module. IIUC, the mitigation protects against incorrect parallel use of a subprocess object by multiple threads. def send_signal(self, sig):
# bpo-38630: Polling reduces the risk of sending a signal to the
# wrong process if the process completed, the Popen.returncode
# attribute is still None, and the pid has been reassigned
# (recycled) to a new different process. This race condition can
# happens in two cases [...]
self.poll()
if self.returncode is not None:
# Skip signalling a process that we know has already died.
return
os.kill(self.pid, sig) ## Possible workarounds ##
It would be possible to avoid the ThreadedChildWatcher race by using native code and pthread_cancel(), so that the corresponding waitpid() call is canceled before sending a signal. Except the current implementation of pthread_cancel() is also unsound, because of race conditions.
|
There's one way to fix this in MultiLoopChildWatcher (but not ThreadedChildWatcher). Make sure the waitpid() runs on the same thread that created the subprocess. Prototype: sourcejedi@92f979b The other approach would be to copy subprocess.wait(timeout) - keep waking up regularly and polling to see if the process has exited yet. I'm not convinced ThreadedChildWatcher is fixable. You can't call waitpid() in one thread, and let someone call kill() in a different thread. You could try avoiding calling waitpid() until someone does |
Put the other way, if you wanted to fix this bug in ThreadedChildWatcher, and go as far as allowing cancelling Process.wait(), followed by kill() / send_signal(), then I think you need -
|
I cannot reproduce the same bug. (Tested environment is 3.10.6) |
I don't know, the description is hard to follow, and child watchers are a minefield (e.g. #94597). |
I am unable to reproduce the reported race. FWIW |
But ThreadedChildWatcher is still used on macOS and other non-Linux UNIX variants. I think the OP is claiming that the race condition is in subprocess.py (not the asyncio version) and due to waiting for a process in a thread that didn't create the process. Maybe we can reproduce this outside asyncio? |
This was fixed by https://github.com/python/cpython/pull/20010/files, in particular the try/except ProcessLookupError around the os.kill() call added there. |
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
Show more details
GitHub fields:
bugs.python.org fields:
The text was updated successfully, but these errors were encountered: