Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Raise deferred KeyboardInterrupt out of run() after cancelling all tasks #1537

Draft
wants to merge 11 commits into
base: main
Choose a base branch
from
25 changes: 19 additions & 6 deletions docs/source/reference-lowlevel.rst
Original file line number Diff line number Diff line change
Expand Up @@ -226,11 +226,11 @@ Windows-specific API

.. function:: WaitForSingleObject(handle)
:async:

Async and cancellable variant of `WaitForSingleObject
<https://msdn.microsoft.com/en-us/library/windows/desktop/ms687032(v=vs.85).aspx>`__.
Windows only.

:arg handle:
A Win32 object handle, as a Python integer.
:raises OSError:
Expand Down Expand Up @@ -270,6 +270,8 @@ Trio tokens
.. autofunction:: current_trio_token


.. _ki-handling:

Safer KeyboardInterrupt handling
================================

Expand All @@ -281,10 +283,21 @@ correctness invariants. On the other, if the user accidentally writes
an infinite loop, we do want to be able to break out of that. Our
solution is to install a default signal handler which checks whether
it's safe to raise :exc:`KeyboardInterrupt` at the place where the
signal is received. If so, then we do; otherwise, we schedule a
:exc:`KeyboardInterrupt` to be delivered to the main task at the next
available opportunity (similar to how :exc:`~trio.Cancelled` is
delivered).
signal is received. If so, then we do. Otherwise, we cancel all tasks
and raise `KeyboardInterrupt` directly as the result of :func:`trio.run`.

.. note:: This behavior means it's not a good idea to try to catch
`KeyboardInterrupt` within a Trio task. Most Trio
programs are I/O-bound, so most interrupts will be received while
no task is running (because Trio is waiting for I/O). There's no
task that should obviously receive the interrupt in such cases, so
Trio doesn't raise it within a task at all: every task gets cancelled,
then `KeyboardInterrupt` is raised once that's complete.

If you want to handle Ctrl+C by doing something other than "cancel
all tasks", then you should use :func:`~trio.open_signal_receiver` to
install a handler for ``SIGINT``. If you do that, then Ctrl+C will
go to your handler, and it can do whatever it wants.

So that's great, but – how do we know whether we're in one of the
sensitive parts of the program or not?
Expand Down
42 changes: 42 additions & 0 deletions newsfragments/1537.breaking.rst
Original file line number Diff line number Diff line change
@@ -0,0 +1,42 @@
:ref:`Sometimes <ki-handling>`, a Trio program receives an interrupt
signal (Ctrl+C) at a time when Python's default response (raising
`KeyboardInterrupt` immediately) might corrupt Trio's internal
state. Previously, Trio would handle this situation by raising the
`KeyboardInterrupt` at the next :ref:`checkpoint <checkpoints>` executed
by the main task (the one running the function you passed to :func:`trio.run`).
This was responsible for a lot of internal complexity and sometimes led to
surprising behavior.

With this release, such a "deferred" `KeyboardInterrupt` is handled in a
different way: Trio will first cancel all running tasks, then raise
`KeyboardInterrupt` directly out of the call to :func:`trio.run`.
The difference is relevant if you have code that tries to catch
`KeyboardInterrupt` within Trio. This was never entirely robust, but it
previously might have worked in many cases, whereas now it will never
catch the interrupt.

An example of code that mostly worked on previous releases, but won't
work on this release::

async def main():
try:
await trio.sleep_forever()
except KeyboardInterrupt:
print("interrupted")
trio.run(main)

The fix is to catch `KeyboardInterrupt` outside Trio::

async def main():
await trio.sleep_forever()
try:
trio.run(main)
except KeyboardInterrupt:
print("interrupted")

If that doesn't work for you (because you want to respond to
`KeyboardInterrupt` by doing something other than cancelling all
tasks), then you can start a task that uses
`trio.open_signal_receiver` to receive the interrupt signal ``SIGINT``
directly and handle it however you wish. Such a task takes precedence
over Trio's default interrupt handling.
4 changes: 4 additions & 0 deletions newsfragments/1537.removal.rst
Original file line number Diff line number Diff line change
@@ -0,0 +1,4 @@
The abort function passed to :func:`~trio.lowlevel.wait_task_rescheduled`
now directly takes as argument the cancellation exception that should be
raised after a successful asynchronous cancellation. Previously, it took
a callable that would raise the exception when called.
5 changes: 5 additions & 0 deletions pyproject.toml
Original file line number Diff line number Diff line change
Expand Up @@ -18,6 +18,11 @@ issue_format = "`#{issue} <https://github.com/python-trio/trio/issues/{issue}>`_
# Unfortunately there's no way to simply override
# tool.towncrier.type.misc.showcontent

[[tool.towncrier.type]]
directory = "breaking"
name = "Breaking Changes"
showcontent = true

[[tool.towncrier.type]]
directory = "feature"
name = "Features"
Expand Down
16 changes: 16 additions & 0 deletions trio/_core/_exceptions.py
Original file line number Diff line number Diff line change
@@ -1,6 +1,9 @@
# coding: utf-8

import attr

from trio._util import NoPublicConstructor
from trio import _deprecate


class TrioInternalError(Exception):
Expand Down Expand Up @@ -66,6 +69,19 @@ class Cancelled(BaseException, metaclass=NoPublicConstructor):
def __str__(self):
return "Cancelled"

def __call__(self):
# If a Cancelled exception is passed to an old abort_fn that
# expects a raise_cancel callback, someone will eventually try
# to call the exception instead of raising it. Provide a
# deprecation warning and raise it instead.
_deprecate.warn_deprecated(
"wait_task_rescheduled's abort_fn taking a callback argument",
"0.16.0",
issue=1537,
instead="an exception argument",
)
raise self


class BusyResourceError(Exception):
"""Raised when a task attempts to use a resource that some other task is
Expand Down
4 changes: 2 additions & 2 deletions trio/_core/_io_kqueue.py
Original file line number Diff line number Diff line change
Expand Up @@ -99,8 +99,8 @@ async def wait_kevent(self, ident, filter, abort_func):
)
self._registered[key] = _core.current_task()

def abort(raise_cancel):
r = abort_func(raise_cancel)
def abort(exc):
r = abort_func(exc)
if r is _core.Abort.SUCCEEDED:
del self._registered[key]
return r
Expand Down
12 changes: 6 additions & 6 deletions trio/_core/_io_windows.py
Original file line number Diff line number Diff line change
Expand Up @@ -605,11 +605,11 @@ async def wait_overlapped(self, handle, lpOverlapped):
)
task = _core.current_task()
self._overlapped_waiters[lpOverlapped] = task
raise_cancel = None
cancel_exc = None

def abort(raise_cancel_):
nonlocal raise_cancel
raise_cancel = raise_cancel_
def abort(cancel_exc_):
nonlocal cancel_exc
cancel_exc = cancel_exc_
try:
_check(kernel32.CancelIoEx(handle, lpOverlapped))
except OSError as exc:
Expand Down Expand Up @@ -649,8 +649,8 @@ def abort(raise_cancel_):
# it will produce the right sorts of exceptions
code = ntdll.RtlNtStatusToDosError(lpOverlapped.Internal)
if code == ErrorCodes.ERROR_OPERATION_ABORTED:
if raise_cancel is not None:
raise_cancel()
if cancel_exc is not None:
raise cancel_exc
else:
# We didn't request this cancellation, so assume
# it happened due to the underlying handle being
Expand Down
62 changes: 13 additions & 49 deletions trio/_core/_run.py
Original file line number Diff line number Diff line change
@@ -1,3 +1,5 @@
# coding: utf-8

import functools
import itertools
import logging
Expand Down Expand Up @@ -828,8 +830,8 @@ async def _nested_child_finished(self, nested_child_exc):
# If we get cancelled (or have an exception injected, like
# KeyboardInterrupt), then save that, but still wait until our
# children finish.
def aborted(raise_cancel):
self._add_exc(capture(raise_cancel).error)
def aborted(exc):
self._add_exc(exc)
return Abort.FAILED

self._parent_waiting_in_aexit = True
Expand Down Expand Up @@ -1039,41 +1041,26 @@ def _activate_cancel_status(self, cancel_status):
if self._cancel_status.effectively_cancelled:
self._attempt_delivery_of_any_pending_cancel()

def _attempt_abort(self, raise_cancel):
def _attempt_abort(self, exc):
# Either the abort succeeds, in which case we will reschedule the
# task, or else it fails, in which case it will worry about
# rescheduling itself (hopefully eventually calling reraise to raise
# rescheduling itself (hopefully eventually raising
# the given exception, but not necessarily).
success = self._abort_func(raise_cancel)
success = self._abort_func(exc)
if type(success) is not Abort:
raise TrioInternalError("abort function must return Abort enum")
# We only attempt to abort once per blocking call, regardless of
# whether we succeeded or failed.
self._abort_func = None
if success is Abort.SUCCEEDED:
self._runner.reschedule(self, capture(raise_cancel))
self._runner.reschedule(self, Error(exc))

def _attempt_delivery_of_any_pending_cancel(self):
if self._abort_func is None:
return
if not self._cancel_status.effectively_cancelled:
return

def raise_cancel():
raise Cancelled._create()

self._attempt_abort(raise_cancel)

def _attempt_delivery_of_pending_ki(self):
assert self._runner.ki_pending
if self._abort_func is None:
return

def raise_cancel():
self._runner.ki_pending = False
raise KeyboardInterrupt

self._attempt_abort(raise_cancel)
self._attempt_abort(Cancelled._create())


################################################################
Expand Down Expand Up @@ -1411,33 +1398,16 @@ def current_trio_token(self):

ki_pending = attr.ib(default=False)

# deliver_ki is broke. Maybe move all the actual logic and state into
# RunToken, and we'll only have one instance per runner? But then we can't
# have a public constructor. Eh, but current_run_token() returning a
# unique object per run feels pretty nice. Maybe let's just go for it. And
# keep the class public so people can isinstance() it if they want.

# This gets called from signal context
def deliver_ki(self):
self.ki_pending = True
try:
self.entry_queue.run_sync_soon(self._deliver_ki_cb)
self.entry_queue.run_sync_soon(
self.system_nursery.cancel_scope.cancel
)
except RunFinishedError:
pass

def _deliver_ki_cb(self):
if not self.ki_pending:
return
# Can't happen because main_task and run_sync_soon_task are created at
# the same time -- so even if KI arrives before main_task is created,
# we won't get here until afterwards.
assert self.main_task is not None
if self.main_task_outcome is not None:
# We're already in the process of exiting -- leave ki_pending set
# and we'll check it again on our way out of run().
return
self.main_task._attempt_delivery_of_pending_ki()

################
# Quiescing
################
Expand Down Expand Up @@ -1845,10 +1815,6 @@ def run_impl(runner, async_fn, args):
elif type(msg) is WaitTaskRescheduled:
task._cancel_points += 1
task._abort_func = msg.abort_func
# KI is "outside" all cancel scopes, so check for it
# before checking for regular cancellation:
if runner.ki_pending and task is runner.main_task:
task._attempt_delivery_of_pending_ki()
task._attempt_delivery_of_any_pending_cancel()
elif type(msg) is PermanentlyDetachCoroutineObject:
# Pretend the task just exited with the given outcome
Expand Down Expand Up @@ -1963,9 +1929,7 @@ async def checkpoint_if_cancelled():

"""
task = current_task()
if task._cancel_status.effectively_cancelled or (
task is task._runner.main_task and task._runner.ki_pending
):
if task._cancel_status.effectively_cancelled:
await _core.checkpoint()
assert False # pragma: no cover
task._cancel_points += 1
Expand Down
55 changes: 30 additions & 25 deletions trio/_core/_traps.py
Original file line number Diff line number Diff line change
@@ -1,3 +1,5 @@
# coding: utf-8

# These are the only functions that ever yield back to the task runner.

import types
Expand Down Expand Up @@ -95,7 +97,7 @@ async def wait_task_rescheduled(abort_func):
timeout expiring). When this happens, the ``abort_func`` is called. Its
interface looks like::

def abort_func(raise_cancel):
def abort_func(exc):
...
return trio.lowlevel.Abort.SUCCEEDED # or FAILED

Expand All @@ -109,40 +111,43 @@ def abort_func(raise_cancel):
task can't be cancelled at this time, and still has to make sure that
"someone" eventually calls :func:`reschedule`.

At that point there are again two possibilities. You can simply ignore
the cancellation altogether: wait for the operation to complete and
then reschedule and continue as normal. (For example, this is what
:func:`trio.to_thread.run_sync` does if cancellation is disabled.)
The other possibility is that the ``abort_func`` does succeed in
cancelling the operation, but for some reason isn't able to report that
right away. (Example: on Windows, it's possible to request that an
async ("overlapped") I/O operation be cancelled, but this request is
*also* asynchronous – you don't find out until later whether the
operation was actually cancelled or not.) To report a delayed
cancellation, then you should reschedule the task yourself, and call
the ``raise_cancel`` callback passed to ``abort_func`` to raise a
:exc:`~trio.Cancelled` (or possibly :exc:`KeyboardInterrupt`) exception
into this task. Either of the approaches sketched below can work::
At that point there are again two possibilities. You can simply
ignore the cancellation altogether: wait for the operation to
complete and then reschedule and continue as normal. (For
example, this is what :func:`trio.to_thread.run_sync` does if
cancellation is disabled.) The other possibility is that the
``abort_func`` does succeed in cancelling the operation, but
for some reason isn't able to report that right away. (Example:
on Windows, it's possible to request that an async
("overlapped") I/O operation be cancelled, but this request is
*also* asynchronous – you don't find out until later whether
the operation was actually cancelled or not.) To report a
delayed cancellation, you should reschedule the task yourself,
and cause it to raise the exception ``exc`` that was passed to
``abort_func``. (Currently ``exc`` will always be a
`~trio.Cancelled` exception, but we may use this mechanism to
raise other exceptions in the future; you should raise whatever
you're given.) Either of the approaches sketched below can
work::

# Option 1:
# Catch the exception from raise_cancel and inject it into the task.
# Directly reschedule the task with the provided exception.
# (This is what Trio does automatically for you if you return
# Abort.SUCCEEDED.)
trio.lowlevel.reschedule(task, outcome.capture(raise_cancel))
trio.lowlevel.reschedule(task, outcome.Error(exc))

# Option 2:
# wait to be woken by "someone", and then decide whether to raise
# the error from inside the task.
outer_raise_cancel = None
def abort(inner_raise_cancel):
nonlocal outer_raise_cancel
outer_raise_cancel = inner_raise_cancel
outer_exc = None
def abort(inner_exc):
nonlocal outer_exc
outer_exc = inner_exc
TRY_TO_CANCEL_OPERATION()
return trio.lowlevel.Abort.FAILED
await wait_task_rescheduled(abort)
if OPERATION_WAS_SUCCESSFULLY_CANCELLED:
# raises the error
outer_raise_cancel()
raise outer_exc

In any case it's guaranteed that we only call the ``abort_func`` at most
once per call to :func:`wait_task_rescheduled`.
Expand Down Expand Up @@ -229,8 +234,8 @@ async def temporarily_detach_coroutine_object(abort_func):
detached task directly without going through
:func:`reattach_detached_coroutine_object`, which would be bad.)
Your ``abort_func`` should still arrange for whatever the coroutine
object is doing to be cancelled, and then reattach to Trio and call
the ``raise_cancel`` callback, if possible.
object is doing to be cancelled, and then reattach to Trio and raise
the exception it received, if possible.

Returns or raises whatever value or exception the new coroutine runner
uses to resume the coroutine.
Expand Down
Loading