Skip to content
This repository has been archived by the owner on Nov 23, 2017. It is now read-only.

The documentation on task cancellation is unclear #253

Open
realazthat opened this issue Jul 8, 2015 · 15 comments
Open

The documentation on task cancellation is unclear #253

realazthat opened this issue Jul 8, 2015 · 15 comments

Comments

@realazthat
Copy link

Here is an example:

@asyncio.coroutine
def something(sleep,marker):

    try:
        print ('something() is sleeping for %s  seconds!' % sleep,flush=True)
        yield from asyncio.sleep(sleep)
    finally:
        print ('cleaning something()!',flush=True)
        marker['cleaned'] = True

@asyncio.coroutine
def test_session2():
    marker = {'cleaned': False}
    yield from asyncio.wait_for(something(5, marker), timeout=10)
    #here something is cleaned up

    try:
        marker = {'cleaned': False}
        yield from asyncio.wait_for(something(10, marker), timeout=5)
    except asyncio.TimeoutError:
        print ('something() has timed out')
        pass
    #here something is not cancelled yet ... it cancels later.
    #intuitively, you'd think that wait_for would throw after something() is actually cancelled
    print ('is something cleaned up?', ('YES!' if marker['cleaned'] else 'NO!'))
    print ('sleeping a bit')
    yield from asyncio.sleep(1)
    print ('is something cleaned up?', ('YES!' if marker['cleaned'] else 'NO!'))

loop = asyncio.get_event_loop()

loop.run_until_complete(test_session2())

And the output:

$ python3 asyncio.wait_for.test.py
something() is sleeping for 5  seconds!
cleaning something()!
something() is sleeping for 10  seconds!
something() has timed out
is something cleaned up? NO!
sleeping a bit
cleaning something()!
is something cleaned up? YES!

So wait_for() throws TimeoutError, but the coroutine is not yet cancelled.

Basically, I think that the task should first be cancelled, and then wait_for() should return.

OR.

It should be more explicitly documented that the task will cancel, but not necessarily before wait_for() raises an error.

@gvanrossum
Copy link
Member

I don't think this is violating the specification of wait_for(). Also, your example is way too complicated for anyone to follow. If you really think there is a bug in asyncio please provide a simpler example, and show the exact flow through your code rather than trying to guess what happened based on assertions.

@realazthat
Copy link
Author

@gvanrossum I've updated the example to something simpler.

@realazthat
Copy link
Author

Also, as a workaround, I've implemented my own wait_for() wrapper around asyncio.wait_for(), which does the same thing except wait until the task is cancelled before raising TimeoutError.

def wait_for(coro,timeout,*,loop=None):
    if not isinstance(coro, asyncio.Future):
        coro = asyncio.Task(coro,loop=loop)


    completed_future = asyncio.Future(loop=loop)

    @asyncio.coroutine
    def wrapper():
        try:
            return (yield from coro)
        finally:
            completed_future.set_result(None)
    try:
        return (yield from asyncio.wait_for(wrapper(),timeout=timeout,loop=loop))
    except asyncio.TimeoutError:
        if not coro.cancelled():
            coro.cancel()
        yield from completed_future
        raise

@gvanrossum
Copy link
Member

OK, your workaround helped me understand what's going on. I don't think we can add this behavior to wait_for() -- it is totally legal (if uncommon) for a task to absorb a cancellation (thrown in) and take its time (waiting for something else) before actually returning, or even simply ignoring the cancellation altogether. In any case a task's response to cancellation (as opposed to a plain Future's) is asynchronous. I don't think wait_for() should be required to wait any further when the timeout is reached. So I'm closing this since it's a feature, not a bug.

@vstinner
Copy link
Member

vstinner commented Jul 9, 2015

If you believe that the documentation is uncomplete about cancellation, please propose a new text, and I will update the doc.

Currently, the doc says "If the wait is cancelled, the future fut is also cancelled."

Maybe we need to enhance the documentation to explain better how tasks handle cancellation. For example, add a section and add a link from wait_for() doc to this new section? Currently, the explanation of task cancellation lives at:
https://docs.python.org/dev/library/asyncio-task.html#asyncio.Task

@realazthat
Copy link
Author

Returns result of the Future or coroutine. When a timeout occurs, it cancels the task and raises asyncio.TimeoutError. To avoid the task cancellation, wrap it in shield().

If the wait is cancelled, the future fut is also cancelled.

Change that to:

Returns result of the Future or coroutine. When a timeout occurs, it cancels the task and raises asyncio.TimeoutError. To avoid the task cancellation, wrap it in shield().

Note that wait_for() does not wait until the task is cancelled.

If the wait is cancelled, the future fut is also cancelled.

@vstinner
Copy link
Member

vstinner commented Jul 9, 2015

Note that wait_for() does not wait until the task is cancelled.

This sentence is unclear, or wrong.

If a task is cancelled, wait_for() immediatly returns (raise CancelledError).

If wait_for() is used on a task and a task doesn't complete before the timeout, the task is cancelled. The problem is that the cancellation is only "scheduled": cancelled() returns False. It's counter intuitive.

A shorter example is: "task.cancel(); assert not task.cancelled()".

Ok, I now have a better understanding of the documentation issue.

@vstinner vstinner reopened this Jul 9, 2015
@vstinner vstinner changed the title wait_for() seems to not cancel the task, or cancels it in the wrong order. The documentation on task cancellation is unclear Jul 9, 2015
@realazthat
Copy link
Author

@Haypo

OK I'll rephrase:

Note that in the event of a timeout, wait_for() does not wait until the task is cancelled; it raises the asyncio.TimeoutError immediately, as this is the normal Task.cancel() behavior.

@achimnol
Copy link

achimnol commented Jan 10, 2017

I'd like to add another point for improved docs.
If a task is cancelled but the task has remaining await statements, it must be awaited again.
This may seem to be trivial and obvious for asyncio designers, but people may get lost because it's easy to think "cancel" as complete cancellation of a task instead of injecting asyncio.CancelledError into it. (Of course, cancellation is different from other types of exceptions as it marks the future status differently.)
I found this in aiohttp's documentation (see cleanup_background_tasks function in the example).

A minimal example to demonstrate its necessity:

import asyncio

async def something():
    try:
        print('do')
        await asyncio.sleep(1)
    except asyncio.CancelledError:
        print('cancelled')
    finally:
        print('finally')
        await asyncio.sleep(1)
        print('done')

async def cancel_it(task):
    await asyncio.sleep(0.5)
    task.cancel()
    await task  # if not awaited, above finally block would not finish.

if __name__ == '__main__':
    loop = asyncio.get_event_loop()
    t = loop.create_task(something())
    try:
        loop.run_until_complete(cancel_it(t))
    except KeyboardInterrupt:
        loop.stop()
    finally:
        loop.close()

I believe "await-after-cancel" must be explained in asyncio docs.

@gjcarneiro
Copy link

gjcarneiro commented Jan 10, 2017 via email

@achimnol
Copy link

Yes, in my example I'm using CancelledError as a signal for graceful shutdown.
In many realistic applications, I think this will be the normal case rather than just re-raising CancelledError.

@achimnol
Copy link

achimnol commented Jan 10, 2017

The point that confused me before some experience was:

  • Omitting await for cancelled tasks works fine as long as the tasks have no more awaits internally.
  • It is easy to think that it's not necessary since the code works; for example, most close() methods are not coroutines though recent discussions show that they should be coroutines.

I think the docs need to carefully guide the asyncio beginners not to fall into misunderstandings (like me...), by describing when to use cancellation, what it stands for whether a task absorbs CancelledError or re-raises, etc. The learning experience should be consistent no matter which asyncio-based library you start with.

@gjcarneiro
Copy link

Check the documentation

Request that this task cancel itself.
This arranges for a CancelledError to be thrown into the wrapped coroutine on the next cycle through the event loop. The coroutine then has a chance to clean up or even deny the request using try/except/finally.

You are catching CancelledError, thereby denying the task cancellation. That's why you need to wait for the task again, it's because it wasn't cancelled.

Not a bug.

@achimnol
Copy link

I'm not saying it's a bug but we should improve the docs.

The coroutine then has a chance to clean up or even deny the request using try/except/finally.

This part gives only a hint. I think it would be bettter to provide a concrete example to cancel tasks and await them to clean up because the readers without background knowledge may miss the await-again part.

@asmello
Copy link

asmello commented Aug 27, 2017

To make sure I got this: calling task.cancel() will asynchronously awake the task with a CancelledError. The cancellation point will be an await instance, since that is where the task yields back to the event loop.

async def coroutine():
    result = await work()  # task.cancel() is called while this task is stopped here
    process(result)  # never executes

async def main():
    loop = asyncio.get_event_loop()
    task = loop.create_task(coroutine)
    # ...
    task.cancel()

By default, this will directly abort the task as soon as it executes on a posterior cycle, as any exception generated within work() would.

async def coroutine():
    try:
        result = await work()
        process(result)
    except asyncio.CancelledError:
        pass
    # the task will end and be marked as done, but not as cancelled

However, by encapsulating the await with a try/except block, the exception behaves like a signal - you can do something with it or ignore it. The task may terminate by returning or raising a different exception, but it will only be marked as cancelled if the wrapped coroutine raises the CancelledError exception.

async def coroutine():
    try:
        result = await work()
    except asyncio.CancelledError:
        some_cleanup()
        raise  # safely terminates, but still gets marked as cancelled
    process(result)

I think some of the confusion stems from the fact that any await can be a cancellation point. If you want to gracefully exit, the try/except/finally block needs wrap the entire coroutine code, or at least all awaiting points. If you don't need any cleanup (i.e. the work is done, and you just need to inform the task), you are fine without any try/except; though if this is a normal termination, you may still want to catch the exception just to avoid having it marked as cancelled.

The aiohttp code is exceptional because the cleanup code itself calls other coroutines, so after cancelling you still need to await it one more time to give those an opportunity to execute. When this happens, it should be well documented, as ordinarily you'd expect the task to fully terminate by itself after the cancel() call.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants