Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GH-98275: Clarify that asyncio.Task.cancel() does not throw CancelledError into not-yet-started coroutine. #98321

Closed
wants to merge 2 commits into from

Conversation

fancidev
Copy link
Contributor

@fancidev fancidev commented Oct 16, 2022

@bedevere-bot bedevere-bot added awaiting review docs Documentation in the Doc dir skip news labels Oct 16, 2022
@kumaraditya303 kumaraditya303 added needs backport to 3.10 only security fixes needs backport to 3.11 only security fixes labels Oct 16, 2022
Copy link
Member

@gvanrossum gvanrossum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if this isn't creating a false distinction. Every coroutine (and every generator) is initially created in a state where it has executed no code of its body, and by definition if an exception is thrown into it, there is no try/except statement active to catch that exception, so the exception will terminate the call immediately. Surrounding the entire body with try/except does not help because the coroutine is stopped before the first try block is even entered.

This is not unique to cancellation. Perhaps it would be better to update the docs for create_task to explain more carefully when a task is started? Because arguably the issue with cancellation is due to confusion about that -- create_task does not start the task, it just queues it for being started when the scheduler next runs.

@fancidev
Copy link
Contributor Author

It seems the confusion lies in that the term “coroutine” may refer to the coroutine object or its (code) body, depending on the context:

  • If “thrown into the coroutine” refers to the coroutine body, then this PR clarifies this doesn’t always happen.

  • If “thrown into the coroutine” refers to the coroutine object, then the original documentation is technically correct, though the word “into” is confusing and the example following it seems misleading.

Therefore I think this PR does clarify things and does not make a false distinction.

I will make a few more changes to:

  • specify what happens if cancel() is called on a done task: no-op

  • specify what happens if cancel() is called from within the last step of the coroutine: the task is still considered cancelled and the original result discarded (need to double check)

As for create_task, the current documentation says it schedules the task, which is accurate. I may add a line to clarify when will it be started, but in a separate PR.

Copy link
Member

@gvanrossum gvanrossum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW I didn't mean to approve yet. Waiting for your update.

@bedevere-bot
Copy link

A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated.

Once you have made the requested changes, please leave a comment on this pull request containing the phrase I have made the requested changes; please review again. I will then notify any core developers who have left a review that you're ready for them to take another look at this pull request.

@fancidev
Copy link
Contributor Author

I have made the requested changes; please review again.

@bedevere-bot
Copy link

Thanks for making the requested changes!

@gvanrossum: please review the changes made to this pull request.

Comment on lines +1141 to +1147
- If :meth:`cancel` is called from the wrapped coroutine
which then raises an exception, the Task is *done*
with that exception and the cancellation has no effect.

- If :meth:`cancel` is called from the wrapped coroutine
which then returns a result, that result is discarded and
the Task is *cancelled*.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting. I didn't know there are two special cases for when a coroutine self-cancels. Where in the code do you see this? If you have just experienced it in a simple test program, could it be possible that there's a simpler rule to explain your observations?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code that discards a result if self cancelled is here:

https://github.com/python/cpython/blob/main/Lib/asyncio/tasks.py#L274

and there is indeed a test case for this behavior:

https://github.com/python/cpython/blob/main/Lib/test/test_asyncio/test_tasks.py#L924

The code that discards the cancellation request and stores the exception is here:

https://github.com/python/cpython/blob/main/Lib/asyncio/tasks.py#L288

but I didn’t find a test case for this specific case.

As for whether the “special” cases could be factorized into more primitive rules, we might say that an exception raised by the coroutine is always “honored”, which makes the second rule redundant. Though I find it clearer to state the case explicitly to remind the reader (and implementer).

The first “special” case might be rephrased into “cancel always lead to CancelledError unless the coroutine swallowed the CancelledError or raised a different exception”. But I find that more confusing than stating the “special” case plainly.

I think the many branches of behavior of cancel is due to cancellation request not guaranteed to occur at suspension point. Coroutine entry and coroutine exit are not suspension points, but a Task may be canceled there (even if the coroutine ran to completion in the latter case). This cancellation behavior seems implementation specific, and is e.g. different from thread cancellation (never do that) or token-based cancellation (fully “cooperative”) semantics.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code that discards a result if self cancelled is here:

https://github.com/python/cpython/blob/main/Lib/asyncio/tasks.py#L274

and there is indeed a test case for this behavior:

https://github.com/python/cpython/blob/main/Lib/test/test_asyncio/test_tasks.py#L924

Oh, that's interesting. I thought that this situation could also happen another way, but it seems that if _must_cancel is already set when __step is entered, it is cleared (and we ensure that exc is a CancelledError instance).

I still find the phrase "from the wrapped coroutine" confusing though, because the actual cancel() call could be anywhere as long as the code containing it is called from the coroutine.

And I'm also still not sure we need to call out this edge case. Just like the "cancel before task is started" edge case, this can be interpreted as throwing an exception into code that (by definition) cannot handle it, so it bubbles right out, just as if there just wasn't a try/except around an await call that receives the cancellation.

The code that discards the cancellation request and stores the exception is here:

https://github.com/python/cpython/blob/main/Lib/asyncio/tasks.py#L288

but I didn’t find a test case for this specific case.

Again, the cancellation is received only after the coroutine terminates so it has no chance to catch it.

As for whether the “special” cases could be factorized into more primitive rules, we might say that an exception raised by the coroutine is always “honored”, which makes the second rule redundant. Though I find it clearer to state the case explicitly to remind the reader (and implementer).

The first “special” case might be rephrased into “cancel always lead to CancelledError unless the coroutine swallowed the CancelledError or raised a different exception”. But I find that more confusing than stating the “special” case plainly.

I think the many branches of behavior of cancel is due to cancellation request not guaranteed to occur at suspension point. Coroutine entry and coroutine exit are not suspension points, but a Task may be canceled there (even if the coroutine ran to completion in the latter case). This cancellation behavior seems implementation specific, and is e.g. different from thread cancellation (never do that) or token-based cancellation (fully “cooperative”) semantics.

Ah, maybe this is the basis of our disagreement: I see coroutine entry as a definite suspension point (and that's how it's implemented). And coroutine exit might as well be a suspension point. All this talk about thread cancellation or "implementation specific" doesn't help -- every asyncio implementation must follow these rules, these are specified semantics for asyncio cancellation.

I see a coroutine as a series of chunks of code. The chunks are separated by await. There's an implicit await before the first chunk. After the last chunk there still is a "suspension" -- the event loop resumes execution at that point (possibly running other callbacks becore eventually blockin g for I/O again).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By “suspension point”, I mean if coroutine f await g() where g is a coroutine, the code flow doesn’t yield when g is entered nor when g is exited. Therefore entry and exit are not suspension points of a coroutine.

The fact that create_task(g()) makes g “yield” to the event loop before it starts and after it exits is a property of the Task, not a property of the coroutine itself. For example, it is possible to write a Task implementation where the entry (exit) of g is not a suspension point if called from (returning into) a coroutine f.

Since the documentation is more user-oriented (rather than library author oriented), I believe it is helpful to call out the case of cancellation before task starts.

As for self-cancellation, I think a (probably better) alternative is to replace the two bullet points with the following:

“To cancel a running Task from within the coroutine it wraps, raise CancelledError directly.”

So that user doesn’t need to know or care about the handling of self cancellation, which is somewhat technical and obscure (I find).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By “suspension point”, I mean if coroutine f await g() where g is a coroutine, the code flow doesn’t yield when g is entered nor when g is exited. Therefore entry and exit are not suspension points of a coroutine.

But if you write it slightly differently,

x = g()
await x

then the code of g() doesn't start running until the await.

Frankly, I am tired of arguing about this, and I am just going to close your PR. Sorry.

@gvanrossum gvanrossum changed the title gh-98275: Clarify that asyncio.Task.cancel() does not throw CancelledError into not-yet-started coroutine. GH-98275: Clarify that asyncio.Task.cancel() does not throw CancelledError into not-yet-started coroutine. Oct 22, 2022
@gvanrossum gvanrossum closed this Oct 30, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
awaiting change review docs Documentation in the Doc dir needs backport to 3.10 only security fixes needs backport to 3.11 only security fixes skip news
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants