Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

graphql_transport_ws: Fix task shutdown/cancellation issues evident from testsuite warnings #2717

Open
wants to merge 8 commits into
base: main
Choose a base branch
from

Conversation

kristjanvalur
Copy link
Contributor

@kristjanvalur kristjanvalur commented Apr 19, 2023

Examining the warnings after running the unittests for test_graphql_transport_ws.py
showed some problems in how we were handling task cleanup and cancellation.

Description

  • AsyncGenerator doesn't need to be explicitly closed, but for expedient shutdown it should be done from witin
    within the task, using Finally semantics.
  • The current Task instance should be retrieved at the start, not during shutdown when the event loop may be gone.
  • CancelledError should raise out of the task.
  • The cleanup of current operation id can race with an external cleanup_operation so must not fail if the id isn't on record.
  • Added a unittest to verify that the subscription AsyncGenerators are properly finalized when an operation is cancelled.

As this is not a feature enhancement, a Release file is probably not warranted. Also, this PR belongs in the group of
"subscription enhancements" that I am working on.

Types of Changes

  • Core
  • Bugfix
  • New feature
  • Enhancement/optimization
  • Documentation

Issues Fixed or Closed by This PR

Checklist

  • My code follows the code style of this project.
  • My change requires a change to the documentation.
  • I have updated the documentation accordingly.
  • I have read the CONTRIBUTING document.
  • I have added tests to cover my changes.
  • I have tested the changes and verified that they work and don't break anything (as well as I can manage).

@codecov
Copy link

codecov bot commented Apr 19, 2023

Codecov Report

Merging #2717 (eed671e) into main (19f8b0d) will decrease coverage by 0.03%.
The diff coverage is 100.00%.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2717      +/-   ##
==========================================
- Coverage   96.49%   96.47%   -0.03%     
==========================================
  Files         197      197              
  Lines        8070     8080      +10     
  Branches     1457     1463       +6     
==========================================
+ Hits         7787     7795       +8     
- Misses        180      181       +1     
- Partials      103      104       +1     

@kristjanvalur kristjanvalur marked this pull request as ready for review April 19, 2023 14:34
@botberry
Copy link
Member

botberry commented Apr 19, 2023

Thanks for adding the RELEASE.md file!

Here's a preview of the changelog:


Minor adjustments for graphql-transport-ws and tests to reduce warnings when running unit tests.


Here's the preview release card for twitter:

Here's the tweet text:

🆕 Release (next) is out! Thanks to @kristjanvalur for the PR 👏

Get it here 👉 https://github.com/strawberry-graphql/strawberry/releases/tag/(next)

@kristjanvalur kristjanvalur force-pushed the kristjan/unittest-warnings branch 2 times, most recently from edf69d3 to a3264a0 Compare April 20, 2023 20:19
@kristjanvalur kristjanvalur changed the title Fix task shutdown/cancellation issues evident from testsuite warnings graphql_transport_ws: Fix task shutdown/cancellation issues evident from testsuite warnings Apr 20, 2023
Copy link
Member

@patrick91 patrick91 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've left some comments, I'm not super familiar with the WS work, though, so I'd wait for @DoctorJohn for a final review.

Oh and since we are changing running code, we should make a release 😊

Comment on lines 36 to 47
# loop cleanup code, similar to code from asyncio.run()
# disabled, can be enabled for debugging
return

loop = asyncio.get_running_loop()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could we put this under a if DEBUG_ENABLED flag? just to make it clearer that it can be used for that?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure. The thinking here is that if we are having strange errors/warnings in the unittests, enabling this code to forcefully cleanup tasks after each test fixture should help with isolating the problem. I can also expand on the comment.


# wait until context is dead.
# We don't know exactly how many packets will arrive or how long it will take.
# Need manual timeout because async timeout doesn't work on async integrations
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need manual timeout because async timeout doesn't work on async integrations

how come?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because the actual IO calls are actually blocking, and the event loop running the test will stop until the call returns, so no "async" timeouts can be delivered. And not all of the integrations support a "timeout" parameter to their receive methods.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That said, maybe I can make this test better... I'll have another go at this.

@kristjanvalur
Copy link
Contributor Author

Let's not review this any more until I have had time to clean it up a bit.

@patrick91
Copy link
Member

@kristjanvalur is this ready for a review?

@kristjanvalur
Copy link
Contributor Author

I think, since this is a test suite pr, that I'll roll in a fix to the timeout failure for the cancellation test which sometimes shows up. Allow until tomorrow please for me, I have a change to the unit test in mind which doesn't require timeout checks, they are doomed to occasionally fail in a virtualize environment.

@kristjanvalur
Copy link
Contributor Author

Ok, I found that the problems with the finalizer termination tests are hard to solve right now. The problem is that the iterator/generator created by graphql-core, for iterating over subscriptions, has such weird semantics. The finalizer is actually called on a Task, and may get cancelled during execution. I cannot make good tests for this until we adopt something like PR #2724

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants