Fix problems with timeouts in graphql_transport_ws #2703
Codecov Report
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2703      +/-   ##
==========================================
+ Coverage   96.48%   96.49%   +0.01%
==========================================
  Files         194      197       +3
  Lines        7988     8070      +82
  Branches     1449     1457       +8
==========================================
+ Hits         7707     7787      +80
+ Misses        181      180       -1
- Partials      100      103       +3
20c1429
to
a7b1c88
Compare
Because the timeout is started as soon as the 'handle' method is called, it can fire before an 'accept' message is sent. This is valid WebSocket behaviour: the connection is simply rejected. However, it races with the app, which subsequently tries to send a websocket accept message. This is likely what causes the race condition here, leaving the client unable to shut down in some weird deadlock. This PR makes sure that the timeout is only started after the base websocket connection is accepted, and then maintains a synchronized, race-free state with respect to the sub-protocol handshake.
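To illustrate the ordering described above, here is a minimal sketch, not the actual Strawberry handler. `FakeWebSocket`, `Handler`, `watch_timeout` and `on_connection_init` are hypothetical names made up for this example; only the `connection_timed_out` flag and the idea of starting the timeout after accept come from this PR. Close code 4408 is the one the graphql-transport-ws protocol uses for a connection initialisation timeout.

```python
import asyncio


class FakeWebSocket:
    """Hypothetical stand-in for a framework websocket; it only records calls."""

    def __init__(self):
        self.events = []

    async def accept(self):
        self.events.append("accept")

    async def close(self, code):
        self.events.append(f"close:{code}")


class Handler:
    """Hypothetical handler showing the ordering only, not Strawberry's class."""

    def __init__(self, ws, timeout):
        self.ws = ws
        self.timeout = timeout
        self.connection_init_received = False
        self.connection_timed_out = False

    async def handle(self):
        # Accept the base websocket first, so the timeout can never
        # fire before the accept message has been sent.
        await self.ws.accept()
        timeout_task = asyncio.create_task(self.watch_timeout())
        # A real handler would run its message loop here; the sketch
        # simply waits for the timeout task to finish.
        await timeout_task

    async def watch_timeout(self):
        await asyncio.sleep(self.timeout)
        if self.connection_init_received:
            return
        # Set the flag before closing so a late connection_init is refused.
        self.connection_timed_out = True
        await self.ws.close(4408)

    async def on_connection_init(self):
        # Both sides consult shared state, so exactly one of them wins.
        if self.connection_timed_out:
            return False
        self.connection_init_received = True
        return True


ws = FakeWebSocket()
handler = Handler(ws, timeout=0.01)
asyncio.run(handler.handle())
```

Because everything runs on a single event loop and the flags are only changed between awaits, the timeout and the init message cannot both succeed.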
40b351d
to
265cf47
Compare
265cf47
to
c20bdc1
Compare
c20bdc1
to
3717320
Compare
strawberry/subscriptions/protocols/graphql_transport_ws/handlers.py
…rs.py Co-authored-by: Patrick Arminio <[email protected]>
Co-authored-by: Patrick Arminio <[email protected]>
One question: when are these errors happening?
Never, hopefully. But whenever you create a background task, it is prudent to install a top-level error handler to catch and log errors. If you don't, Python will emit a warning about a task with an exception not being "awaited", but that warning may end up anywhere. In the case of the timeout task, there really aren't many things which can go wrong. But for the subscription task, all kinds of errors can occur, and it is best to handle them in-task with a top-level error handler. The alternative is to have the main task "await" all background tasks and catch and log any errors which occur there. (asyncio.CancelledErrors don't need to be handled and are ignored if they are raised to the top, but all other errors will cause a warning somewhere.)
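A minimal sketch of such an in-task, top-level handler, using only the standard library. The names `run_guarded`, `failing_subscription` and the `errors` list are hypothetical and exist only for this illustration; they are not part of the PR.

```python
import asyncio
import logging

logger = logging.getLogger("example.background")


async def run_guarded(coro, errors):
    """Await `coro` and log any exception instead of letting it escape the task."""
    try:
        await coro
    except asyncio.CancelledError:
        # Cancellation is a normal shutdown path; let it propagate.
        raise
    except Exception as exc:
        logger.exception("background task failed")
        errors.append(exc)  # kept only so the sketch can be inspected


async def failing_subscription():
    # Stand-in for a subscription that raises part-way through.
    raise ValueError("boom")


async def main():
    errors = []
    task = asyncio.create_task(run_guarded(failing_subscription(), errors))
    await task  # completes normally because the error was handled in-task
    return errors


errors = asyncio.run(main())
```

With the handler inside the task, the error is logged where it occurs and nothing is left for the event loop to report later.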
ok, let's add a log then! let's maybe do
This fixes an issue mentioned in #2702
Description
After merging uniform websocket tests for Starlite extension, sporadic deadlocks were observed.
It turned out that the timeout trigger, part of the graphql_transport_ws protocol, could trigger too early, when the initial websockets handshake was still being done. This caused the whole websocket connection attempt to be rejected and this triggered a deadlock in Starlite, which still appears a bit un-robust.
This PR does a few things
connection_timed_out:bool) to ensure that there is never a race between a timeout and accepting a connection.
Types of Changes
Issues Fixed or Closed by This PR
Checklist