Router does not shut down until all connections are closed #3124
Comments
I took an initial look. Graceful connection shutdown is happening and the listeners are stopped correctly, so the problem must be elsewhere.
Revising my statement: there is indeed an issue elsewhere (the connection sender being cloned and held in the wrong place), for which I have a fix nearly ready. But when opening the sandbox, two connections are created, and only one of them stops after getting the graceful shutdown notification.
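To illustrate why a cloned sender held in the wrong place can keep a shutdown from completing, here is a minimal sketch using only the Rust standard library. It is an analogy, not the router's code: `closes_within` and the `stray` clone are hypothetical names, and the channel stands in for whatever handle the connection task waits on.

```rust
use std::sync::mpsc::{channel, RecvTimeoutError, Sender};
use std::time::Duration;

/// Hypothetical stand-in for a connection task: the receiver only observes
/// "closed" once every sender for the channel has been dropped.
/// Returns true if the channel is seen as closed within `wait`.
fn closes_within(stray_clone_held: bool, wait: Duration) -> bool {
    let (tx, rx) = channel::<()>();

    // Simulate a component that cloned the sender and kept it alive.
    let stray: Option<Sender<()>> = if stray_clone_held {
        Some(tx.clone())
    } else {
        None
    };

    // The owner drops its sender as part of shutdown.
    drop(tx);

    // With no senders left this returns Disconnected immediately;
    // with a stray clone alive it can only time out.
    let closed = matches!(rx.recv_timeout(wait), Err(RecvTimeoutError::Disconnected));

    drop(stray);
    closed
}
```

Calling `closes_within(false, ...)` reports a closed channel right away, while `closes_within(true, ...)` waits out the full duration, which is the shape of the hang described above.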
more context here:
This happens with or without compression, and with either a content-length or a chunked response. From the client's point of view, the entire response has been received. During graceful shutdown on HTTP/1, hyper either waits for the response to be sent before closing the connection or, if the connection is idle (waiting for a new request), closes it immediately. Somehow the introspection connection here must not be considered idle. Before we get further into the investigation, I think we could add a configurable timeout on connection shutdown: short in dev mode, a bit longer in production.
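The proposed timeout on connection shutdown could look roughly like the sketch below. It uses only the standard library and a simulated connection counter. `Drain`, `drain_with_timeout` and `simulate` are hypothetical names, and this is not how hyper or the router implement shutdown.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;
use std::thread;
use std::time::{Duration, Instant};

/// Outcome of waiting for open connections to drain.
#[derive(Debug, PartialEq)]
enum Drain {
    Clean,
    TimedOut { still_open: usize },
}

/// Waits until `open` reaches zero or `timeout` elapses, whichever comes first.
fn drain_with_timeout(open: &AtomicUsize, timeout: Duration) -> Drain {
    let deadline = Instant::now() + timeout;
    loop {
        let n = open.load(Ordering::SeqCst);
        if n == 0 {
            return Drain::Clean;
        }
        if Instant::now() >= deadline {
            // The caller can now force-close whatever is left.
            return Drain::TimedOut { still_open: n };
        }
        thread::sleep(Duration::from_millis(5));
    }
}

/// Simulation: one connection closes after 20 ms; if `stuck` is true,
/// a second connection never closes (like the one that ignores shutdown).
fn simulate(stuck: bool, timeout: Duration) -> Drain {
    let open = Arc::new(AtomicUsize::new(if stuck { 2 } else { 1 }));
    let worker = {
        let open = Arc::clone(&open);
        thread::spawn(move || {
            thread::sleep(Duration::from_millis(20));
            open.fetch_sub(1, Ordering::SeqCst);
        })
    };
    let outcome = drain_with_timeout(&open, timeout);
    worker.join().unwrap();
    outcome
}
```

The timeout value would be the configurable part: a short duration in dev mode and a longer one in production, as suggested above.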
I had a little look at this and couldn't reproduce the hang. I tried with Am I missing something or may this have been fixed?
Looks like this got fixed at some point in the last couple of releases. Let's close for now.
Reopening this issue since we have reproduced it again.
Flagging that a customer I'm working with is running into this currently.

OS: macOS 13.4/Docker

Config:

    supergraph:
      listen: 0.0.0.0:4000
      introspection: true
    include_subgraph_errors:
      all: true
    cors:
      allow_any_origin: true
      origins:
        - https://studio.apollographql.com
    sandbox:
      enabled: true
    homepage:
      enabled: false

Which is very basic. They were getting the following logs when they had the playground window open:
Refreshing the Chrome window with the playground let the router close.
It may or may not be related, but I have another customer reporting that starting a Router with a subgraph and then repeatedly restarting the subgraph also causes Router resource usage to go up. I have not yet been able to create a reproducible example. This is on the other side of the connection, but I wonder if it is related.
Fix #3124 Fix #3941

Co-authored-by: Bryn Cooke
Co-authored-by: Gary Pennington
To reproduce:
Why we need this:
We need to:
`traffic_shaping->timeout` is able to terminate in-flight requests if they have not completed, allowing the router to shut down. New requests should be blocked once shutdown has been initiated.
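The two requirements above (a timeout that bounds in-flight requests, and rejection of new requests once shutdown starts) can be sketched as follows. This uses only the standard library. `Gate`, `Outcome` and `handle` are hypothetical names, and the sketch is not the router's `traffic_shaping` implementation.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

#[derive(Debug, PartialEq)]
enum Outcome {
    Completed,
    TimedOut,
    Rejected,
}

/// Hypothetical request gate: refuses new work after shutdown begins and
/// stops waiting on work that exceeds its timeout.
struct Gate {
    shutting_down: AtomicBool,
}

impl Gate {
    fn new() -> Self {
        Gate { shutting_down: AtomicBool::new(false) }
    }

    fn begin_shutdown(&self) {
        self.shutting_down.store(true, Ordering::SeqCst);
    }

    /// Simulates a request that takes `work_time`, bounded by `timeout`.
    fn handle(&self, work_time: Duration, timeout: Duration) -> Outcome {
        if self.shutting_down.load(Ordering::SeqCst) {
            return Outcome::Rejected;
        }
        let (tx, rx) = mpsc::channel();
        // The worker thread is detached; a real server would cancel the
        // request rather than let it keep running in the background.
        thread::spawn(move || {
            thread::sleep(work_time);
            let _ = tx.send(());
        });
        match rx.recv_timeout(timeout) {
            Ok(()) => Outcome::Completed,
            Err(_) => Outcome::TimedOut,
        }
    }
}
```

With a gate like this, shutdown can never wait longer than the timeout for a request admitted before `begin_shutdown`, and nothing new is admitted afterwards.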