-
Notifications
You must be signed in to change notification settings - Fork 2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix(@apollo/gateway) Call unref
on the federated gateway's setInterval
.
#3105
Merged
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
…rval`. When the Apollo Gateway is configured in so-called "managed mode", it uses a `setInterval` timer to periodically check for updates from the Apollo Graph Manager (also known as Apollo Engine). However, since the Timer's `unref` method wasn't invoked, the timer would continue to place tasks on the event loop, even when the server had been requested to be shutdown. By calling `unref`, Node.js switches the timer to an internal timer which doesn't block the event-loop. While we could alternatively use a `process.on` event to detect when a shutdown signal had been received and invoke the Gateway's `stop` method accordingly, this should be sufficient for this particular implementation. (As opposed to the implementation in `apollo-engine-reporting` which does exactly this since it needs to ensure that all queued traces are transmitted to the cloud prior to process exit.
abernix
added a commit
to apollographql/federation
that referenced
this pull request
Sep 4, 2020
…llographql/apollo-server#3105) * fix(@apollo/gateway) Call `unref` on the federated gateway's `setInterval`. When the Apollo Gateway is configured in so-called "managed mode", it uses a `setInterval` timer to periodically check for updates from the Apollo Graph Manager (also known as Apollo Engine). However, since the Timer's `unref` method wasn't invoked, the timer would continue to place tasks on the event loop, even when the server had been requested to be shutdown. By calling `unref`, Node.js switches the timer to an internal timer which doesn't block the event-loop. While we could alternatively use a `process.on` event to detect when a shutdown signal had been received and invoke the Gateway's `stop` method accordingly, this should be sufficient for this particular implementation. (As opposed to the implementation in `apollo-engine-reporting` which does exactly this since it needs to ensure that all queued traces are transmitted to the cloud prior to process exit. * Add CHANGELOG.md for apollographql/apollo-server#3105. Apollo-Orig-Commit-AS: apollographql/apollo-server@73019e9
glasser
added a commit
to apollographql/federation
that referenced
this pull request
Feb 9, 2021
The (not particularly documented) ApolloGateway.stop() method didn't reliably stop the gateway from polling. All it did was cancel a timeout. But if it is called while pollServices is in the middle of running, it would do nothing, and then pollServices would carry on and set another timeout. Better semantics would be for stop() to reliably stop polling: allow the current poll to complete, ensure that there will be no more polls, and then (async) return. This PR implements those semantics, by implementing an explicit state machine for ApolloGateway's polling. One reason that these bugs were able to survive is that calling stop() is often unnecessary. In apollographql/apollo-server#3105 we chose to `unref` the polling timeout to allow the Node process to exit if it's the only thing left on the event loop, instead of encouraging users of `ApolloGateway` to be responsible and call `stop()` themselves. While this may be reasonable when the gateway's lifecycle is merely a schema polling timer, we may have future changes to the gateway where proper lifecycle handling is more important. So this PR also moves away from the world where it's fine to not bother to explicitly stop the gateway. That said, in the common case, we don't want users to have to write gateway stopping code. It makes more sense for stopping the `ApolloGateway` to be the responsibility of `ApolloServer.stop()`, just as `ApolloServer.stop()` can trigger events in plugins. So in the recently-released Apollo Server v2.20, `ApolloServer.stop()` calls `ApolloGateway.stop()`. This should mean that in most cases, the missing `unref` shouldn't keep the process running, as long as you've run `ApolloServer.stop()` (and if you don't, it's likely that other parts of the server are keeping your process running). What if you're still running an old version of Apollo Server? For a bit of backwards compatibility, `ApolloGateway` detects if it appears to be connected to Apollo Server older than v2.18. If so, it continues to do the old unref(). If you're using v2.18 or v2.19, the upgrade to v2.20 should be pretty painless (it's mostly just minor bugfixes). If you're using ApolloGateway on its own for some reason, and this change causes your processes to hang on shutdown, adding a `stop()` call should be pretty straightforward. (If we learn that this change really is devastating, we can always go back to an unconditional timer.unref() later.) Fixes #4428.
glasser
added a commit
to apollographql/federation
that referenced
this pull request
Feb 9, 2021
The (not particularly documented) ApolloGateway.stop() method didn't reliably stop the gateway from polling. All it did was cancel a timeout. But if it is called while pollServices is in the middle of running, it would do nothing, and then pollServices would carry on and set another timeout. Better semantics would be for stop() to reliably stop polling: allow the current poll to complete, ensure that there will be no more polls, and then (async) return. This PR implements those semantics, by implementing an explicit state machine for ApolloGateway's polling. One reason that these bugs were able to survive is that calling stop() is often unnecessary. In apollographql/apollo-server#3105 we chose to `unref` the polling timeout to allow the Node process to exit if it's the only thing left on the event loop, instead of encouraging users of `ApolloGateway` to be responsible and call `stop()` themselves. While this may be reasonable when the gateway's lifecycle is merely a schema polling timer, we may have future changes to the gateway where proper lifecycle handling is more important. So this PR also moves away from the world where it's fine to not bother to explicitly stop the gateway. That said, in the common case, we don't want users to have to write gateway stopping code. It makes more sense for stopping the `ApolloGateway` to be the responsibility of `ApolloServer.stop()`, just as `ApolloServer.stop()` can trigger events in plugins. So in the recently-released Apollo Server v2.20, `ApolloServer.stop()` calls `ApolloGateway.stop()`. This should mean that in most cases, the missing `unref` shouldn't keep the process running, as long as you've run `ApolloServer.stop()` (and if you don't, it's likely that other parts of the server are keeping your process running). What if you're still running an old version of Apollo Server? For a bit of backwards compatibility, `ApolloGateway` detects if it appears to be connected to Apollo Server older than v2.18. If so, it continues to do the old unref(). If you're using v2.18 or v2.19, the upgrade to v2.20 should be pretty painless (it's mostly just minor bugfixes). If you're using ApolloGateway on its own for some reason, and this change causes your processes to hang on shutdown, adding a `stop()` call should be pretty straightforward. (If we learn that this change really is devastating, we can always go back to an unconditional timer.unref() later.) Fixes apollographql/apollo-server#4428.
Sign up for free
to subscribe to this conversation on GitHub.
Already have an account?
Sign in.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
When the Apollo Gateway is configured in so-called "managed mode", it uses a
setInterval
timer to periodically check for updates from the Apollo GraphManager (also known as Apollo Engine).
However, since the Timer's
unref
method wasn't invoked, the timer wouldcontinue to place tasks on the event loop, even when the server had been
requested to be shutdown. By calling
unref
, Node.js switches the timer toan internal timer which doesn't block the event-loop.
While we could alternatively use a
process.on
event to detect when ashutdown signal had been received and invoke the Gateway's
stop
methodaccordingly, this should be sufficient for this particular implementation.
(As opposed to the implementation in
apollo-engine-reporting
which doesexactly this since it needs to ensure that all queued traces are transmitted
to the cloud prior to process exit.