gateway: Small improvements to error messages and behavior during updates. #3811

abernix · 2020-02-20T19:37:22Z

These are just a couple of small improvements to the behavior of Apollo Gateway during error conditions which arise while attempting to update the schema.

Most notably, this removes a previously trapped error which should be propagated up the call stack.

I believe these small bits are the very lowest-hanging fruit and that these updates are orthogonal/complementary to the additional retry functionality and improved Gateway usability that @trevor-scheer is currently looking at.

This message was being issued when a new server starts up, prior to ever having a schema, when a storage secret (or any other artifact) can't be fetched from GCS (or on any error within `updateServiceDefinitions`), despite the fact that there may be no `previousSchema`. While I could use `previousSchema` to change the error message, I don't think that it's this method's concern to decide what happens in the event of an error. I think it's only this method's concern to actually do the update, if it is in fact successful.

Currently, we log the error and just return without throwing, which causes `load` (one of two places where `updateComposition` is called) to not actually fail; follow-up logic then suggests "Gateway successfully loaded schema", even though that cannot be true. The nice thing about this change (in addition to allowing someone to `catch` an error from `ApolloGateway.prototype.load`) is that the error should bubble all the way up to the place where we currently call `load` within Apollo Server's constructor, and stop the server from starting in this failure condition: https://github.com/apollographql/apollo-server/blob/7dcee80ff33061c0911458d593ebbca5a9c73939/packages/apollo-server-core/src/ApolloServer.ts#L366 The other place we call `updateComposition` is within a `setTimeout`. That particular case is less troublesome since it will retry on the next interval.
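To illustrate the difference, here is a minimal TypeScript sketch. The names `fetchDefinitions`, `updateSwallowing`, `updatePropagating` and this simplified `load` are hypothetical stand-ins rather than the Gateway's real internals; only the success message is taken from the description above.

```typescript
type Logger = { error(msg: string): void };

// Hypothetical stand-in for a fetch that fails (e.g. a GCS artifact is unavailable).
function fetchDefinitions(): Promise<string> {
  return Promise.reject(new Error("could not fetch storage secret"));
}

// Before: the error is logged and swallowed, so the caller cannot tell the update failed.
async function updateSwallowing(logger: Logger): Promise<void> {
  try {
    await fetchDefinitions();
  } catch (e) {
    logger.error(e instanceof Error ? e.message : String(e));
    return;
  }
}

// After: the error propagates and the caller decides what to do with it.
async function updatePropagating(): Promise<void> {
  await fetchDefinitions();
}

// Simplified caller: reports success whenever the update did not throw.
async function load(update: () => Promise<void>): Promise<string> {
  await update();
  return "Gateway successfully loaded schema";
}
```

With the swallowing variant, `load` resolves with the success message even though nothing was fetched. With the propagating variant, `load` rejects and the caller can `catch` the failure.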

Rely on `message` only if it's present, and fall back to serialization methods which might exist on the prototype otherwise (e.g. `toJSON`, `toString`). Also, switch to a single-parameter usage of the logging facility. While `console.log` supports multiple parameters, it's certainly possible that the logger will _not_, and will need that positional parameter for something else.
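A rough sketch of that fallback order, assuming a hypothetical helper named `describeError` and an illustrative one-argument `logger` (neither name nor the log wording is taken from the Gateway source):

```typescript
// Hypothetical helper: prefer `message`, then `toJSON`, then `toString`.
function describeError(e: unknown): string {
  if (typeof e === "object" && e !== null) {
    const message = (e as { message?: unknown }).message;
    if (typeof message === "string" && message.length > 0) {
      return message;
    }
    if (typeof (e as { toJSON?: unknown }).toJSON === "function") {
      // JSON.stringify invokes toJSON when it is present.
      return JSON.stringify(e);
    }
  }
  // String() uses the value's own toString, if any.
  return String(e);
}

// An illustrative logger that accepts exactly one argument.
const lines: string[] = [];
const logger = { error: (msg: string): void => { lines.push(msg); } };

// Build one string up front instead of passing the error as a second parameter.
logger.error("update failed: " + describeError(new Error("boom")));
```

Concatenating into a single string keeps the call portable across loggers whose second parameter means something other than "more things to print".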

While successful `updateComposition` invocations were working properly, failed invocations (including GCS access errors, network hiccups and general configuration mistakes) were cluttering the logs with warnings of unhandled promise rejections. While those unhandled rejections did include the actual error messages, this adjusts them to be caught and logged in a structured manner with our `logger`, sparing our logs from those unnecessarily verbose (and scary!) messages.
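The general pattern looks roughly like the following. This is a sketch under assumptions: `runUpdateSafely` and the log wording are hypothetical, and the real change lives inside the Gateway's own polling code.

```typescript
type Logger = { error(msg: string): void };

// Hypothetical wrapper for an update that is started from a timer callback.
// Attaching a catch handler means a rejected update never surfaces as an
// unhandled promise rejection; it is logged through the logger instead.
function runUpdateSafely(
  update: () => Promise<void>,
  logger: Logger,
): Promise<void> {
  return update().catch((e: unknown) => {
    const detail = e instanceof Error ? e.message : String(e);
    logger.error("schema update failed: " + detail);
  });
}
```

A timer callback would call `runUpdateSafely(...)` and ignore the returned promise, since it always resolves.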

Adjusting the typings only works for users of TypeScript. On the other hand, this is a one-time assertion which happens at instantiation time which could save a JavaScript developer from the pain of not knowing what's going on! There might be typing improvements to be had, but claiming it as an alternate approach to handling this isn't correct. The typings can still be improved, of course.

Sometimes, the error object which is being caught here is in fact not a misconfiguration of server options, but rather an error which was thrown in a promise chain. By appending the "Invalid options provided to ApolloServer" string to the error object's `message` property in this method that is called on _every_ request to the server (`runHttpQuery`), we're risking appending the same prefix to the same error over and over (i.e. "a: a: a: a: a: b"). While this prefix may have been true at one point, it is no longer possible to enforce in a world where the schema is obtained asynchronously.
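The accumulation is easy to reproduce in isolation. In this sketch `prefixInPlace` and `countPrefix` are hypothetical names used only for illustration; the prefix string is the one quoted above.

```typescript
const PREFIX = "Invalid options provided to ApolloServer: ";

// Problematic pattern: mutate the caught error every time it is handled.
function prefixInPlace(e: Error): Error {
  e.message = PREFIX + e.message;
  return e;
}

// Count how many times the prefix occurs in a message.
function countPrefix(message: string): number {
  return message.split(PREFIX).length - 1;
}

// One long-lived rejection (e.g. from a failed schema load) is observed by
// three separate requests, and each one prepends the prefix again.
const shared = new Error("schema failed to load");
for (let request = 0; request < 3; request++) {
  prefixInPlace(shared);
}
```

After the loop, `shared.message` carries the prefix three times, which is the "a: a: a: b" effect: the same error object is reused across requests, so any in-place decoration compounds.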

…veries. Previously, if the initial call to `load` failed, but its intention (fetching a federated schema configuration in managed federation) was eventually accomplished via a subsequent fetch (via `setInterval` polling), the `executor` would not be set. This resulted in a continued failure even if the `schema` was eventually set, since federated `schema`s require the Gateway's `executor` to pull off their much more complex (remote!) execution strategy! The solution was simple since `executor` was already present on the actual `ApolloGateway`, but that required exposing that property as a valid type to access from the interface that `ApolloGateway` implements: `GraphQLService`. I don't see why a `GraphQLService` wouldn't have an executor, so it seemed appropriate to add, particularly since our only `GraphQLService` is the `ApolloGateway` class itself.
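As a simplified sketch of the idea (not the actual `ApolloServer` code): `ServiceSketch` and `ServerSketch` are hypothetical types, and the assumption is that the executor is read from the service object itself so that it does not depend on `load` resolving.

```typescript
type Executor = (query: string) => string;

// Hypothetical, pared-down version of a service that exposes its executor.
interface ServiceSketch {
  executor: Executor;
  load(): Promise<{ schema: string }>;
}

class ServerSketch {
  executor?: Executor;
  schema?: string;
  ready: Promise<void>;

  constructor(service: ServiceSketch) {
    // Set the executor up front, independent of whether load() succeeds.
    this.executor = service.executor;
    this.ready = service.load().then(
      ({ schema }) => { this.schema = schema; },
      () => { /* initial load failed; a later poll may still supply a schema */ },
    );
  }
}
```

Even when the first `load` rejects, the server already holds an executor, so a schema that arrives later can be executed properly.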

In particular, this blocks any rejected promises which may come from the GCS load before they are returned to the `schemaDerivedData` promise (which is where the `initSchema` method assigns its result). Failure to do this results in the server middleware sending the error directly to the client.

…ils. Another approach to this would be to throw an error here, but this is our only opportunity to allow the server to recover after initial failure. On the one hand, this approach is more graceful, but on the other hand, perhaps we actually want initial failure (after timeouts elapse) to not bind the other middleware. Either way, the server doesn't just `process.exit` right now, and with certain non-Node.js environments, it may be worthwhile to operate in this mode.

Previously, when attempting to compose a schema from a downstream service in unmanaged mode, the unavailability of a service would not cause composition to fail. In cases where the remaining downstream services are still composable (e.g. they do not depend on the unavailable service and it does not depend on them), this could still render a valid, but unintentionally partial, schema. While a partial schema is in many ways fine, it will cause any client's queries against the now-missing part of the graph to suddenly fail validation, despite the fact that they may have been designed to fail gracefully during degradation of the service. Rather than simply logging errors with `console.error` in those conditions, we will now `throw` the errors. Thanks to changes in the upstream invokers' error handling (e.g. #3811), this `throw`-ing will now prevent unintentionally serving an incomplete graph.
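A minimal sketch of that behavior change, assuming a hypothetical `SdlResult` shape and a hypothetical `collectSdlOrThrow` helper (the real change is in the Gateway's service-loading code, and the error wording here is illustrative):

```typescript
// Hypothetical outcome of fetching one downstream service's SDL.
type SdlResult =
  | { name: string; sdl: string }
  | { name: string; error: string };

// Gather every service's SDL, and throw if any service could not be reached,
// rather than logging the failure and composing from whatever is left.
function collectSdlOrThrow(results: SdlResult[]): Record<string, string> {
  const sdlByService: Record<string, string> = {};
  const failures: string[] = [];
  for (const result of results) {
    if ("error" in result) {
      failures.push(`${result.name}: ${result.error}`);
    } else {
      sdlByService[result.name] = result.sdl;
    }
  }
  if (failures.length > 0) {
    throw new Error("Could not retrieve SDL for: " + failures.join("; "));
  }
  return sdlByService;
}
```

Throwing here leaves the decision to the caller, which can keep serving the previous complete schema instead of a silently partial one.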

…l/abernix/gateway-minor-qol-improvements gateway: Small improvements to error messages and behavior during updates. Apollo-Orig-Commit-AS: apollographql/apollo-server@2094947

abernix requested a review from trevor-scheer February 20, 2020 19:37

trevor-scheer approved these changes Feb 21, 2020

abernix force-pushed the abernix/gateway-minor-qol-improvements branch 4 times, most recently from 2efa066 to b8901dc Compare February 27, 2020 16:06

trevor-scheer changed the base branch from release-2.11.0 to release-2.12.0 March 3, 2020 19:26

abernix added 17 commits March 3, 2020 11:27

Adjust error message wording to be more concise.

827cba2

Move "previous" variables closer to where they are used.

7250bfe

No need to have them so far away from their usage and to even do these assignments at all when there are already escape routes that exit this code path.

Align error messages with consistent terminology and add clarity.

c1ee91e

* Refer to the thing we are processing consistently as "schema definitions".
* Also make them generally more factual.

logging: Raise severity from warn to error for what is certainly …

c3fb481

…an error.

typings: Remove any type!

846e59e

Use the same fetch response handling for the waterfall of GCS requests.

ee4c9ac

Overall, the success/failure behavior should be expected to be similar for all of these requests, as they all access the same GCS store.

Ensure the executor is still set on ApolloServer when load reje…

f878c7f

…cts. Because we want the actual executor to work when a schema might eventually become available, as it may when `onSchemaChange` hooks eventually succeed.

Expect the expected errors, rather than swallowing them.

4f2fbff

trevor-scheer force-pushed the abernix/gateway-minor-qol-improvements branch from be8e5a8 to 4f2fbff Compare March 3, 2020 19:27

trevor-scheer reviewed Mar 3, 2020

packages/apollo-gateway/src/__tests__/gateway/executor.test.ts (outdated, resolved)

Update packages/apollo-gateway/src/__tests__/gateway/executor.test.ts

03038e3

Co-Authored-By: Trevor Scheer <[email protected]>

trevor-scheer approved these changes Mar 4, 2020

packages/apollo-gateway/src/loadServicesFromStorage.ts (outdated, resolved)

packages/apollo-server-core/src/types.ts (resolved)

abernix and others added 2 commits March 3, 2020 17:07

Update packages/apollo-gateway/src/loadServicesFromStorage.ts

4d4ab5b

Co-Authored-By: Trevor Scheer <[email protected]>

Revert "Update packages/apollo-gateway/src/loadServicesFromStorage.ts"

0a50943

This reverts commit 4d4ab5b.

abernix and others added 3 commits March 4, 2020 07:45

Re-apply 4d4ab5b: apollo-gateway/src/loadServicesFromStorage.ts

554e20c

To correct the syntax error. Co-Authored-By: Trevor Scheer <[email protected]>

chore(deps): update dependency gatsby-theme-apollo-docs to v4.0.13 (#…

a75a366

…3856) Co-authored-by: WhiteSource Renovate <[email protected]>

Add comment to preserve concern about typings from @trevor-scheer.

81a72e2

Ref: #3811 (comment)

abernix added this to the Release 2.12.0 milestone Mar 6, 2020

abernix added 6 commits March 6, 2020 12:04

Change error message to not include word "lacks".

1400905

No particular reason, but I just didn't enjoy the previous wording (my own!).

Add more helpful error message and docs link to fetchApolloGcs.

5dc847a

Remove CHANGELOG.md annotations for pre-release version designators.

3d3ed9d

These have been released!

Merge remote-tracking branch 'origin/master' into release-2.12.0

0c55de9

Merge branch 'release-2.12.0' into abernix/gateway-minor-qol-improvem…

bba47c9

…ents

Add CHANGELOG.md for #3811.

264547c

abernix merged commit 2094947 into release-2.12.0 Mar 6, 2020

abernix mentioned this pull request Mar 6, 2020

Fail to compose when a service's federated SDL cannot be retrieved. #3867

Merged

abernix deleted the abernix/gateway-minor-qol-improvements branch March 16, 2020 19:18

abernix mentioned this pull request Mar 23, 2020

Apollo Gateway Error: Expected undefined to be a GraphQL schema. #3480

Closed

abernix added a commit to apollographql/federation that referenced this pull request Sep 4, 2020

Add CHANGELOG.md for apollographql/apollo-server#3811.

2dae9cd

Apollo-Orig-Commit-AS: apollographql/apollo-server@264547c

github-actions bot locked as resolved and limited conversation to collaborators Apr 21, 2023
