-
Notifications
You must be signed in to change notification settings - Fork 2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[validation] Add "onError" option to allow for custom error handling behavior when performing validation #2074
Conversation
1eadb9f
to
cea8a25
Compare
@skevy Thanks for PR 👍
We definitely should make I also think you uncovered potential DoS vector, e.g.:
For very small query size (gzip will optimize all repetition) it will produce a huge number of same error but with a different location. At the same time, limiting the number of the possible error to 1 is too restrictive especially for public APIs that need to protect them self against DoS attacks but don't want to sacrifice DX too much. I think the correct solution would be to add
Also, it can be used in other situations, e.g. if validation is done in a separate thread (e.g. WebWorker) @skevy What do think? Will |
@IvanGoncharov thanks for the thoughtful answer! Yes, I’m happy to do a more flexible I think in any comms about this we shouldn’t recommend limiting the time of execution, as this blocks the event loop and as you said is a DoS vector. But certainly allowing more than one error to be thrown seems reasonable. I’ll update the PR today. |
Btw, regarding optimizing the |
be3051b
to
6632da6
Compare
@IvanGoncharov updated the PR accordingly. |
6632da6
to
4edebb5
Compare
…behavior when performing validation
4edebb5
to
953adaa
Compare
@skevy Thanks 👍 Look good. Since it's public API I want to take a couple more days to think about alternatives. |
@skevy Sorry for the delay. |
@skevy I plan to release |
We'll use the new `onError` callback which was introduced in `[email protected]` by graphql/graphql-js#2074 only if the `getErrors` method that existed prior to that addition no longer exists.
This PR makes the minimal number of changes necessary for the apollo-tooling's repository's tests to pass for the planned [email protected] release. Notable changes to implementations are as follows: an adjustment to account for the new onError callback introduced by graphql/graphql-js#2074, which deprecated the previously used ValidationContext.prototype.getErrors method. utilize the GraphQLEnumValueConfig type instead of GraphQLEnumValue (which lead to the removal of the name field, which isn't actually needed when building the values for the GraphQLEnumType constructor. Ref: graphql/graphql-js#2303 Co-authored-by: Trevor Scheer <[email protected]>
TL;DR: This PR adds an "onError" argument to the
validate
function that allows for custom error handling when performing validation. A user can use this in many ways -- for instance, they can use this callback to bail early, and prevent performing potentially expensive validation operations hundreds or thousands of times in environments where detailed error messages are less important.We should fix the root cause of the slowness that I describe below, however -- it's risky to have a utility like this enabled in production, given that it gets slower and slower the more types you have and the more errors you have in your query.
Read on for the story of why I decided to implement this. 😀
At Airbnb, we have a GraphQL server that acts as a gateway to many other GraphQL servers -- the schemas are stitched together from these downstream services to build a single schema that's used by our clients.
Today, in our staging environment, one of the downstream services published a broken schema that caused queries to the gateway to start failing with validation errors. When this happened, we saw the gateway's measured overhead (essentially the time it takes to do anything GraphQL-related that isn't fetching data from downstream services) jump by about 3000%:
After getting alerted on a this issue, I captured a trace. This is what I saw:
Zooming in a bit:
It turned out that the
suggestionList
utility, which uses the Levenshtein distance algorithm to build a list of suggestions for a given type or argument name, was being called many times as the validation visitor failed to find the types that were lost in the main schema due to the broken schema from the downstream service.To give an idea of scale -- we have a moderately large GraphQL schema with about ~3000 types, and many of the type names are very long, with an average length of 27 characters.
After some investigation (and building a repro case), it seemed like the options were to either fix the
suggestionList
algorithm to do something more efficient, or allow bailing out early. This PR is an implementation of the latter.At Airbnb, we disable introspection on our production GraphQL endpoint and point our internal GraphiQL instance at a production mirror that's only available internally. The idea with this PR is that we will make validation bail early so that we can take steps to make sure the validation time doesn't increase with the number of errors in the query.