Refactor selection set implementation to be immutable #2387
The current (before this commit) implementation of `SelectionSet` is inherently mutable: its `SelectionSet.add` method mutates the selection set it's called on. This is at odds with the general query planner algorithm, which is the main user of `SelectionSet` but is fundamentally based on dealing with immutable data (as we explore various "path" options and the possible plans, most things are shared, at least partly).

This mismatch between a query planner that really wants immutable selection sets and an implementation that is mutable leads to less than optimal code complexity and performance. More specifically, if we want to use a `SelectionSet` in 2 separate branches of the algorithm, we usually have to do defensive deep copies, which is inefficient. And to mitigate those inefficiencies, we sometimes don't do those copies, but this leads to code that is fragile and easy to break, and it has led to bugs previously (where we saved ourselves a copy, but then one branch mistakenly mutated the selection set and thus impacted another branch). A few tricks had been added to the implementation to help mitigate the risks (the `freeze` behaviour, and the copy-on-write in `FetchGroup`), but they add complexity and aren't always optimal performance-wise.

This commit thus reworks the implementation/API of `SelectionSet` to make it an immutable data structure. Doing so makes the code a lot safer and easier to reason about. It also provides some performance benefits: compared to current `main`, over a set of production schemas/queries, this seems to provide around a 20% improvement on average and up to almost 50% improvement in some specific cases for the computation of query plans (disclaimer: those benchmarks are fairly unscientific, so those numbers should be taken with a grain of salt, but they are clear enough that this is a net measurable gain).
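To make the trade-off concrete, here is a minimal sketch of what "immutable" buys when one selection set feeds two branches of an algorithm. The `ImmutableSelectionSet` class, its `withAdded` method, and the `Selection` shape are hypothetical and heavily simplified; they are not the API introduced by this PR, only an illustration of the idea.

```typescript
// Hypothetical, simplified model: NOT the actual federation `SelectionSet` API.
type Selection = {
  readonly name: string;
  readonly subSelections?: ImmutableSelectionSet;
};

class ImmutableSelectionSet {
  // Keyed by field name; the map is never modified after construction.
  private constructor(private readonly byName: ReadonlyMap<string, Selection>) {}

  static empty(): ImmutableSelectionSet {
    return new ImmutableSelectionSet(new Map<string, Selection>());
  }

  // Returns a NEW set and leaves `this` untouched. Existing `Selection`
  // objects are reused by reference, so no deep copy is needed.
  withAdded(selection: Selection): ImmutableSelectionSet {
    const next = new Map<string, Selection>(this.byName);
    next.set(selection.name, selection);
    return new ImmutableSelectionSet(next);
  }

  get(name: string): Selection | undefined {
    return this.byName.get(name);
  }

  names(): string[] {
    return Array.from(this.byName.keys());
  }
}

// Two branches start from the same set; neither can affect the other.
const shared = ImmutableSelectionSet.empty().withAdded({ name: 'id' });
const branchA = shared.withAdded({ name: 'a' });
const branchB = shared.withAdded({ name: 'b' });
```

With a mutable `add`, keeping `shared` intact would have required copying it before each branch; here the copy is a shallow one made only when something is actually added, and the `id` selection object is shared by all three sets.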
Review still in progress
Co-authored-by: Trevor Scheer <[email protected]>
@@ -672,10 +631,14 @@ class FetchGroup {
    readonly rootKind: SchemaRootKind,
    readonly parentType: CompositeType,
    readonly isEntityFetch: boolean,
    private _selection: LazySelectionSet,
    private _inputs?: LazySelectionSet,
    private _selection: MutableSelectionSet<{ conditions: Conditions}>,
It's probably past time to change this to named parameters, especially when there are so many optionals.
It's a private constructor, and the public methods that call it do use named parameters. Moving to named parameters here means separately declaring all those fields and having trivial assignments for each. That'd be fine, but it's a lot of noise and I like it the way it is.
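For readers less familiar with the pattern being discussed, here is a small sketch of it. The `Group` class and its fields are hypothetical stand-ins, not the actual `FetchGroup` code: a private constructor uses TypeScript parameter properties (so fields are declared and assigned in one place), while a public static factory exposes named parameters.

```typescript
// Hypothetical class for illustration; not the actual `FetchGroup` code.
class Group {
  // Parameter properties: each `readonly`/`private` parameter both declares
  // and assigns the matching field, so no separate declarations are needed.
  private constructor(
    readonly rootKind: string,
    readonly isEntityFetch: boolean,
    private _label?: string,
  ) {}

  // Public entry point with named parameters; callers never use the
  // positional constructor directly.
  static create({ rootKind, isEntityFetch = false, label }: {
    rootKind: string,
    isEntityFetch?: boolean,
    label?: string,
  }): Group {
    return new Group(rootKind, isEntityFetch, label);
  }

  get label(): string {
    return this._label ?? '(unnamed)';
  }
}

const group = Group.create({ rootKind: 'query', label: 'root' });
```

Switching the constructor itself to a single options object would mean declaring `rootKind`, `isEntityFetch`, and `_label` as fields and assigning each one by hand, which is the "noise" the comment refers to.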
static of(selectionSet: SelectionSet): MutableSelectionSet {
Is this used?
It's indeed not used anymore, but I'd like to keep it, both for symmetry with `empty` and because it makes sense API-wise and has a high chance of being used soon enough (and even if it isn't, it's hardly a maintenance burden).
Outside of minor typos/updates, the bulk of this change is switching how we collect used variables to be more efficient/avoid generating useless garbage.
It turns out that we have had (since fed 2.0 at least, maybe earlier) a slightly weird behaviour (a bug, really) in the validation of selection sets, which impacts what we accept for the `fields` arg of `@key`, `@requires` and `@provides`. Namely, assuming a field `t` returning an object type, we were accepting a selection like:
```
{
  t { a b }
  t
}
```
even though the 2nd occurrence of `t` is kind of incorrect according to the GraphQL spec, since `t` should always have a sub-selection. The reason it works is that the code was merging the 1st and 2nd occurrences of `t` before any validation was run, so internally the selection is handled as just:
```
{
  t { a b }
}
```
Now, an assertion was added in apollographql#2387 that is triggered by the example above, and that means that some `@key`, `@provides` or `@requires` that were accepted (and were mostly working correctly) in currently released versions would start erroring in 2.4 because of this.

To be clear, the historical behaviour is kind of wrong here, and we should consider fixing it at some point. However, hard-failing on upgrades is not very nice: we should probably introduce a warning for a few versions before genuinely making this an error. Further, the current assertion does not provide a very user-friendly message. In the meantime, this PR restores the status quo.
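The order-of-operations effect described above can be sketched as follows. Everything here (`FieldNode`, `mergeFields`, `fieldsMissingSubSelection`) is a hypothetical, simplified model and not the federation code; it only shows why validating after merging hides the sub-selection-less `t`.

```typescript
// Hypothetical, simplified model of a parsed selection; not the federation code.
type FieldNode = { name: string; selections?: FieldNode[] };

// Merge same-named fields, concatenating their sub-selections (if any).
function mergeFields(nodes: FieldNode[]): FieldNode[] {
  const merged = new Map<string, FieldNode>();
  for (const node of nodes) {
    const existing = merged.get(node.name);
    if (!existing) {
      merged.set(node.name, {
        name: node.name,
        selections: node.selections ? mergeFields(node.selections) : undefined,
      });
    } else if (node.selections) {
      existing.selections = mergeFields([...(existing.selections ?? []), ...node.selections]);
    }
    // A duplicate without sub-selection (the 2nd `t`) simply disappears here.
  }
  return Array.from(merged.values());
}

// Validation rule: a field of composite type must have a sub-selection.
// Returns the names of the offending fields.
function fieldsMissingSubSelection(nodes: FieldNode[], composite: Set<string>): string[] {
  const errors: string[] = [];
  for (const node of nodes) {
    if (composite.has(node.name) && !node.selections) {
      errors.push(node.name);
    }
    if (node.selections) {
      errors.push(...fieldsMissingSubSelection(node.selections, composite));
    }
  }
  return errors;
}

// The `{ t { a b } t }` selection, with `t` assumed to return an object type.
const parsed: FieldNode[] = [
  { name: 't', selections: [{ name: 'a' }, { name: 'b' }] },
  { name: 't' },
];
const composite = new Set<string>(['t']);

const beforeMerge = fieldsMissingSubSelection(parsed, composite); // flags the 2nd `t`
const afterMerge = fieldsMissingSubSelection(mergeFields(parsed), composite); // flags nothing
```

In this model, an assertion that runs on the unmerged nodes rejects the selection, while validation that runs after merging accepts it, which matches the change in behaviour the text describes.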
The previously committed apollographql#2713 fixed an issue introduced by apollographql#2387, ensuring that querying the same field with different directive applications was not merged, similar to what was/is done for fragments. But the exact behaviour slightly differs between fields and fragments when it comes to `@defer`: for fragments, we never merge 2 similar fragments where both have `@defer`, whereas we do merge in that case for fields. Or to put it more concretely, in the following query:
```graphql
query Test($skipField: Boolean!) {
  x {
    ... on X @defer {
      a
    }
    ... on X @defer {
      b
    }
  }
}
```
the 2 `... on X @defer` are not merged, resulting in 2 deferred sections that can run in parallel. But following apollographql#2713, the query:
```graphql
query Test($skipField: Boolean!) {
  x @defer {
    a
  }
  x @defer {
    b
  }
}
```
_will_ merge both `x @defer`, resulting in a single deferred section.

This fix (#2720) changes that latter behaviour so that the 2 `x @defer` are not merged and result in 2 deferred sections, consistent with both 1) the case of fragments and 2) the behaviour prior to apollographql#2387.
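One way to picture the merging rule is as a "merge key" computed per selection: selections with the same key are merged, and anything carrying `@defer` gets a key of its own. The sketch below is an assumption-laden illustration of that idea; the `FieldSelection` shape, the key format, and `mergeSelections` are hypothetical and do not reflect how the federation code actually implements it.

```typescript
// Hypothetical model; names and key format are assumptions, not the federation implementation.
type Directive = { name: string; args?: string };
type FieldSelection = { name: string; directives: Directive[]; subFields: string[] };

function mergeSelections(fields: FieldSelection[]): FieldSelection[] {
  const byKey = new Map<string, FieldSelection>();
  let deferIndex = 0;
  for (const field of fields) {
    let key: string;
    if (field.directives.some((d) => d.name === 'defer')) {
      // Deferred selections each get a unique key, so they are never merged.
      key = `${field.name}@defer#${deferIndex++}`;
    } else {
      // Otherwise the key is the field name plus its (sorted) directive applications,
      // so the same field with different directives is kept separate.
      const directives = field.directives
        .map((d) => `@${d.name}(${d.args ?? ''})`)
        .sort()
        .join('');
      key = field.name + directives;
    }
    const existing = byKey.get(key);
    if (existing) {
      byKey.set(key, { ...existing, subFields: [...existing.subFields, ...field.subFields] });
    } else {
      byKey.set(key, field);
    }
  }
  return Array.from(byKey.values());
}

// Two `x @defer` stay separate: 2 deferred sections.
const deferred = mergeSelections([
  { name: 'x', directives: [{ name: 'defer' }], subFields: ['a'] },
  { name: 'x', directives: [{ name: 'defer' }], subFields: ['b'] },
]);

// Two plain `x` are merged into a single selection of `a` and `b`.
const plain = mergeSelections([
  { name: 'x', directives: [], subFields: ['a'] },
  { name: 'x', directives: [], subFields: ['b'] },
]);
```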