
WIP: Request batching #266

Closed
wants to merge 24 commits into from

Conversation

@Poincare (Contributor) commented Jun 4, 2016

Goal

Implement request batching so that multiple requests can be sent to the server in one round trip.

Currently implemented

It is possible to send queries to the server as a batch of requests (i.e. an array of request objects rather than a single request object) through QueryManager. In addition, there is a QueryScheduler that consumes a queue of queries on a fixed polling interval. It is implemented so that at most one query is in flight at a time per polling query (mentioned as an issue in #248).
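At the transport level, a batch is just an array-valued request body instead of a single request object. A minimal sketch of that shape (hypothetical names; the PR's actual types may differ):

```typescript
// Hypothetical shapes for illustration; the PR's real interfaces may differ.
interface GraphQLRequest {
  query: string;
  variables?: Record<string, any>;
}

// A batched transport sends one array-valued POST body instead of N bodies.
function buildBatchedBody(requests: GraphQLRequest[]): string {
  return JSON.stringify(requests);
}

const body = buildBatchedBody([
  { query: '{ author { firstName } }' },
  { query: '{ person { name } }', variables: { id: 5 } },
]);
```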

In progress

  • Hooking up the QueryScheduler to QueryManager so that instead of calling fetchQuery, queries are queued to the QueryScheduler which then consumes the queue on some interval.
  • Implementing timeouts for queries so that one query that takes too long within a polling query doesn't permanently pause that polling query.
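The timeout idea in the second bullet could be sketched with Promise.race; this is purely illustrative and not the PR's implementation:

```typescript
// Illustrative sketch: reject a slow query so a polling query is not
// permanently paused waiting on one in-flight request.
function withTimeout<T>(promise: Promise<T>, ms: number): Promise<T> {
  const timeout = new Promise<T>((_, reject) =>
    setTimeout(() => reject(new Error(`Query timed out after ${ms}ms`)), ms)
  );
  return Promise.race([promise, timeout]);
}
```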

Architecture

Before this PR, all of the query fetching logic lived in fetchQuery. This logic consisted of three steps:

  1. Transform a query and turn it into a string
  2. Pass that query to this.networkInterface
  3. In the then callback on the promise returned by this.networkInterface, store the results in the Redux store

This PR introduces a generalized version of fetchQuery that can operate over any NetworkInterface (see step 2 above) and calls it fetchQueryOverInterface. It also introduces a method fetchBatchedQueries to QueryManager that uses fetchQueryOverInterface in order to group together queries and send them over a particular networkInterface. Internally, fetchBatchedQueries calls batchedQuery on a network interface and lets the network interface handle the actual request formation. This allows us to batch arbitrary queries together.
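In that spirit, the generalized helper can be pictured as a function parameterized by the interface it talks to. The signatures below are assumptions for illustration, not the PR's exact code:

```typescript
// Assumed minimal shapes; the real apollo-client types differ in detail.
interface GQLRequest { query: string; variables?: object; }
interface GQLResult { data?: any; errors?: any[]; }

interface NetworkInterface {
  query(request: GQLRequest): Promise<GQLResult>;
  // the batched entry point the text calls batchedQuery (optional)
  batchedQuery?(requests: GQLRequest[]): Promise<GQLResult[]>;
}

// Steps 2 and 3 of the pipeline, generalized over any NetworkInterface.
function fetchQueryOverInterface(
  iface: NetworkInterface,
  request: GQLRequest
): Promise<GQLResult> {
  return iface.query(request).then((result) => {
    // step 3: the caller would dispatch the result into the Redux store here
    return result;
  });
}
```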

The time-dependent stuff lives inside another class called QueryScheduler which operates on a queue of queries that it may batch together if possible.
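The scheduler's core can be pictured as a queue drained on an interval. A sketch under assumed names (the real QueryScheduler has more to it):

```typescript
interface FetchRequest { query: string; }

// Sketch: queue queries, hand them off as one batch every `interval` ms.
class QuerySchedulerSketch {
  private queue: FetchRequest[] = [];

  constructor(
    private consume: (batch: FetchRequest[]) => void,
    private interval: number
  ) {}

  enqueue(request: FetchRequest): void {
    this.queue.push(request);
  }

  flush(): void {
    if (this.queue.length > 0) {
      this.consume(this.queue.splice(0, this.queue.length));
    }
  }

  start(): ReturnType<typeof setInterval> {
    return setInterval(() => this.flush(), this.interval);
  }
}
```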

Reasoning for Architecture

Had a quick conversation with @helfer that made me think more about the architectural decisions behind this PR. There are a couple of reasons why it makes sense to add fetchBatchedQueries to QueryManager and keep QueryScheduler separated:

  1. The decision of when to batch remains independent of everything else.

QueryManager exists as a mechanism to allow us to submit queries over the network interface. The polling interval on which we decide to batch queries may change depending on the polling queries currently in play. For example, if we used to have only 1 second polling queries and suddenly add three that query every 50ms, it doesn't make sense to batch on the same polling interval that we started out with. Since this information is naturally available to QueryManager (not the network interface, etc.) and QueryScheduler, it makes sense for them to have control as to when queries are batched together.
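For instance, picking the batch interval from the currently active polling queries might look like this (purely illustrative; the fallback value is an assumption):

```typescript
// Illustrative: batch at the fastest polling interval currently registered,
// falling back to a default when nothing is polling.
function pickBatchInterval(pollIntervals: number[], fallback = 1000): number {
  return pollIntervals.length > 0 ? Math.min(...pollIntervals) : fallback;
}

// After three 50ms pollers join a 1000ms poller, batching speeds up to 50ms.
pickBatchInterval([1000, 50, 50, 50]);
```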

  2. The behavior of batching will still feel "automatic".

The ApolloClient class will still be used to call query as before. The decision of whether or not to batch will be available to the application developer, but the actual batching will happen behind the scenes (i.e. the application developer shouldn't need to call fetchBatchedQueries - the batching will "just happen").

  3. Being able to batch queries independently of the "automatic batching" may prove useful in the future.

I can imagine instances where two related queries could be batched together and we have promises for both queries that may be resolved at two different times. If we expose batching independently of the polling mechanism, we could (in the future) expose a way to batch these queries as well.
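Exposed directly, that could look like a call which makes one round trip but hands back one promise per query, each resolvable independently. This is a hypothetical API sketch, not code from the PR:

```typescript
interface Req { query: string; }
interface Res { data: any; }

// Hypothetical: one network round trip, but each caller gets its own promise,
// so two related queries can be awaited and resolved at different points.
function batchFetch(
  requests: Req[],
  transport: (batch: Req[]) => Promise<Res[]>
): Promise<Res>[] {
  const shared = transport(requests);
  return requests.map((_, i) => shared.then((results) => results[i]));
}
```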

There are, however, downsides to using this approach that @helfer pointed out. Specifically, if we were to build a network interface that simply batched the queries together on some interval, then the QueryManager would be free of dealing with batching. Secondly, the network interface should be able to decide whether or not it can batch (e.g. batching makes less sense on WebSockets). However, this second point is resolved by the network interface hinting whether or not it will support batching.

Given the points above and the benefits we get by separating the batching and polling (pt. 1 definitely seems the most important to me), I think we should go for the architecture currently implemented. Thoughts?

TODO:

  • Update CHANGELOG.md with your change
  • Make sure all of the significant new logic is covered by tests
  • Rebase your changes on master so that they can be merged easily
  • Make sure all tests and linter rules pass

@zol zol added the in progress label Jun 4, 2016
@helfer (Contributor) commented Jun 6, 2016

I think keeping the polling scheduler separate from batching makes perfect sense. It would still be possible to do all the batching in the network interface in that case, even with "dumb" batching, i.e. merging root fields.

Since the network interface is the thing that talks to the server, it should be the place where we decide whether and how to batch. If the server supports batching over the transport, then that should be used. If not, the queries could be merged at the root field.

There are some advantages and some disadvantages to the root field merging type of batching, but I think we should implement it first, because it will take a while until servers can support any other batching meaningfully.
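"Dumb" batching by merging root fields means composing one document whose top-level fields are aliased per original query, so results can be unbundled afterwards. A toy string-level sketch (real query merging would operate on parsed GraphQL documents, not strings):

```typescript
// Toy sketch: merge single-root-field queries into one document via aliases.
// Real merging works on the AST and must also rename variables/fragments.
function mergeRootFields(rootFields: string[]): string {
  const aliased = rootFields.map((field, i) => `q${i}: ${field}`);
  return `{ ${aliased.join(' ')} }`;
}

mergeRootFields(['author { firstName }', 'person(id: 5) { name }']);
// → '{ q0: author { firstName } q1: person(id: 5) { name } }'
```

The server then answers with one result object keyed by q0, q1, ..., which the client splits back into per-query results.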

I think the query manager doesn't need to do the actual batching, but maybe it makes sense if it is able to submit multiple queries to the network interface at once by calling fetchManyQueries (I'm not sure about this). All network interfaces should have such a method, and then decide what to do with those queries depending on what types of batching (if any) they and the server support.

The network interface should not decide how long to queue queries if it does batching. That would be up to the query manager or the scheduler. The network interface should only batch queries that it receives in the same tick, be it through one call of fetchManyQueries or many calls of fetchQuery. fetchQuery should have an additional boolean argument batchable, which tells the network interface if queries can be batched.
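Same-tick batching as described could be sketched as follows: batchable calls made within one tick accumulate, and the flush happens in a microtask once the tick's synchronous work is done. Names here are assumptions:

```typescript
interface Req { query: string; }

// Sketch: collect all batchable fetchQuery calls from the same tick, send once.
class SameTickBatcher {
  private pending: Req[] = [];
  private scheduled = false;

  constructor(private send: (batch: Req[]) => void) {}

  fetchQuery(req: Req, batchable: boolean): void {
    if (!batchable) {
      this.send([req]); // non-batchable queries go out immediately
      return;
    }
    this.pending.push(req);
    if (!this.scheduled) {
      this.scheduled = true;
      // flush after the current tick's synchronous calls have all queued
      Promise.resolve().then(() => {
        this.scheduled = false;
        this.send(this.pending.splice(0, this.pending.length));
      });
    }
  }
}
```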

@Poincare (Contributor, Author) commented Jun 6, 2016

That makes a lot of sense to me, and it seems pretty close to the current design. So I guess the next steps are:

  • Implement batching by merging root fields
  • Change the existing stuff in QueryManager into fetchManyQueries
  • Add the batchable option to fetchQuery

queryId: string;
};

// QueryScheduler takes a operates on a queue of QueryFetchRequests. It polls and checks this queue
Review comment (Contributor): typo?

@Poincare (Contributor, Author) commented Jun 8, 2016

This PR is now at a point where query merging works correctly (i.e. you can take any network interface and turn it into a network interface that does query merging), the query scheduler class has most of the methods it needs, and fetchManyQueries has been implemented. The batched transport has also been implemented (i.e. it can send an array of requests rather than a single request to the server), and most of the code is covered by unit tests.

@helfer Do you think we could get this to merge quality, and then I'll implement the glue that makes the scheduler operate correctly? That way the changes to the base classes and methods (e.g. QueryScheduler and src/queryMerging.ts) can land without needing a rewrite of the code that actually makes QueryScheduler, QueryManager, etc. interact correctly.

@@ -283,7 +283,7 @@ export class QueryManager {
}

// Sends several queries batched together into one fetch over the transport.
public fetchBatchedQueries(fetchRequests: QueryFetchRequest[]): Promise<GraphQLResult[]> {
public fetchManyQueries(fetchRequests: QueryFetchRequest[]): Promise<GraphQLResult[]> {
// There are three types of promises used here.
// - fillPromise: resolved once we have transformed each of the queries in
Review comment (Contributor): 'transformed' is repeated by mistake, I think.

@@ -0,0 +1,137 @@
import {
Review comment (Contributor): I think QueryScheduler.ts would be a better name for this file.

@helfer (Contributor) commented Jun 10, 2016

Couple of things to consider (just doing a braindump here):

  • What happens with unused variables in batched queries? Do we check if they're used first? If not, can we translate the name back to the original name? Or should we tell people to turn off batching for debugging?
  • Does query merging happen if there's just one query? I.e. do variables still get renamed?
  • What happens when a field name or variable name includes __ (two underscores)? Is there a check for it?
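The double-underscore question arises because merged queries need namespaced field and variable names. The renaming scheme below is a guess for illustration only, not the PR's actual convention; it shows why a user-supplied name already containing __ could become ambiguous when translating names back:

```typescript
// Guessed renaming scheme for illustration: prefix each variable with a
// per-query namespace, using double underscores as the separator.
function namespaceVariable(queryIndex: number, name: string): string {
  return `__q${queryIndex}__${name}`;
}

namespaceVariable(0, 'id'); // → '__q0__id'
// A user variable named '__q0__id' in query 1 would collide on the way back,
// which is why a check for __ in incoming names would be needed.
```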

After an initial look, I think you should break out queryMerging into a separate PR. That will make this one much more manageable. Luckily, they're also logically separate.

This was referenced Jun 11, 2016
@Poincare (Contributor, Author) commented:
Thanks for the feedback @helfer. I'm closing this PR and replacing it with two separate PRs: #278 and #277. The latter seems to be closer to merge ready, the former requires some redesign as per our discussion.

@Poincare Poincare closed this Jun 11, 2016
@zol zol removed the in progress label Jun 11, 2016
@stubailo stubailo deleted the batching branch September 20, 2016 03:44
jbaxleyiii pushed a commit that referenced this pull request Oct 18, 2017
Remove wording about "just like above"
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Feb 2, 2023