-
Notifications
You must be signed in to change notification settings - Fork 730
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Paginated Watchers for GraphQL Queries #3007
Conversation
👷 Deploy request for apollo-ios-docs pending review.Visit the deploys page to approve it
|
private let createPageQuery: CreatePageQuery | ||
private let nextPageTransform: (T?, T) -> T | ||
|
||
private var model: T? // 🚗 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i'm so sorry i couldn't help myself.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a great start on this! It looks really solid.
The biggest point of discussion here in my mind is if this is the right general approach. I think there are other viable approaches to handling this that we should consider also.
For instance, would some form of exposing merging strategies for paginated data in the cache be preferable. Allowing you to fetch a new page, but configure some function that, when new pages are fetched, actually writes them to the cache in a way that just updates an original query watcher's dependent keys.
There are a lot of additional complications there, because we would be departing from the concept of keying cached fields based on just the values of the field arguments passed. I'm not sure if this would make things better or open a can of worms that would cascade into a bunch of other bugs. But it is something worth further consideration.
A big part of the reason we haven't implemented pagination support is because there are so many things that can go wrong based on your specific use case. Having people build custom mechanisms for their specific situation has been easier.
This PR is a really great start to one approach to handling pagination. But any solution that we provide needs to be as stable and consistent as possible. This makes me think there is more work to be done here.
Ideally, any solution here would provide for a really lightweight way to do this for the most common use cases.
There needs to be some consideration for how to merge the list pages together. Your nextPageTransform
allows users to do this in-memory, though it doesn't support merging them in the cache and probably requires a lot of boiler plate code in most situations. I'd love to provide default handlers for common use cases.
For example a list of entities that can be uniquely identified by a cache key or identifier field and sorted by some field could probably have a pre-defined function for handling the next page transformation.
Chatted with Anthony on Friday, the 12th. We decided to try to investigate a way of managing this with a single watcher. Since then, investigated using a single watcher and updating its dependent keys. Take the following query: query($id: ID!, $after: String) {
hero(id: $id) {
id
name
friendsConnection(first: 2, after: $after) {
friends {
name
id
}
}
}
} Under that strategy, if we were to fetch a second page, we could add the dependent keys of the second page to the first watcher. However, the initial watcher would never pull the data for those keys because when we load a value from the store, we do so by passing in the operation type and its variables. While the nodes underneath Therefore, the cache would store them like so:
And while the values stored under the connections are simply reference pointers to the entities, there is currently no way to group these values together under the same key. With that in mind, there are a few ways to handle this (h/t to Anthony here, as this section is directly adapted from his messages):
My immediate thoughts are that each of these two paths have clear pros and cons: Multi-Watcher StrategyThe multi-watcher strategy -- implemented in this pull request -- would work. However, it's a bit of an opinionated outcome (and could be made a bit less opinionated with some tweaking, more on that later), since it currently operates above the level of Pros:
Cons:
Some of the cons of this strategy can be mitigated, specifically, operating over Single Watcher StrategyThe single watcher strategy, mentioned in the early part of this comment, could also work. It may require some changes to codegen (i.e., directive or similar), or a specific additive API to the cache to conditionally strip fields (a user could mark a parent field as a connective type, for example). This strategy also has its pros and cons: Pros:
Cons:
On the face of it, Single Watcher seems more complex to implement and comes some additional operations over the cache -- but has the massive benefit of operating over the level of |
ec9ab16
to
1492973
Compare
7d08204
to
7fa6f54
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for all the amazing work on this @Iron-Ham!
I think the overall approach with multiple query watchers is the right way to go, and the implementation of the logic to execute the queries, create new watchers, and handle the result data looks really good.
I do think there are still some overall design questions I'd like us to hash out more. I've summarized a lot of my thoughts here. Looking forward to talking with you about this more on Monday!
Result Handler Placement
Right now, the resultHandler
that actually gets called with a PaginatedOutput
when new data is fetched is only referred to and defined directly on the implementations of the RelayPaginationStrategy
and the OffsetPaginationStrategy
. There might be a reason for this that I'm not seeing, but I'm feeling like the placement of the result handler is confusing.
My initial assumption is that this would be on the GraphQLPaginatedQueryWatcher
itself, but when I didn't see it there, I looked at the PaginationStrategy
protocol and there was nothing about the resultHandler
there either. It's actually just an implementation detail of the actual implementations of PaginationStrategy
. This took me a long time to find, and it was really confusing to me how to even actually use the GraphQLPaginatedQueryWatcher
because of it.
It seems you could create a new PaginationStrategy
and not include a resultHandler
at all, rendering the entire thing mostly dysfunctional? Unless your PaginationStrategy
itself is hard-coded to do whatever you wanted with the result data, (which I would see as violation of separation of responsibilities) you need to write the resultHandler
code yourself. Was this intentional?
Would it be viable to move the resultHandler
so that it's part of the GraphQLPaginatedQueryWatcher
? I'd like to be able to do this:
let watcher = GraphQLPaginatedQueryWatcher(
client: client,
strategy: ...,
resultHandler: { result in
...
}
)
watcher.fetchMore()
It actually looks like maybe this was intended to work this way at some point and it changed? I'm guessing that because GraphQLPaginatedQueryWatcher
has a typealias ResultHandler
that is currently unused.
Initial Query Construction
It feels ergonomically awkward that we are creating queries in two different places. You have the initialQuery
parameter on the GraphQLPaginatedQueryWatcher
initializer, and also the NextPageStrategy
creating the rest of the queries. I'm wondering if we can encapsulate all of that in the NextPageStrategy
.
If we do that, I think we should rename that protocol from NextPageStrategy
to PaginatedQueryProvider
.
PaginationStrategy
There is a lot I'm not understanding about the design of PaginationStrategy
right now. It seems like a lot is going on in the implementation of onWatchResult()
in the actual implemented strategies that is currently repetitive. I'm feeling like much of that should either be in a protocol extension on PaginationStrategy
or in the GraphQLPaginatedQueryWatcher
itself.
As it stands now I don't think there is a reason for the PaginationStrategy
protocol to require the pageExtractionStrategy
, outputTransformer
or mergeStrategy
. These are only used in the implementation details of the two strategies we have implemented. I don't think we should actually be removing them, but rather that the usage of them is a commonality that should be refactored into a place where it is reused.
Page Data Merging & Data Transformation
I'm getting the feeling that the process of transforming individual page data, and then merging it together into the final PaginationResult
with it's output is too might be a bit redundant in most cases.
It seems like for the majority of cases, transforming the output data and merging the pages together could be done at the same time. If you just use the PassthroughDataTransformer
you can handle the logic for merging the Query.Data
from both queries in the mergePageResults(paginationResponse:)
function on the merge strategy and output whatever type of view model or other Output
you want from there.
I'm struggling to come up with a situation in which using a custom DataTransformer
to provide the intermediate models is really valuable or necessary. But I suspect I may be missing something there. I'm currently leaning towards suggesting that we remove the DataTransformer
completely, and just do what the PassthroughDataTransformer
is currently doing -- leaving all custom data transformation to the PaginationMergeStrategy
. If there is some use case I'm missing that makes the current approach preferable, we should discuss that.
But even if we do keep this, I'd like to tweak the APIs here so we make this feel more like an opt-in thing rather than an opt-out. PassthroughDataTransformer
should be the default and most straightforward way to do this, and you only need to provide additional things if you want to do something more advanced.
Thread Safety
I debated not even bringing this up. Part of me says that we shouldn't even really be handling this too much and let users deal with making sure their code is thread safe, but I'd like to consider it nonetheless.
Do we need to make the watchers
array (and probably some other things) atomic? We have an @Atomic
property wrapper we could probably use for some of this already.
What if the consumer ends up calling fetchMore()
from multiple threads in parallel? Should we be preventing new fetches from occurring while one is already happening, or queue them up so it does them serially? It's also possible that we fetch multiple different pages in parallel, but then we would need to make the tracking of currentPage
atomic and update it at the appropriate time to ensure that the next page is always calculated properly.
I think we should talk about what the desired behavior is here and ensure that we are accounting for it properly. I'm not opposed to us saying "This isn't thread safe, so don't do crazy stuff." but that should be well documented and the expected behaviors and limitations should be known.
public typealias Page = Strategy.NextPageConstructor.Page | ||
private typealias ResultHandler = (Result<GraphQLResult<Strategy.Query.Data>, Error>) -> Void | ||
|
||
private let client: any ApolloClientProtocol |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
GraphQLQueryWatcher
has always held a weak
reference to the client. I'm wondering if this should do the same. I don't imagine that this would create a retain cycle, as the retain cycle would be broken as soon as you call cancel()
.
Any thoughts on this? @Iron-Ham @calvincestari @BobaFetters
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this sounds reasonable
where OutputTransformer.Query == Query, OutputTransformer.Output == Output | ||
|
||
associatedtype NextPageConstructor: NextPageStrategy | ||
where NextPageConstructor.Page == PageExtractor.Page, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we might be able to use the new Swift protocol primary associated types to make this a bit cleaner?
import Foundation | ||
|
||
/// A pagination strategy to be used with Relay-style cursor based pagination. | ||
public class RelayPaginationStrategy< |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm wondering if we should specifically call reference "Relay" here or just rename it to something like CursorBasedPaginationStrategy
and have a note in the documentation indicating that this is set up to work specifically with the relay connection spec (https://relay.dev/graphql/connections.htm)?
Relay itself is a client, but the cursor based pagination spec is just a spec for GraphQL schemas and backends to use in order to be consumed by the Relay client.
Just concerned that people who don't use Relay (or aren't very familiar with it) will not know that they can use this strategy with a standard cursor based pagination. If you are using Apollo iOS, you are decidedly not using the Relay client. I don't want people to assume that this is tightly coupled to using relay.
@bignimbus What do you think about the naming and messaging here? Anyone else who's input we should get on this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good call-out here, thanks for the mention. I don't have a strong personal opinion here. In Apollo Client a similar utility is called relayStylePagination
. So there's precedent in case that helps inform the decision.
#endif | ||
|
||
/// The strategy by which we create the next query in a series of paginated queries. Gives full custom control over mapping a `Page` into a `Query`. | ||
public struct CustomNextPageStrategy<Page: Hashable, Query: GraphQLQuery>: NextPageStrategy { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure I see the value in including this at all. I think this goes for the CustomDataTransformer
and CustomPaginationMergeStrategy
as well. If people need to implement a custom implementation of any of these, you should just define your own types that conforms to the protocols.
I guess I see why it is ergonomically valuable to be able to use these to essentially turn the initialization of PaginationStrategy
into closure based parameters (as is shown in the unit tests for this), so I can be persuaded here. Let's discuss this more with the team.
If it makes sense, what I would like to include is an implementation of a NextPageStrategy
that does this for you given just a query type specifically for the cursor-based and offset + count pagination methods?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We discussed a bit on our call.
I think we could probably collapse both CustomNextPageStrategy
and the need for an initialQuery
into a new QueryProvider
protocol that takes in a Page?
as an argument (as opposed to the Page
that we take in now).
We could default to generating the initial query if the Page
is nil.
|
||
/// Convenience initializer | ||
/// - Parameter arrayKeyPath: A `KeyPath` over a `Query.Data` which identifies the array key within the `Query.Data`. | ||
public init(arrayKeyPath: KeyPath<Query.Data, [(some SelectionSet)?]?>) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't account for 2 things:
- Lists of scalar/custom scalars
- Nested lists (2D, 3D, etc.)
I'm wondering if it would work if we change this to:
public init(arrayKeyPath: KeyPath<Query.Data, some Collection>) {
Hey all, been following this PR with great interest as I definitely will be needing this soon. It looks like it’s in good shape with some outstanding code review comments and I wanted to ask if there was anything I could do to assist - even if it’s just opening a PR against it to resolve merge conflicts. Please let me know (I’ll be able to do this on company time :D). @Iron-Ham @AnthonyMDev |
@swizzlr for the time being, work on this PR will continue here: https://github.com/apollographql/apollo-ios-pagination Once the API hits a stable state, it may (or may not?) be merged directly into Apollo-iOS FWIW: we've been using this in production at GitHub since this PR was opened and it's working well for us |
Thanks @Iron-Ham for all your work on this, I have also been watching this PR's career with great interest. Question about how paginatedWatcher interacts with Relay style pagination and item insertion. When a query with relay style pagination results has a new item inserted in the list, this affects all pages, does the paginatedWatcher understand this and update the pages and queries? For Example Let's say you have [A,B,C,D,E] data which is fetchable via the relay-style paginated "SomeQuery" Given these two simplified paginated queries/results: If an item is inserted at the beginning of this list, while these queries are watched, the first watcher will fire its handler, lets say this new item is N the data is now [N,A,B,C,D,E] the state of our watchers/queries and returns becomes: Watcher 1: Watcher 2: Unless watcher 2 query is updated, we will lose the last item in the first query/watcher. Does the paginatedWatcher handle this case and update the endCursor in all queries? or is there a better way to handle this case? |
@matthewweldon That's a great question! Let's say that we've inserted Initial Page0 Watcher Return: When we paginate at this point, we will continue to paginate using our end-cursor, which is tied to object Let's say that the user pulls-to-refresh. If we call the At this point, if we were using a passthrough data transformer and a targeted pagination merge over a specific list, our resulting list in GraphQL would look something like this:
This is incorrect, but we can:
Truthfully, at GitHub we don't tend to use Apollo's generated models other than to bind to our own view models. Given that, we've abstracted away setting a pagination merge strategy in our codebase; we always use an indexed collection to merge all pages, which accomplishes both the correct position for these values as well as de-duping. Additionally, since we're not working with generated models, that means that we don't tend to use passthrough transforms. @matthewweldon I'm really glad you brought this up though -- because this is clearly a bug that we didn't run into in our usage. |
We've decided to move this over to it's own separate package which lives in this repo. Work is still ongoing on this feature, but any additional discussions and PRs should be made there instead. I'm hoping we can get this feature to a stable point and release it for consumer use in the coming weeks/months. We are continuing to work with @Iron-Ham who is leading the implementation efforts on this. Thanks! |
What are you trying to accomplish?
I'd like to have an easy way to watch a paginated query.
My frustrations with #3004 stemmed from one place: I wanted to have an easy manner of consumers to automate pagination; I didn't want client engineers writing a new
LocalCacheMutation
for each paginated query.So this PR introduces just that: A strategy for watching paginated queries.
closes #3004
Related: "[issue TBA] Better handling of paginated data" via #1701
closes #565
What approach did you choose and why?
I introduced a
GraphQLPaginatedQueryWatcher
, which can paginate over a query and watch the results for each page.Generally, it works by composing the results of several
GraphQLQueryWatcher
s. On instantiation of a newGraphQLPaginatedQueryWatcher
, we ask for one additional field over a standardGraphQLQueryWatcher
:strategy
: APaginationStrategy
which defines how we go about paginating a query.Let's break down the
PaginationStrategy
:PaginationStrategy
At its core, this protocol is composed of several other protocols:
PageExtractionStrategy
: A protocol which defines how we extract aPage
from a givenInput
. It is agnostic as to what aPage
is, and it is equally agnostic as to what anInput
is. AnInput
can be the query's response data, a set of variables, or any other value. At present, we have onePageExtractor
, purpose-built for Relay-style pagination.DataTransformer
: A protocol which transforms the query's response data into a givenOutput
. At present, there are two transformers implemented:PassthroughDataTransformer
: Simply allows the query's data to passthrough, unaltered.CustomDataTransformer
: Allows the user fine-grained control over their data. Using this transformer, the user can output their own custom objects.NextPageStrategy
: Constructs a newQuery
, given aPage
.PaginationMergeStrategy
: A protocol whose responsibility it is to take aPaginationDataResponse
, which contains all of theOutput
values of each page, and merges them into one validOutput
.These protocols are stored as local variables on the strategy, along with an array of
Page
objects and thecurrentPage
. Additionally, this protocol has three function requirements:onWatchResult
: The callback triggered by aGraphQLQueryWatcher
.canFetchNextPage
: Whether the strategy can fetch another page, given its own internal state.reset
: Resets the internal state of thePaginationStrategy
.Let's look at the three defined
PaginationMergeStrategy
s:CustomPaginationMergeStrategy
Allows the user fine-grain control over how information is merged. Intended to be used with custom objects.
SimplePaginationMergeStrategy
Naively merges all lists together.
TargetedPaginationMergeStrategy
Takes in a
KeyPath
to a list and specifically merges only that list together.Why take a multi-watcher approach?
I wrote about it here, but that's slightly out of date. Most of it remains valid, but fundamentally the reason boils down to the path of least resistance. I fully concede it's not the perfect solution. However, since both single-watcher and multi-watcher have pros and cons, it's a matter of weighing tradeoffs. I chose to go for the path that can be rapidly implemented.
Anything you want to highlight for special attention from reviewers?
I'd love feedback on this approach. I think it should work for most users and provides an escape hatch for users for whom the
Simple
strategy won't work.