Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[FEATURE REQ] improve spring-data-cosmos paging for non-trivial datasets #24262

Closed
2 tasks done
Blackbaud-MikeLueders opened this issue Sep 22, 2021 · 11 comments
Closed
2 tasks done
Assignees
Labels
azure-spring All azure-spring related issues azure-spring-cosmos Spring cosmos related issues. Client This issue points to a problem in the data-plane of the library. customer-reported Issues that are reported by GitHub users external to the Azure organization. feature-request This issue requires a new behavior in the product in order be resolved. needs-team-attention Workflow: This issue needs attention from Azure service team or SDK team
Milestone

Comments

@Blackbaud-MikeLueders
Copy link
Contributor

Is your feature request related to a problem? Please describe.
spring-data paging requires row counts in order to function.  Row counts in cosmos are generally expensive.  For datasets of non-trivial size (say, >100k), a count query which includes an order clause is orders of magnitude more expensive than the same query without the count clause.  In the spring-data-cosmos library, every 'paged' query also issues a count query in order to conform to the spring-data page model.  For collections of non-trivial size, this makes paged queries prohibitively expensive. For small datasets, this is not an issue and things work as you would expect.  For larger datasets, this makes paging unusable. 

Describe the solution you'd like
One option is to do away with the facade of paging in spring-data and lean into the continuation token for all operations. This would mean certain methods of org.springframework.data.domain.Page would simply fail - getTotalPages/getTotalElements/previous/etc - while hasNext would rely on the continuation token instead of total element calculation. There are other options like not failing but returning incorrect data, though I think failing would be better from a user perspective so as to not give the false illusion of functioning. In any case, the important part is the elimination of the count query.

This brings up two obvious issues. First, it breaks traditional paging for clients that are using spring-data-cosmos and do not have large datasets. Second, it is not backwards compatible with the existing implementation. To sidestep these issues, a configuration option could be introduced which would allow the client to select which paging strategy to use. The current functionality would remain the default but the user could opt into the continuation token strategy which would not require count queries on paged requests.

Thoughts? Am I missing something wrt to the count queries being extremely inefficient? Are there other approaches which should be considered? I'd like to get consensus on an approach before I begin work on a PR.

Additional context
To be clear, I realize I could use CosmosTemplate:sliceQuery to execute paged queries since no count query is executed as part of the slice method. However, this requires manually constructing the query and using the CosmosTemplate directly, as opposed to using dynamic query derivation via repository methods which accept a Pageable input. Using the CosmosTemplate for paged queries and the repository for all other queries is far from ideal.

Information Checklist
Kindly make sure that you have added all the following information above and checkoff the required fields otherwise we will treat the issuer as an incomplete report

  • Description Added
  • Expected solution specified
@ghost ghost added needs-triage Workflow: This is a new issue that needs to be triaged to the appropriate team. customer-reported Issues that are reported by GitHub users external to the Azure organization. question The issue doesn't require a change to the product in order to be resolved. Most issues start as that labels Sep 22, 2021
@joshfree joshfree added azure-spring All azure-spring related issues azure-spring-cosmos Spring cosmos related issues. Client This issue points to a problem in the data-plane of the library. labels Sep 22, 2021
@ghost ghost removed the needs-triage Workflow: This is a new issue that needs to be triaged to the appropriate team. label Sep 22, 2021
@joshfree joshfree added feature-request This issue requires a new behavior in the product in order be resolved. and removed question The issue doesn't require a change to the product in order to be resolved. Most issues start as that labels Sep 22, 2021
@ghost ghost added the needs-team-attention Workflow: This issue needs attention from Azure service team or SDK team label Sep 22, 2021
@joshfree
Copy link
Member

@stliu @kushagraThapar could you please take a look

@kushagraThapar
Copy link
Member

@Blackbaud-MikeLueders - let me discuss this issue within my team - Here is some general information that I found - https://docs.microsoft.com/en-us/azure/cosmos-db/sql/sql-query-pagination

@j82w - tagging you here, in case you have any ideas on how to implement pagination in our SDKs with cosmos queries in general.

@kushagraThapar
Copy link
Member

@Blackbaud-MikeLueders - We are still discussing this feature with our query team to make sure we cover every aspect of this.

Also, another limitation on continuation tokens is that they don't work for DISTINCT / GROUP BY queries, as mentioned here - https://docs.microsoft.com/en-us/azure/cosmos-db/sql/sql-query-pagination#continuation-tokens

@Blackbaud-MikeLueders
Copy link
Contributor Author

@kushagraThapar - No worries, thanks for the update. I think that's a reasonable limitation, especially since DISTINCT does work if you add an ORDER BY. Depending on the route we go, we're likely going to need to include documentation on how paging works with spring-data, so I think this will be fine as long as we clearly document the limitations and include workarounds for the DISTINCT/GROUP BY queries (e.g. using sliceQuery from CosmosTemplate).

The documentation specifically says "you cannot use continuation tokens" - do you know specifically what that means? Will a continuation token not be returned as a reply header on the initial query? So is it the case that the library would need to detect use of DISTINCT (without ORDER BY) and GROUP BY and disallow those queries if in 'continuation token' mode? Otherwise, it would 'work' in the sense that the first page of results would be returned but it would appear as there were no more results. Or is there another way to detect this?

@kushagraThapar
Copy link
Member

@Blackbaud-MikeLueders DISTINCT works with ORDER BY, it is mentioned in the above documentation that I linked.
Yes, regarding documentation, we should be able to well document our new pagination feature.

Regarding the continuation token, it will be returned from the initial page for the GROUP BY / DISTINCT query, however, SDK won't resume from that continuation token returned in the first page. It throws the error -

public static final String CONTINUATION_TOKEN_NOT_SUPPORTED_WITH_GROUP_BY = "Continuation token is not supported " +

@j82w
Copy link
Contributor

j82w commented Sep 30, 2021

One option is to do away with the facade of paging in spring-data and lean into the continuation token for all operations. This would mean certain methods of org.springframework.data.domain.Page would simply fail - getTotalPages/getTotalElements/previous/etc - while hasNext would rely on the continuation token instead of total element calculation. There are other options like not failing but returning incorrect data, though I think failing would be better from a user perspective so as to not give the false illusion of functioning. In any case, the important part is the elimination of the count query.

@Blackbaud-MikeLueders what if the getTotalPages/getTotalElements/etc is made to be lazy while making the hasNext depend on the the continuation token? This way it avoids the count query without causing a breaking change so most users will get the benefit without needing to opt-in via some option.

@Blackbaud-MikeLueders
Copy link
Contributor Author

@j82w yeah, i think that's probably a better approach.

We should still consider a configuration option for this - defaulting to lazy but configurable as eager. The reason being, some clients (such as my company) do not use the built-in autoscale mechanism of cosmos but instead rely on a proxy to the repository object which can be used to intercept method requests and scale up only after retry. We've found that the vast majority of throttle events do not require a scale up but just a single retry to succeed. If we were relying on the count, having the count operation part of the repository method call would be desirable since the retry/autoscale mechanism would kick in - executing the count lazily would require detecting the Page response and wrapping that in a separate proxy.

That said, I'm not going to fight for the configuration option since I can't think of a scenario I would use it (since I don't use paging in the first place) but I wanted to present an actual usage scenario for consideration, since there might be clients who rely on an eager count operation.

@j82w
Copy link
Contributor

j82w commented Oct 7, 2021

I agree with the configuration option.

@Blackbaud-MikeLueders
Copy link
Contributor Author

@kushagraThapar @j82w are we aligned on an approach? Should I begin working on a PR or are y'all still discussing?

@kushagraThapar
Copy link
Member

@Blackbaud-MikeLueders We are aligned, would be great to start the work and then we can continue the discussion on the PR.

@mlueders
Copy link
Contributor

I've closed the associated PR as I no longer believe the overall approach makes sense. Rather than use org.springframework.data.domain.Page, use org.springframework.data.domain.Slice which exposes no page number and is a much better fit for the cosmos model.

Repository owner moved this from Todo to Done in Spring Cloud Azure Mar 24, 2022
@github-actions github-actions bot locked and limited conversation to collaborators Apr 11, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
azure-spring All azure-spring related issues azure-spring-cosmos Spring cosmos related issues. Client This issue points to a problem in the data-plane of the library. customer-reported Issues that are reported by GitHub users external to the Azure organization. feature-request This issue requires a new behavior in the product in order be resolved. needs-team-attention Workflow: This issue needs attention from Azure service team or SDK team
Projects
Archived in project
Development

Successfully merging a pull request may close this issue.

7 participants