Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ordering of paginated /v3/roles is inconsistent #2249

Closed
sweinstein22 opened this issue Apr 27, 2021 · 6 comments
Closed

Ordering of paginated /v3/roles is inconsistent #2249

sweinstein22 opened this issue Apr 27, 2021 · 6 comments

Comments

@sweinstein22
Copy link
Contributor

sweinstein22 commented Apr 27, 2021

Thanks for submitting an issue to cloud_controller_ng. We are always trying to improve! To help us, please fill out the following template.

Issue

When paging through /v3/roles, users have found inconsistencies in the results returned.

Context

The v3 roles endpoint is backed by a join across several tables, so the default pagination behavior of falling back on the id doesn't paginate consistently as ids can be duplicated.

Another thing to consider is that CC only stores timestamps to the second, so if roles are created within the same second ordering by created_ats may result in inconsistent ordering as well. This is unavoidable unless CC starts storing more granular timestamps, but we should be aware of this edge case when considering solutions.

From Felisia Martini: I think we saw a similar issue happening for service credential bindings during development. That is also an endpoint that joins two tables and the default pagination by id (which was also overriding the user defined order by) is not good enough because the same id will exist in several tables. We did however rely on created_at as a more deterministic ordering parameter, but looking at this issue that might have been just a temporary fix. Anyways, this is what we did: 309e311 and the different binding tables are joined in a View in this case. Also, when checking SQL statement we noticed that any ordering applied in the view will be overriden by the paginator order_by later.

Steps to Reproduce

  • Create a foundation with more than a given per_page query param
  • Query for pages of roles, note that the results on each page are not consistent and may contain duplicates

for the created_ats concern:

  • Create multiple roles within the same second
  • See that the ordering of those roles in results is inconsistent

Expected result

Consistent default pagination order (when not using an order_by query parameter) for the /v3/roles endpoint

Current result

The intent that surfaced this bug was to compile a complete list of all the roles, but inconsistent ordering inhibits that list creation. For example if two roles are created within the same second, when curling page 1 role A may be on page 1 and role B on page 2, but when curling page 2 entry A and B may swap order and role A would be seen again on page 2. The result being role A would appear twice in the compiled list, and role B wouldn't appear at all.

Results come back in varying order, the following are two separate runs of a script against the same CAPI:

cf curl /v3/roles?order_by=%2Bcreated_at&page=1&per_page=3
Number of pages 8
Number of results 23
cf curl /v3/roles?order_by=%2Bcreated_at&page=2&per_page=3
cf curl /v3/roles?order_by=%2Bcreated_at&page=3&per_page=3
cf curl /v3/roles?order_by=%2Bcreated_at&page=4&per_page=3
cf curl /v3/roles?order_by=%2Bcreated_at&page=5&per_page=3
cf curl /v3/roles?order_by=%2Bcreated_at&page=6&per_page=3
cf curl /v3/roles?order_by=%2Bcreated_at&page=7&per_page=3
cf curl /v3/roles?order_by=%2Bcreated_at&page=8&per_page=3
Total roles  23
Unique roles  22
cf curl /v3/roles?order_by=%2Bcreated_at&page=1&per_page=3
Number of pages 8
Number of results 23
cf curl /v3/roles?order_by=%2Bcreated_at&page=2&per_page=3
cf curl /v3/roles?order_by=%2Bcreated_at&page=3&per_page=3
cf curl /v3/roles?order_by=%2Bcreated_at&page=4&per_page=3
cf curl /v3/roles?order_by=%2Bcreated_at&page=5&per_page=3
cf curl /v3/roles?order_by=%2Bcreated_at&page=6&per_page=3
cf curl /v3/roles?order_by=%2Bcreated_at&page=7&per_page=3
cf curl /v3/roles?order_by=%2Bcreated_at&page=8&per_page=3
Total roles  23
Unique roles  20

Possible Fix

As a first step we could consider implementing something along the lines of what was done for service credential bindings. This doesn't address the issue of having multiple entries with the same created_ats timestamps, but at least should reduce some volatility. We could also explore other ways to order, by guid or some other field that is consistently unique.

@sethboyles
Copy link
Member

It's probably not a bad idea to always use guid as a secondary order field to ensure consistency, no matter the primary field being sorted on

@reidmit
Copy link
Contributor

reidmit commented Apr 29, 2021

For some context, I believe this line was an attempt to deal with this when we implemented it, but I guess it didn't work as we intended (or we missed something):

params['order_by'] ||= 'created_at'

We wanted to default ordering of v3 roles to be by created_at rather than id.

@klahti-pivotal
Copy link

Seeing similar issues when running this call to generate a report. Environments with more than 5000 roles must use pagination in order to process all so a consistent ordering is necessary to accurately collect this data.

@JenGoldstrich
Copy link
Contributor

Sarah and I pushed the start of a solution to WIP-order-on-guid, just to document where that work is. We have implemented the fix and have some test clean up and broken roles tests to fix.

JenGoldstrich pushed a commit that referenced this issue May 11, 2021
This change resolves an issue with inconsistent ordering that CAPI users
were experiencing when resources were created within the same second as
each other.  Resources that have a GUID field in the CAPI DB will now
order by GUID as a secondary order parameter, which should cause the
ordering of resources to be consistent.

#2249

Co-authored-by: Jenna Goldstrich <[email protected]>
Co-authored-by: Sarah Weinstein <[email protected]>
@sweinstein22
Copy link
Contributor Author

We have run some performance testing before and after making the change. We ran time curl /v3/apps?per_page=5000 against a boshlite with 11,000 apps, created via the boshlite's ruby console. It does not seem to have caused a significant change in performance.

Before change:

real    0m23.878s
user    0m5.186s
sys     0m0.563s

After change:

real    0m24.779s
user    0m5.113s
sys     0m0.531s

@tjvman
Copy link
Contributor

tjvman commented May 25, 2021

Fix shipped in capi-release 1.111.0.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

8 participants