Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cursor Based Pagination #300

Closed
lberezy opened this issue Nov 7, 2021 · 15 comments · Fixed by #754 or #822
Closed

Cursor Based Pagination #300

lberezy opened this issue Nov 7, 2021 · 15 comments · Fixed by #754 or #822
Assignees
Milestone

Comments

@lberezy
Copy link

lberezy commented Nov 7, 2021

It would be fantastic if SeaORM could support pagination in an efficient manner, namely by using a combination of first/after or last/before count/cursor combinations.

I see that the current implementation uses OFFSET and LIMIT to do pagination, which will work just fine for small tables, but quickly becomes an issue as the size of the table grows. Imagine that an OFFSET grows very large, then this becomes necessarily a lot of work for the database as the entire ordered result set containing OFFSET + LIMIT rows needs to be computed then OFFSET rows skipped over.

This topic has been written about many, many times.

A more efficient way of tackling this problem would be to use a cursor based pagination where the database can leverage its indexes to produce the ordered result set starting from the cursor. e.g. SELECT id, name FROM table WHERE table.id >= $after LIMIT $first;.

This feels like it might be a necessary approach for users wanting to use this ORM at anything resembling scale and would greatly assist implementation for systems that de facto use cursor based pagination (i.e., graphQL).

It's unlikely I'll be able to contribute a PR to implement this unfortunately, but figured I'd raise the issue for anyone else.

@tyt2y3
Copy link
Member

tyt2y3 commented Nov 7, 2021

Thank you for raising this. Yes this knowledge is helpful.

There isn't too many things SeaORM need to do to support cursor based pagination, because you can:

let mut finder = fruit::Entity::find()
    .filter(fruit::Column::key.gt(cursor))
    .order_by_asc(fruit::Column::key);
finder.query().limit(10);
let result = finder.all().await;

I could think of adding a first method:

finder.first(10).await;

which probably could be a nice bit of sugar.

@lberezy
Copy link
Author

lberezy commented Nov 7, 2021

Thanks for that and do agree that it's mostly simple.

Just a question about that example, doesn't finder.query().limit(10); require the use of sea_query directly? It would be very handy to have this sort of thing exposed to be used solely within SeaORM.

Edit: Can just do the following:

sea_orm::QuerySelect::query(&mut finder).limit(10);

@tyt2y3
Copy link
Member

tyt2y3 commented Nov 7, 2021

Yes, you are right. It kinda broke the fluent interface. But it shouldn't bother you, because no additional use is needed.

Anyway, I think adding first and last methods would be nice.

@tyt2y3 tyt2y3 added the good first issue Good for newcomers label Mar 31, 2022
@tyt2y3
Copy link
Member

tyt2y3 commented Mar 31, 2022

I think the implementation of first is pretty straight forward, just add a LIMIT clause.
However, last might need to flip all ORDER BY clauses, LIMIT it and return the resulting Vec in reverse order.

@Sytten
Copy link
Contributor

Sytten commented Mar 31, 2022

Cursor pagination is actually freaking hard to get right specially when you add ordering on arbitrary columns. I wrote a whole internal framework for Caido using sea-query and from memory there are around 10-12 different edge cases you need to consider when you do the full package: before, after, first, last, order (asc/desc). I actually need to write a blog post on that at some point but it is not trivial.

@arpancodes
Copy link

Just created this PR for the first function in the sea-query. Need some guidance for the last function... as I believe that is not possible unless we have a nested query.

@tyt2y3
Copy link
Member

tyt2y3 commented Apr 8, 2022

Thank you for the PR. I think Sytten's suggestion sounds good

However, last might need to flip all ORDER BY clauses, LIMIT it and return the resulting Vec in reverse order.

@Sytten
Copy link
Contributor

Sytten commented Apr 8, 2022

For the simple base cases (that is without special ordering & assuming an ID that is sortable):

  1. First and last cannot be set at the same time
  2. If After is defined: add condition table.id > cursor.id
  3. If Before is defined: add condition table.id < cursor.id
  4. If first is defined: order by ID ASC, limit first + 1
  5. If last is defined: order by ID DESC, limit last + 1
  6. Reverse the result if last is defined
  7. Set has_next_page to true if first is defined and len(result) > first, cut last element if that is the case
  8. Set has_previous_page to true if last is defined and len(result) > last, cut first element if that is the case (since it was reversed at step 6)

Happy to help if you want to add ordering in the mix but it becomes exponentially more difficult even with one parameter.

@billy1624
Copy link
Member

I imagine the API to be...

let mut cursor = fruit::Entity::find().cursor();

let res = cursor
    .after(1) // Filter by primary key > 1
    .first(100) // Sort by primary key in descending order
    .all(db);

let res = cursor
    .before((10, 12)) // Filter by composite primary key < (10, 12)
    .last(5) // Sort by composite primary key in ascending order
    .all(db);

To keep things simple, here I only allow primary key to be used as cursor and sort by primary key only.

@tyt2y3 tyt2y3 removed the good first issue Good for newcomers label May 15, 2022
@tyt2y3 tyt2y3 added this to the 0.9.0 milestone May 15, 2022
@billy1624 billy1624 moved this to Triage in SeaQL Dev Tracker May 18, 2022
@billy1624 billy1624 moved this from Triage to In Progress in SeaQL Dev Tracker May 19, 2022
@billy1624 billy1624 self-assigned this May 20, 2022
@billy1624 billy1624 mentioned this issue May 20, 2022
1 task
@billy1624
Copy link
Member

Hey everyone, please check #754

@billy1624 billy1624 moved this from In Progress to Review in SeaQL Dev Tracker May 20, 2022
@billy1624 billy1624 reopened this Jun 27, 2022
Repository owner moved this from Review to Triage in SeaQL Dev Tracker Jun 27, 2022
@billy1624 billy1624 linked a pull request Jun 27, 2022 that will close this issue
@billy1624 billy1624 moved this from Triage to In Progress in SeaQL Dev Tracker Jun 28, 2022
@billy1624 billy1624 moved this from In Progress to Review in SeaQL Dev Tracker Jun 28, 2022
@billy1624 billy1624 moved this from Review to Done in SeaQL Dev Tracker Jul 12, 2022
@jacob-pro
Copy link

jacob-pro commented Nov 6, 2022

I'm confused about how composite key pagination works.

For example lets say you have a table like:

ID Category
1 B
2 B
3 A
4 C
5 A
6 A
7 C
8 B
9 A

Now you want to build an API that allows sorting by category. Category isn't unique so in order to page you need to combine with the ID column into a composite key.

Lets query the first 3 results:

SELECT * FROM example
ORDER BY category ASC, id ASC
LIMIT 3

You get back results:

ID Category
3 A
5 A
6 A

Now we want to get the next page, according to the current implementation this creates a query like:

SELECT * FROM example
WHERE category > 'A' AND id > 6
ORDER BY category ASC, id ASC
LIMIT 3

But this is surely wrong - we would end up only getting two more results, missing out many inbetween

ID Category
8 B
7 C

Should we not actually have a query like:

SELECT * FROM example
WHERE (category = 'A' and id > 6) OR category > 'A'
ORDER BY category ASC, id ASC
LIMIT 3

which produces:

ID Category
9 A
1 B
2 B

@tyt2y3
Copy link
Member

tyt2y3 commented Nov 7, 2022

Thank you for the detailed report with example.

Can you open a new issue?

Also, what do you think would be the case as in composite key composed of 3 columns? I imagine by induction the first two columns would be "= or >" , or will there be more complications?

@tyt2y3 tyt2y3 changed the title Efficient Pagination Cursor Based Pagination Nov 7, 2022
@Sytten
Copy link
Contributor

Sytten commented Nov 7, 2022

Composite key become exponentially worse, I suggest taking a look at the algorithm prisma uses for cursor pagination in their rust engine. It supports quite a bit of things.

@jacob-pro
Copy link

@Sytten I am not an expert on this topic... but I was under the impression the performance was acceptable if used with a composite index?

https://www.postgresql.org/docs/current/indexes-multicolumn.html

https://docs.oracle.com/en/database/oracle/oracle-database/19/cncpt/indexes-and-index-organized-tables.html#GUID-ABE1DE2A-59CC-4ADE-86A5-426B16459464

@Sytten
Copy link
Contributor

Sytten commented Nov 7, 2022

Not harder in terms of computation on the database side, mostly in terms of implementing a generic algorithm for it. Even more so if you want to order said collection.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Archived in project
6 participants