Cursor based iteration #852

Closed
sj26 opened this issue Jul 13, 2023 · 6 comments
Comments

@sj26
Contributor

sj26 commented Jul 13, 2023

Maintenance Tasks has been amazing for processing backfills etc. across whole tables for us. But we often have tasks that need to operate on a subset of records, too. ActiveRecord batch enumeration is forced to walk by primary key, and that tends to plan and execute poorly on large tables when the filter conditions match fewer than roughly 50% of the rows.

I would love to use arbitrary cursor-based pagination, not restricted to the primary key. With the landing of string cursors (#339; amazing, thank you!) and job-iteration's support for Active Record cursor enumeration, is there appetite to pursue cursor-based and/or custom enumeration strategies? Can we contribute to these efforts?
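
For readers unfamiliar with the idea, here is a minimal pure-Ruby sketch of cursor (keyset) iteration ordered by a column other than the primary key. It is only an illustration, not how job-iteration or maintenance_tasks implement it: the `Row` struct, the `ROWS` array standing in for a table, and the `next_batch` and `cursor_enumerator` helpers are all hypothetical names.

```ruby
# Hypothetical in-memory stand-in for a table; this is not ActiveRecord.
Row = Struct.new(:id, :created_at)

ROWS = [
  Row.new(1, 30), Row.new(2, 10), Row.new(3, 20),
  Row.new(4, 10), Row.new(5, 40),
].freeze

# Return up to `limit` rows strictly after `cursor`, ordered by (created_at, id).
# The cursor is the [created_at, id] pair of the last row already processed,
# or nil when iteration has not started yet.
def next_batch(rows, cursor, limit)
  sorted = rows.sort_by { |r| [r.created_at, r.id] }
  sorted = sorted.select { |r| ([r.created_at, r.id] <=> cursor) == 1 } if cursor
  sorted.first(limit)
end

# Build an Enumerator that walks the rows batch by batch, remembering only
# the cursor between batches (which is what makes the walk resumable).
def cursor_enumerator(rows, batch_size: 2)
  Enumerator.new do |yielder|
    cursor = nil
    loop do
      batch = next_batch(rows, cursor, batch_size)
      break if batch.empty?

      batch.each { |row| yielder << row }
      last = batch.last
      cursor = [last.created_at, last.id]
    end
  end
end

VISITED_IDS = cursor_enumerator(ROWS).map(&:id)
```

In this sketch `VISITED_IDS` follows `created_at` order with `id` as a tie-breaker, so the rows come back as ids 2, 4, 3, 1, 5 rather than in primary-key order.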

@sj26
Contributor Author

sj26 commented Jul 18, 2023

(This might be related to #207.)

@etiennebarrie
Member

How do you think you would declare a collection to be iterated with cursor enumeration? Currently we have ActiveRecord::Relation => enumerator_builder.active_record_on_records => JobIteration::ActiveRecordEnumerator => JobIteration::ActiveRecordCursor
or
ActiveRecord::Batches::BatchEnumerator => enumerator_builder.active_record_on_batch_relations => JobIteration::ActiveRecordBatchEnumerator.

What kind of object would we return from collection? I guess we'd need to use a class method and the collection builder strategy to handle an ActiveRecord::Relation differently?


This issue has been marked as stale because it has not been commented on in two months.
Please reply in order to keep the issue open. Otherwise, it will close in 14 days.
Thank you for contributing!

@github-actions github-actions bot added the stale label Jan 25, 2024
@github-actions github-actions bot closed this as not planned Feb 9, 2024
@nvasilevski nvasilevski removed the stale label Feb 9, 2024
@nvasilevski
Contributor

I think this must have been addressed by #859.

Once the PR mentioned above gets released, maintenance tasks will be able to specify any column(s) to be used as a cursor. Note that the cursor column(s) should form a unique combination for job-iteration to work properly.
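
As a rough illustration of why uniqueness matters, the pure-Ruby sketch below resumes each batch strictly after the last cursor value seen, which is the general keyset approach and an assumption here rather than job-iteration's actual code. The `Record` struct, `RECORDS` data, and `ids_visited` helper are hypothetical names.

```ruby
# Hypothetical in-memory stand-in for a table; this is not ActiveRecord.
Record = Struct.new(:id, :created_at)

RECORDS = [
  Record.new(1, 10), Record.new(2, 10), Record.new(3, 10), Record.new(4, 20),
].freeze

# Walk `records` in batches, resuming strictly after the last cursor value.
# `columns` lists the attributes that together form the cursor.
def ids_visited(records, columns, batch_size: 2)
  key = ->(record) { columns.map { |column| record[column] } }
  sorted = records.sort_by(&key)
  visited = []
  cursor = nil

  loop do
    batch = sorted
      .select { |record| cursor.nil? || (key.call(record) <=> cursor) == 1 }
      .first(batch_size)
    break if batch.empty?

    visited.concat(batch.map(&:id))
    cursor = key.call(batch.last)
  end

  visited
end

NON_UNIQUE = ids_visited(RECORDS, [:created_at])      # cursor values repeat
UNIQUE     = ids_visited(RECORDS, [:created_at, :id]) # id breaks the ties
```

With the non-unique cursor, three records share the same `created_at`, so whichever one falls past the first batch boundary is never visited. Adding `id` to the cursor makes every value distinct and all four records are reached.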

@sj26 Let us know if that's something you were looking for! Thanks

@nvasilevski nvasilevski reopened this Feb 9, 2024
@gmcgibbon
Member

https://github.com/Shopify/maintenance_tasks/releases/tag/v2.6.0 includes that change.

Can this be closed?

@nvasilevski
Contributor

I think it's safe to close. Feel free to reopen if the released functionality is not what was initially requested. Thanks
