purge_old_records: add --max-offset option and some additional logging statements #169
r? - @rfk
I was working with the purge task in production to get it to run more effectively, and noticed a couple of things.
The default max-per-loop of 10 meant that each task was spending a large chunk of each cycle doing the select, and burning a lot of database CPU. So I'll be using a max-per-loop of ~1000 going forward.
Due to the current select for old_records, each purge task winds up working on the same set of old_records. So this PR adds a --max-offset option to the task. The idea is that if each instance's task picks a random offset < max_offset, then it will be unlikely to have any 2 instances working on the same set of records. I'm thinking of running something like max-per-loop=1000 and max-offset=100000. The offset will add some additional cost to each select, so I'll have to play with those numbers to get the right tradeoff. But one thing I'm unclear about: the current query does `order by replaced_at desc, uid desc`. Is there some importance to that ordering? I.e., if I'm skipping to the middle of that list with an offset, does that break some expectation?

This PR also adds a couple of logging statements I was interested in being able to see.
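
For reference, the random-offset idea can be sketched roughly as below. This is only an illustration, not the code in this PR: the helper names `pick_offset` and `build_select` are hypothetical, and the table name `old_records` is assumed from the wording above. Only the `order by replaced_at desc, uid desc` clause is taken from the existing query.

```python
import random


def pick_offset(max_offset, rng=random):
    """Return a random offset in [0, max_offset), or 0 if max_offset is 0."""
    if max_offset <= 0:
        return 0
    return rng.randrange(max_offset)


def build_select(max_per_loop, offset):
    # Hypothetical query shape: the table name and selected column are
    # assumptions; the ORDER BY matches the ordering quoted above.
    return (
        "SELECT uid FROM old_records "
        "ORDER BY replaced_at DESC, uid DESC "
        f"LIMIT {int(max_per_loop)} OFFSET {int(offset)}"
    )


if __name__ == "__main__":
    # Each instance would pick its own offset once per loop.
    print(build_select(1000, pick_offset(100000)))
```

With max-per-loop=1000 and max-offset=100000, two instances only overlap when their offsets land within 1000 rows of each other, which is why collisions should be uncommon.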
(Note: one gotcha with `--max-offset` is that if the number of purgeable rows is << max_offset, then an instance will usually pick an offset past the end of the list and purge no rows. I didn't try to defend against that in this PR. When the backlog is down to a manageable level I'll change max_offset to zero (or maybe 10000 to keep a little bit of scatter). Or maybe not change it at all: enough instances will get work to do by chance to keep the backlog under control.)
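
To make the gotcha concrete, here is a small self-contained sketch using an in-memory SQLite table. The schema and the function names are assumptions for illustration only, and the fallback to offset 0 is just one possible defence, not something this PR implements.

```python
import sqlite3


def fetch_batch(conn, max_per_loop, offset):
    """Select one batch of purgeable rows, skipping `offset` rows first."""
    return conn.execute(
        "SELECT uid FROM old_records "
        "ORDER BY replaced_at DESC, uid DESC LIMIT ? OFFSET ?",
        (max_per_loop, offset),
    ).fetchall()


def fetch_batch_with_fallback(conn, max_per_loop, offset):
    """Hypothetical defence: retry from offset 0 if the offset overshot."""
    rows = fetch_batch(conn, max_per_loop, offset)
    if not rows and offset > 0:
        rows = fetch_batch(conn, max_per_loop, 0)
    return rows


# Assumed minimal schema with only 50 purgeable rows.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE old_records (uid INTEGER PRIMARY KEY, replaced_at INTEGER)"
)
conn.executemany(
    "INSERT INTO old_records VALUES (?, ?)",
    [(i, 1000 + i) for i in range(50)],
)

print(len(fetch_batch(conn, 1000, 100000)))                # 0 rows: offset overshoots
print(len(fetch_batch_with_fallback(conn, 1000, 100000)))  # 50 rows after fallback
```

The fallback costs one extra select whenever the offset overshoots, and it sends every overshooting instance back to the same rows at the head of the list, so it trades scatter for progress.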