-
Hmm, forgive me if I am misunderstanding here. I would design this slightly differently, because I don't think the queue should be "holding" jobs for some later logic; it should be performing them asap. GoodJob also doesn't have any batch functionality right now; it is entirely ActiveJob semantics for performing jobs. I would design this with something like:

```ruby
# When you want a user to be reindexed
some_user.touch(:needs_to_be_indexed_at)

class BatchUserIndexJob < ApplicationJob
  def perform
    return if User.where.not(needs_to_be_indexed_at: nil).count < 10

    User.where.not(needs_to_be_indexed_at: nil).first(10).each do |user|
      UserIndexJob.perform_later(user) # <= queue it, or maybe just do it right here in the loop?
      user.update(needs_to_be_indexed_at: nil)
    end
  end
end
```

...and then queue `BatchUserIndexJob` every 30 seconds using GoodJob's cron or some external cron system.
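For the scheduling step, a sketch of what the GoodJob cron configuration might look like, assuming the cron feature is enabled (fugit, which GoodJob uses to parse cron expressions, accepts an optional leading seconds field; the `batch_user_index` key is an arbitrary name for the schedule entry):

```ruby
# config/initializers/good_job.rb -- sketch, assuming GoodJob's cron feature
Rails.application.configure do
  config.good_job.enable_cron = true
  config.good_job.cron = {
    batch_user_index: {         # arbitrary key naming this schedule entry
      cron: "*/30 * * * * *",   # fugit extended syntax: every 30 seconds
      class: "BatchUserIndexJob"
    }
  }
end
```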
-
The issue with using a cron is that you need to add a field to each of the tables you want to handle this way, and it needs to be indexed as well so your queries are not too slow. It will also have problems efficiently processing a large number of jobs: for example, if we re-index 300,000 users, there will be one job processing them in batch, then 2 jobs after 30 seconds, then 3 jobs after 60 seconds… With other queuing systems, you "simply" reserve X jobs, process them all at once when you either hit the count limit or the configured timeout, and then mark them all as done. I also understand it can require moving too far away from ActiveJob semantics (a…
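The "reserve X jobs, process at the count limit or the timeout" pattern can be sketched in plain Ruby. Everything here is illustrative, not a `good_job` API: a plain Ruby `Queue` stands in for a real job backend, and `BatchConsumer` with its method names is hypothetical.

```ruby
# Illustrative sketch of the "reserve up to N jobs or wait until a timeout"
# pattern. A plain Ruby Queue stands in for a real job backend.
class BatchConsumer
  def initialize(queue, batch_size:, timeout:)
    @queue = queue
    @batch_size = batch_size
    @timeout = timeout
  end

  # Reserve up to @batch_size jobs, returning early with a smaller batch
  # once the timeout elapses.
  def reserve
    batch = []
    deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + @timeout
    while batch.size < @batch_size
      break if Process.clock_gettime(Process::CLOCK_MONOTONIC) >= deadline
      begin
        batch << @queue.pop(true) # non-blocking pop; raises ThreadError when empty
      rescue ThreadError
        sleep(0.01) # queue is empty: wait briefly and re-check until the deadline
      end
    end
    batch
  end

  # Reserve a batch and process it; if processing fails, push every job
  # back onto the queue so it can be reprocessed later.
  def consume
    batch = reserve
    return [] if batch.empty?
    yield batch
    batch
  rescue StandardError
    batch.each { |job| @queue << job }
    []
  end
end
```

In a real backend the reservation step would also have to lock the reserved jobs so that concurrent workers don't pick them up twice.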
-
We are currently considering switching to `good_job` for our job processing, and I am wondering if consuming jobs in batches is doable.

The use case: we have multiple jobs that could benefit from batching their writes to a storage backend (events to an analytics store, re-indexing of some records in our search system).

Ideally, we would like to say that a specific job (`ReindexUserJob`) should be consumed in batches of 10, with a timeout of 30 seconds. This would mean that a method on the job (`perform_bulk`?) would be called with at most 10 jobs as arguments, but could be called with fewer than 10 jobs if we did not enqueue 10 jobs in the last 30 seconds. If the method fails, then all the jobs are put back in the queue to be reprocessed later.

What do you think about it? Is it doable without too much pain in `good_job`?
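The count-or-timeout contract described above can be sketched in plain Ruby. `BulkBuffer` and every name in it are hypothetical, not a `good_job` feature, and a real implementation would also need a timer thread to flush an idle buffer rather than checking only when a new job arrives:

```ruby
# Hypothetical sketch of the batching contract described above: collect jobs
# and flush them in bulk once the count limit is reached or the timeout has
# elapsed. On failure the batch stays buffered, mimicking "put back in the
# queue to be reprocessed later".
class BulkBuffer
  def initialize(limit:, timeout:, &flush)
    @limit = limit       # e.g. 10 jobs
    @timeout = timeout   # e.g. 30 seconds
    @flush = flush       # the perform_bulk-style callback
    @jobs = []
    @first_enqueued_at = nil
  end

  def push(job, now: Process.clock_gettime(Process::CLOCK_MONOTONIC))
    @first_enqueued_at ||= now
    @jobs << job
    flush! if @jobs.size >= @limit || now - @first_enqueued_at >= @timeout
  end

  def flush!
    batch = @jobs
    @flush.call(batch)   # may raise; the jobs stay buffered if it does
    @jobs = []
    @first_enqueued_at = nil
    batch
  rescue StandardError
    []
  end
end
```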