Question about until_timeout with 6.0.0 #303
That is correct |
Unless you have any other questions regarding this configuration, I am going to go ahead and close this issue? |
I can't actually get things to work that way in real life. Here's my repro. In a Rails app, I added:

```ruby
class TestWorker
  include Sidekiq::Worker

  sidekiq_options retry: false,
                  lock: :until_expired,
                  lock_expiration: 3,
                  log_duplicate_payload: true

  def perform(args)
    warn "executed with #{args}"
  end
end
```

Then I ran:

```ruby
# sleep first so that map results in job id's or nil if duplicate
(1..10).map { sleep 1; TestWorker.perform_async(id: 1) }
```

Per my above post, I would have expected the output to be `["<jid>", nil, nil, nil, "<jid>", nil, nil, nil, "<jid>", nil]` (or approximately... I'm not concerned about sleep being exactly 1 second or the lock expiration being exactly 3 seconds... that's not what this is about). However, the actual output is:

```ruby
=> ["8e2a94c0ebe484929f7cf7a2", nil, nil, nil, nil, nil, nil, nil, nil, nil]
```

(plus a bunch of dupe job warnings). It looks like even though the lock should expire after 3 seconds, none of the subsequent jobs were ever enqueued. Additionally, if you re-run that command again (without clearing Redis or anything) the output is 10 dupes and no queued jobs: `[nil, nil, nil, nil, nil, nil, nil, nil, nil, nil]`. The dupe job warnings all look the same.

What's going wrong here? Why isn't the job being requeued after the lock expiration? |
Another experiment. I ran the same loop, but with the calls spaced out further, and got:

```ruby
["4dbf58a9ff19a1a37f98f565",
 "d3cdd44f0cc4ebf57a46bd2f",
 "b28da306a1a55acef69c0f82",
 "c6f10f3ab82a30d64735029a",
 "44e590da4bd7d838f35bb615",
 "68d93c503bac841d053ed920",
 "45cd8d76765c1dedecc0295a",
 "0ad2a3de2535e4cb2d920a5d",
 "9b5de08bb02d15a8b58f5d23",
 "84a58a71b4e0073a75680bad"]
```

10 successfully executed jobs, when they were spaced out so as to not hit the lock expiration window. It's almost like attempting a job during the lock window extends the lock. |
@mhenrixon any thoughts? |
I’ll get back to you a little later. Currently dealing with some family business. |
@aharpervc found the bug. The UntilExpired lock was missing some coverage, unfortunately; I'll have it fixed shortly. |
I updated to 6.0.2 but I'm still seeing the same output in my above test case... the first job is executed but all subsequent jobs are marked as dupes until flushing Redis. Are you getting something different? |
I don't see that problem, but then I also don't attempt to use your exact configuration:

```ruby
[14] pry(main)> begin
[14] pry(main)* p SimpleWorker.perform_async(1)
[14] pry(main)* sleep 3
[14] pry(main)* p SimpleWorker.perform_async(1)
[14] pry(main)* end
"873501e719be4e482ec2e37c"
"3345f34fc92de6fea6ed57a8"
```

```ruby
class SimpleWorker
  include Sidekiq::Worker

  sidekiq_options lock: :until_executed,
                  queue: :default,
                  lock_expiration: 3,
                  on_conflict: :log,
                  unique_args: (lambda do |args|
                    [args.first]
                  end)

  def perform(some_args)
    Sidekiq::Logging.with_context(self.class.name) do
      SidekiqUniqueJobs.logger.debug { "#{__method__}(#{some_args})" }
    end

    sleep 1
  end
end
```
|
Maybe I'm missing something. Your worker is using `:until_executed`, not `:until_expired`. In any case, I am interested in finding out what the proper options are to throttle job execution (by their args) so that they do not run more often than the configured time period (1 minute in my original post, 3 seconds in our example workers). Based on your example, this is what I tried:

```ruby
# simple_until_expired_worker.rb
class SimpleUntilExpiredWorker
  include Sidekiq::Worker

  sidekiq_options lock: :until_expired,
                  queue: :default,
                  lock_expiration: 3,
                  on_conflict: :log,
                  unique_args: (lambda do |args|
                    [args.first]
                  end)

  def perform(some_args)
    Sidekiq::Logging.with_context(self.class.name) do
      SidekiqUniqueJobs.logger.debug { "#{__method__}(#{some_args})" }
    end
  end
end
```

I ran this in the console:

and this is the Sidekiq output:

This is even weirder because while you do get 5 "skipping job" messages, Sidekiq shows all 6 executed. And what I had hoped for was: execute, skip, skip, execute, skip, skip. Any ideas? |
I made a mistake but I'll leave it rather than editing it; I misread the output above. Here's the Sidekiq log from the same test:

I think that means it's doing: execute, skip, skip, skip, skip, skip. But there should be an execution after the lock has expired at 3 seconds, so this still seems weird. |
Sorry if I confused you with my `:until_executed` example. The lock gets removed from the server process after the expiration in the case of an expiring job; it gets removed when the job is successfully executed in the case of `until_executed`. Could you remove your unique argument configuration and see if that helps? I think the first argument might be something you are not interested in. Unfortunately the sleep isn't working while testing, but if you count to 3 and push your job again, or if you time it, it will work. Keep in mind that sleep is not a real-world example. For some reason it messes with you. If you just wait 3 seconds it will work. Unfortunately I don't have an answer for why that is. |
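A minimal sketch of that distinction, using only the options already shown in this thread (the worker names here are illustrative, not from the gem or the thread):

```ruby
class UntilExecutedWorker
  include Sidekiq::Worker
  # Lock is released once the job has executed successfully.
  sidekiq_options lock: :until_executed, lock_expiration: 3, on_conflict: :log

  def perform(args); end
end

class UntilExpiredWorker
  include Sidekiq::Worker
  # Lock is only released when the 3-second expiration elapses in Redis,
  # regardless of when the job itself finishes.
  sidekiq_options lock: :until_expired, lock_expiration: 3, on_conflict: :log

  def perform(args); end
end
```

In the first case a fast job frees the lock almost immediately; in the second, the lock outlives the job until the TTL runs out.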
Okay, I have a new example that does not use `sleep`. Not using unique args:

```ruby
# simple_until_expired_worker2.rb
class SimpleUntilExpiredWorker2
  include Sidekiq::Worker

  sidekiq_options lock: :until_expired,
                  queue: :default,
                  lock_expiration: 3,
                  on_conflict: :log

  def perform(some_args)
    warn "executed #{jid} with #{some_args}"
  end
end
```

In the console:

Sidekiq output:

The output is the same: execute, skip, skip, skip, skip, skip. I'm not super concerned about precise millisecond expirations, but hopefully you agree that at least one of the later jobs should have executed after the 3-second lock expired. I can tell that it can unlock, so I recognize that. Here's another test where I ran 3 jobs quickly, waited a few seconds, then ran it one more time. I expected: execute, skip, skip, execute, and that is what happened.
It's almost like the lock is being taken out even when a job is a duplicate, and then released on schedule. That could explain why a repeated cadence of once per second is continually skipping execution, whereas explicitly waiting until the expiration window has passed will successfully execute the job. |
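A conceptual sketch of that hypothesis at the Redis level (this is not the gem's actual implementation; the key name and `redis-rb` calls are only illustrative):

```ruby
require "redis"

redis = Redis.new
key   = "uniquejobs:example-digest" # illustrative key name

# Expected: the TTL is set once, when the lock is first acquired.
def push_expected(redis, key, ttl)
  redis.set(key, "locked", nx: true, ex: ttl) ? "queued" : "duplicate"
end

# Suspected: every push (including duplicates) refreshes the TTL, so pushing
# once per second keeps the 3-second lock alive forever and every attempt
# after the first is skipped.
def push_suspected(redis, key, ttl)
  acquired = redis.set(key, "locked", nx: true)
  redis.expire(key, ttl) # runs even when the lock was not acquired
  acquired ? "queued" : "duplicate"
end

push_suspected(redis, key, 3) #=> "queued" the first time, "duplicate" afterwards
```

Under the suspected version, a once-per-second cadence never sees the key expire, which matches the execute/skip/skip/... pattern above, while waiting past the TTL before pushing again acquires a fresh lock.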
The weird thing is that expiration takes place in Redis and should have nothing to do with the sleep in the Ruby layer. I am honestly baffled by the result of that. As far as I know, with my pretty extensive experience, the first push should always go through. I'll do some more digging on this one... By the way, for this lock it would probably make sense to only add the expiration after the job is completed? Right now it will expire 3 seconds from when it is created, but that means there might potentially be duplicates running if it takes more than 3 seconds to finish running the job. |
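A sketch of that trade-off, again only illustrative at the Redis level (not the gem's code; the `redis-rb` calls and key names are assumptions):

```ruby
require "redis"

redis = Redis.new

# Expiration applied when the lock is created (the behavior described above):
# if the job runs longer than the 3-second TTL, the key disappears mid-run
# and a duplicate of a still-running job can be enqueued.
redis.set("uniquejobs:digest-a", "locked", nx: true, ex: 3)

# Alternative suggested above: hold the lock with no TTL while the job runs,
# and only start the expiration clock once the job has completed, so the
# throttle window never opens while the job is still in progress.
redis.set("uniquejobs:digest-b", "locked", nx: true)
# ... job runs to completion ...
redis.expire("uniquejobs:digest-b", 3)
```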
You are totally right! A subsequent call with the same arguments prolongs the expiration. I'll see about getting that fixed, @aharpervc! Can't thank you enough for your detailed investigation ❤️ |
With the changes in #316 I now get the following:

```ruby
[2] pry(main)> 10.times { sleep 1; p SimpleWorker.perform_async(1) }
"a05cd38422dea4fa5b2162ae"
nil
nil
"81732778eff43d8f57c54f3b"
nil
nil
"6291910a2e9e295784f8363a"
nil
nil
"20b29375fa36bcc0bc2939e0"
=> 10
```
|
Will release |
Alright @aharpervc, I released a version with the fix. You are looking for version |
Huge, I'll give it a look soon, thanks for working on this. Edit: Yep, looks good to me, thanks 💯 |
* Allow keys to be expired. After calling PERSIST in LUA, expiring a key does not EXPIRE. This should close mhenrixon#303
* Delete the right key from the unique keys set
* Adds coverage for previous commit
* Fix displaying lists of more than 100 digests
* Fix README
* Adds super naive pagination
* Rubo👮
Formerly, with 5.0.10, I had a job that did this:

It's a little unclear, but looking at the updated README & PRs, the correct options in 6.0.0 are:

The goal is to have a job that will execute no more often than once a minute, given identical args. E.g.,
Is that correct?
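For reference, a minimal sketch of the kind of 6.0.x configuration discussed later in this thread for that goal (the worker name is illustrative; the options are the ones used in the comments above):

```ruby
# Illustrative only: run a job for a given set of args at most once per minute.
class ThrottledWorker
  include Sidekiq::Worker

  sidekiq_options lock: :until_expired,  # lock is released only when the TTL elapses
                  lock_expiration: 60,   # seconds the throttle window lasts
                  on_conflict: :log      # duplicates inside the window are logged and skipped

  def perform(args)
    # do the work
  end
end
```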