
[5.4] Redis Queue: Fix Job having timeout set different from the one of the queue #18919

Closed
wants to merge 8 commits

Conversation

Belphemur
Contributor

In the previous version, Laravel added the ability for each job to set its own timeout independently of the queue timeout.

This feature hasn't been implemented in the Redis queue, and using it leads to job duplication.

Scenario

Queue timeout: 60 seconds
Job Timeout: 300 Seconds
Workers on queue: 2

Current way the queue works

Before taking a job from the Redis queue, the worker migrates expired jobs from queue:reserved. This check is done by verifying whether any score in the reserved queue is lower than the current timestamp; any job meeting this condition is put back in the working queue.

After the migration, the worker takes the job, puts it in the reserved queue, and sets its score to NOW() + the timeout of the queue.
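
To make the scoring concrete, here is a rough sketch of that reservation step (illustrative PHP using the phpredis extension; $retryAfter stands in for the queue-level timeout, and the key names are assumptions, not the framework's actual code):

// Sketch only; assumes the phpredis extension and a local Redis server.
$redis = new Redis();
$redis->connect('127.0.0.1', 6379);

$retryAfter = 60; // the queue-level timeout from the connection config

$job = $redis->lPop('queues:default');

if ($job !== false) {
    $payload = json_decode($job, true);
    $payload['attempts'] = ($payload['attempts'] ?? 0) + 1;

    // Score = NOW() + queue timeout; the job's own timeout is ignored here.
    $redis->zAdd('queues:default:reserved', time() + $retryAfter, json_encode($payload));
}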

Problem

The job with a timeout of 300 is taken by a worker with a timeout of 60, so its score in the reserved queue will be NOW() + 60 sec. The second worker comes along; the job hasn't finished yet (by its 300-second timeout) but has passed the queue timeout, so the second worker takes it as well. The job is then duplicated in the queue.

Solution

This PR checks whether a job has its own timeout set when popping it from the queue and putting it into the reserved queue. If it does, as per the documentation, that timeout takes precedence over the queue's timeout.
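
Conceptually, the change amounts to something like the following (continuing the sketch above, and assuming the job payload carries a timeout field; this is not the exact diff):

// Continuing the sketch above: prefer the job's own timeout, when present,
// over the queue-level $retryAfter when computing the reserved score.
$payload = json_decode($job, true);
$timeout = $payload['timeout'] ?? $retryAfter;

$redis->zAdd('queues:default:reserved', time() + $timeout, json_encode($payload));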

Antoine Aflalo added 2 commits April 24, 2017 11:15
Since a job can have a timeout set that is different from the queue's, we can't rely only on the queue timeout to set the score of the job in the reserved queue. The score needs to reflect the real timeout of the job.
@m1guelpf
Contributor

Seems to be related to #16257

@Belphemur
Contributor Author

@m1guelpf yes, it's directly related to it. I can't finish the Redis part correctly; I keep getting a weird NIL error.

Maybe @taylorotwell would have a better idea how to resolve this.

@m1guelpf
Contributor

m1guelpf commented Apr 24, 2017

@Belphemur Could you fix the tests and add a new one to prevent things like this from happening in the future?

Test working
@Belphemur
Contributor Author

Belphemur commented Apr 24, 2017

@m1guelpf I finally got the Lua part right and added a new test to prove it works.


if(job ~= false) then
-- Increment the attempt count and place job on the reserved queue...
reserved = cjson.decode(job)
@tillkruss
Contributor


Can you make sure you don't have extra whitespace at the line endings?

@Belphemur
Contributor Author

@tillkruss done, thanks for spotting it

@themsaid
Member

themsaid commented Apr 24, 2017

Quoting the docs:

The --timeout value should always be at least several seconds shorter than your retry_after configuration value. This will ensure that a worker processing a given job is always killed before the job is retried. If your --timeout option is longer than your retry_after configuration value, your jobs may be processed twice.

The behaviour you described is not Redis-specific: with any queue driver, your jobs will be processed multiple times if the timeout value is greater than the retry_after value.
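
For reference, the two values being compared look roughly like this (illustrative values only, not taken from this thread):

// config/queue.php, Redis connection (illustrative values)
'redis' => [
    'driver'      => 'redis',
    'connection'  => 'default',
    'queue'       => 'default',
    'retry_after' => 90, // keep this above any job or worker timeout
],

// Worker started with a shorter timeout, for example:
// php artisan queue:work redis --timeout=60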

@Belphemur
Contributor Author

Belphemur commented Apr 24, 2017

I don't understand its usage. It feels like we then have two different ways to set a timeout for a job: only one of them will kill it, and the other will duplicate it.

Do you have an example of a job you'd like to duplicate because it hasn't finished in the given amount of time instead of killing it and restarting it?

Edit: I re-read the upgrade guide; the rename and its explanation are present in 5.3, so I removed that part of the comment.

@themsaid
Member

@Belphemur that configuration has been there since the beginning, but it was renamed to retry_after in the 5.3 release, and that change is documented.

@Belphemur
Contributor Author

@themsaid I apologise, I edited my comment. With the timeout that can be set on the job, retry_after gets a little dangerous: it's easy to forget it and set a job timeout that is much bigger than retry_after.

@taylorotwell
Member

Closing since this is not a bug, as explained by @themsaid. It's true that your retry_after must always be longer than your longest timeout.
