Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Redis 5 API for multi/pipelined commands changes transaction behavior in resque-scheduler 4.9+ #787

Open
ajvondrak opened this issue Dec 27, 2023 · 3 comments

Comments

@ajvondrak
Copy link

ajvondrak commented Dec 27, 2023

When upgrading to resque-scheduler 4.9.0 and above with redis-rb 4.x, we still get the deprecation notices related to resque/resque#1794 and #745.

Repro:

require "bundler/inline"

gemfile do
  source "https://rubygems.org"
  gem "resque", "~> 2.6"
  gem "resque-scheduler", "~> 4.9"
  gem "redis", "~> 4.0"
end

Resque.redis = Redis.new # NOTE: need a fresh redis-server running locally

class Job
  @queue = "jobs"

  def self.perform
    puts "performed job"
  end
end

Resque.enqueue_in(1, Job)
sleep(1)
Resque::Scheduler.handle_delayed_items
puts "enqueued delayed items"
Resque::Worker.new("jobs").work(0)
$ ruby repro.rb
resque-scheduler: [INFO] 2023-12-27T11:11:35-08:00: Processing Delayed Items
Redis#srem will always return an Integer in Redis 5.0.0. Use Redis#srem? instead.(called from: /Users/alex.vondrak/.local/share/rtx/installs/ruby/3.2.2/lib/ruby/gems/3.2.0/gems/redis-namespace-1.11.0/lib/redis/namespace.rb:564:in `wrapped_send')
Pipelining commands on a Redis instance is deprecated and will be removed in Redis 5.0.0.

redis.multi do
  redis.get("key")
end

should be replaced by

redis.multi do |pipeline|
  pipeline.get("key")
end

(called from /Users/alex.vondrak/.local/share/rtx/installs/ruby/3.2.2/lib/ruby/gems/3.2.0/gems/redis-namespace-1.11.0/lib/redis/namespace.rb:564:in `wrapped_send'}
enqueued delayed items
performed job

After enough poking around through stack traces, I believe I've discovered the issue, and it's a subtle interaction between the resque & resque-scheduler gems:

  1. resque-scheduler calls Resque.redis.multi here:
    Resque.redis.multi do |pipeline|
    encoded_jobs_to_requeue.each do |encoded_job|
    pipeline.srem("timestamps:#{encoded_job}", timestamp_bucket_key)
    decoded_job = Resque.decode(encoded_job)
    enqueue(decoded_job)
    end
  2. through a chain of calls, this hits Resque::Job.create:
    Resque::Job.create(queue, job_klass, *params)
  3. resque defines Job.create in such a way that we effectively make a call to Resque.redis.pipelined: https://github.com/resque/resque/blob/2f9d080ce86eb2e3f1f3d47599a21c576124c6f3/lib/resque/data_store.rb#L105-L108

Copying the shape of the above into a direct repro, we're basically doing

# Causes deprecation warning
Resque.redis.multi do |m|
  Resque.redis.pipelined do |p|
    # ...
  end
end

instead of what the redis gem wants, which is

# Does not cause deprecation warning
Resque.redis.multi do |m|
  m.pipelined do |p|
    # ...
  end
end

It might be hard to actually achieve the latter, though, since the code is straddling two gems. 😕

We were not getting this warning prior to resque-scheduler 4.9 since the outer multi was added by #767 (which also causes other nested transaction issues, like in #773).

It seems like this still works circa redis 5.x. At least when switching the version in the above repro, I get:

$ ruby repro.rb
resque-scheduler: [INFO] 2023-12-27T11:14:06-08:00: Processing Delayed Items
enqueued delayed items
performed job

I think the deprecation may just kind of be firing in a wonky way due to how redis 4.x tries to juggle state.

Circa 4.x:

Circa 5.x:

However, it's unclear to me what the interaction will be between two "top level" calls to these transaction commands. I.e., will the futures fire with the desired timing, or do they go out of sync in some way that may cause an issue? EDIT: per investigation below, it seems that the "nested" pipeline actually silently fires outside of the new MULTI/EXEC transaction around the batch; see #787 (comment) for details.

@ajvondrak
Copy link
Author

However, it's unclear to me what the interaction will be between two "top level" calls to these transaction commands. I.e., will the futures fire with the desired timing, or do they go out of sync in some way that may cause an issue?

I think I figured this part out by adding some Redis logging to the repro script:

repro.rb
require "bundler/inline"

gemfile do
  source "https://rubygems.org"
  gem "resque", "~> 2.6"
  gem "resque-scheduler", "~> 4.9"
  gem "redis", "~> #{ENV["REDIS_VERSION"]}.0"
end

case ENV["REDIS_VERSION"]
when "4"
  $stdout.sync = true
  logger = Logger.new($stdout, level: Logger::DEBUG)
  Resque.redis = Redis.new(logger: logger)
when "5"
  RedisClient.register(Module.new do
    def call(command, ...)
      puts "[call] #{command.flatten.join(" ")}"
      super
    end

    def call_pipelined(commands, ...)
      puts "[call_pipelined] #{commands.flatten.join(" ")}"
      super
    end
  end)
  Resque.redis = Redis.new
end

class Job
  @queue = "jobs"
end

Resque.redis.redis.flushdb # to ensure we start with a blank slate
Resque.enqueue_in(1, Job)
sleep(1)
Resque::Scheduler.handle_delayed_items
Redis 4
$ REDIS_VERSION=4 ruby repro.rb
D, [2023-12-27T16:17:24.233121 #49036] DEBUG -- : [Redis] command=FLUSHDB args=
D, [2023-12-27T16:17:24.234243 #49036] DEBUG -- : [Redis] call_time=0.80 ms
D, [2023-12-27T16:17:24.234376 #49036] DEBUG -- : [Redis] command=RPUSH args="resque:delayed:1703722645" "{\"class\":\"Job\",\"args\":[],\"queue\":\"jobs\"}"
D, [2023-12-27T16:17:24.234501 #49036] DEBUG -- : [Redis] call_time=0.11 ms
D, [2023-12-27T16:17:24.234597 #49036] DEBUG -- : [Redis] command=SADD args="resque:timestamps:{\"class\":\"Job\",\"args\":[],\"queue\":\"jobs\"}" ["delayed:1703722645"]
D, [2023-12-27T16:17:24.234710 #49036] DEBUG -- : [Redis] call_time=0.10 ms
D, [2023-12-27T16:17:24.234730 #49036] DEBUG -- : [Redis] command=ZADD args="resque:delayed_queue_schedule" 1703722645 1703722645
D, [2023-12-27T16:17:24.234850 #49036] DEBUG -- : [Redis] call_time=0.11 ms
D, [2023-12-27T16:17:25.238398 #49036] DEBUG -- : [Redis] command=ZRANGEBYSCORE args="resque:delayed_queue_schedule" "-inf" 1703722645 "LIMIT" 0 1
D, [2023-12-27T16:17:25.241675 #49036] DEBUG -- : [Redis] call_time=2.68 ms
resque-scheduler: [INFO] 2023-12-27T16:17:25-08:00: Processing Delayed Items
D, [2023-12-27T16:17:25.245827 #49036] DEBUG -- : [Redis] command=INFO args=
D, [2023-12-27T16:17:25.246626 #49036] DEBUG -- : [Redis] call_time=0.69 ms
D, [2023-12-27T16:17:25.255685 #49036] DEBUG -- : [Redis] command=SCRIPT args="load" "if redis.call('SETNX', KEYS[1], ARGV[1]) == 1\nthen\n  redis.call('EXPIRE', KEYS[1], 180)\n  return 1\nelse\n  return 0\nend\n"
D, [2023-12-27T16:17:25.256571 #49036] DEBUG -- : [Redis] call_time=0.74 ms
D, [2023-12-27T16:17:25.256740 #49036] DEBUG -- : [Redis] command=EVALSHA args="8f4a4e422bfa8f9a0931e350d6e30b1c3ce97c33" 1 "resque:resque:resque_scheduler_master_lock" "localhost:49036"
D, [2023-12-27T16:17:25.257156 #49036] DEBUG -- : [Redis] call_time=0.38 ms
D, [2023-12-27T16:17:25.257338 #49036] DEBUG -- : [Redis] command=LRANGE args="resque:delayed:1703722645" 0 99
D, [2023-12-27T16:17:25.257677 #49036] DEBUG -- : [Redis] call_time=0.30 ms
D, [2023-12-27T16:17:25.257812 #49036] DEBUG -- : [Redis] command=WATCH args="resque:delayed:1703722645"
D, [2023-12-27T16:17:25.258041 #49036] DEBUG -- : [Redis] call_time=0.20 ms
Redis#srem will always return an Integer in Redis 5.0.0. Use Redis#srem? instead.(called from: /Users/alex.vondrak/.local/share/rtx/installs/ruby/3.2.2/lib/ruby/gems/3.2.0/gems/redis-namespace-1.11.0/lib/redis/namespace.rb:564:in `wrapped_send')
Pipelining commands on a Redis instance is deprecated and will be removed in Redis 5.0.0.

redis.multi do
  redis.get("key")
end

should be replaced by

redis.multi do |pipeline|
  pipeline.get("key")
end

(called from /Users/alex.vondrak/.local/share/rtx/installs/ruby/3.2.2/lib/ruby/gems/3.2.0/gems/redis-namespace-1.11.0/lib/redis/namespace.rb:564:in `wrapped_send'}
D, [2023-12-27T16:17:25.262883 #49036] DEBUG -- : [Redis] command=MULTI args=
D, [2023-12-27T16:17:25.262965 #49036] DEBUG -- : [Redis] command=SREM args="resque:timestamps:{\"class\":\"Job\",\"args\":[],\"queue\":\"jobs\"}" "delayed:1703722645"
D, [2023-12-27T16:17:25.263009 #49036] DEBUG -- : [Redis] command=SADD args="resque:queues" ["jobs"]
D, [2023-12-27T16:17:25.263042 #49036] DEBUG -- : [Redis] command=RPUSH args="resque:queue:jobs" "{\"class\":\"Job\",\"args\":[]}"
D, [2023-12-27T16:17:25.263076 #49036] DEBUG -- : [Redis] command=LTRIM args="resque:delayed:1703722645" 100 -1
D, [2023-12-27T16:17:25.263101 #49036] DEBUG -- : [Redis] command=EXEC args=
D, [2023-12-27T16:17:25.263974 #49036] DEBUG -- : [Redis] call_time=0.82 ms
D, [2023-12-27T16:17:25.264210 #49036] DEBUG -- : [Redis] command=WATCH args="resque:delayed:1703722645"
D, [2023-12-27T16:17:25.264548 #49036] DEBUG -- : [Redis] call_time=0.30 ms
D, [2023-12-27T16:17:25.264617 #49036] DEBUG -- : [Redis] command=LLEN args="resque:delayed:1703722645"
D, [2023-12-27T16:17:25.264860 #49036] DEBUG -- : [Redis] call_time=0.22 ms
D, [2023-12-27T16:17:25.264995 #49036] DEBUG -- : [Redis] command=MULTI args=
D, [2023-12-27T16:17:25.265023 #49036] DEBUG -- : [Redis] command=DEL args="resque:delayed:1703722645"
D, [2023-12-27T16:17:25.265051 #49036] DEBUG -- : [Redis] command=ZREM args="resque:delayed_queue_schedule" 1703722645
D, [2023-12-27T16:17:25.265073 #49036] DEBUG -- : [Redis] command=EXEC args=
D, [2023-12-27T16:17:25.265405 #49036] DEBUG -- : [Redis] call_time=0.30 ms
D, [2023-12-27T16:17:25.265527 #49036] DEBUG -- : [Redis] command=ZRANGEBYSCORE args="resque:delayed_queue_schedule" "-inf" 1703722645 "LIMIT" 0 1
D, [2023-12-27T16:17:25.265754 #49036] DEBUG -- : [Redis] call_time=0.20 ms
Redis 5
$ REDIS_VERSION=5 ruby repro.rb
[call] flushdb
[call] rpush resque:delayed:1703722657 {"class":"Job","args":[],"queue":"jobs"}
[call] sadd resque:timestamps:{"class":"Job","args":[],"queue":"jobs"} delayed:1703722657
[call] zadd resque:delayed_queue_schedule 1703722657 1703722657
[call] zrangebyscore resque:delayed_queue_schedule -inf 1703722657 LIMIT 0 1
resque-scheduler: [INFO] 2023-12-27T16:17:37-08:00: Processing Delayed Items
[call] info
[call] script load if redis.call('SETNX', KEYS[1], ARGV[1]) == 1
then
  redis.call('EXPIRE', KEYS[1], 180)
  return 1
else
  return 0
end
[call] evalsha 8f4a4e422bfa8f9a0931e350d6e30b1c3ce97c33 1 resque:resque:resque_scheduler_master_lock localhost:49123
[call] lrange resque:delayed:1703722657 0 99
[call] watch resque:delayed:1703722657
[call_pipelined] sadd resque:queues jobs rpush resque:queue:jobs {"class":"Job","args":[]}
[call_pipelined] MULTI srem resque:timestamps:{"class":"Job","args":[],"queue":"jobs"} delayed:1703722657 ltrim resque:delayed:1703722657 100 -1 EXEC
[call] watch resque:delayed:1703722657
[call] llen resque:delayed:1703722657
[call_pipelined] MULTI del resque:delayed:1703722657 zrem resque:delayed_queue_schedule 1703722657 EXEC
[call] zrangebyscore resque:delayed_queue_schedule -inf 1703722657 LIMIT 0 1

Annoyingly, the log formats aren't the same between versions. But the salient point is that the Redis commands are actually executed a bit differently in Redis 4 vs Redis 5.

With Redis 4, the enqueue in the nested pipeline (from within the resque gem) gets added to the MULTI/EXEC already in progress that resque-scheduler is wrapping around each batch. Thus, the queue push happens inside of the transaction:

MULTI
SREM resque:timestamps:{"class":"Job","args":[],"queue":"jobs"} delayed:1703722645
SADD resque:queues jobs
RPUSH resque:queue:jobs {"class":"Job","args":[]}
LTRIM resque:delayed:1703722645 100 -1
EXEC

With Redis 5, the nested pipeline appears to fire eagerly, since it's effectively being called on the top-level like Resque.redis.pipelined { ... } instead of on the block argument like Resque.redis.multi { |m| m.pipelined { ... } }. Thus, the queue push happens outside of the transaction:

SADD resque:queues jobs
RPUSH resque:queue:jobs {"class":"Job","args":[]}
MULTI
SREM resque:timestamps:{"class":"Job","args":[],"queue":"jobs"} delayed:1703722657
LTRIM resque:delayed:1703722657 100 -1
EXEC

Note that the SADD + RPUSH are still sent as one unit in a pipelined command, but they're still apparently sent before the MULTI/EXEC block, meaning that the enqueue could still go through even if the schedule bookkeeping gets rolled back.

I'm guessing that this is not the desired behavior. I'm not totally sure of the implications, but it seems fairly serious. As such, I'll amend the title to reflect the broader nature of the issue.

@ajvondrak ajvondrak changed the title Still get Redis 5.0 pipelining deprecation warnings on resque-scheduler 4.9+ Redis 5 API for multi/pipelined commands changes transaction behavior in resque-scheduler 4.9+ Dec 28, 2023
@murdoch
Copy link

murdoch commented Mar 5, 2024

Is there any update on how to fix this issue? I've found that setting Redis.silence_deprecations = true does not resolve it.

PS - using the following:

  • redis (4.8.1)
  • resque-scheduler (4.10.2)
  • resque (2.6.0)

@alxckn
Copy link

alxckn commented Jul 19, 2024

We've been observing job duplication on our app (with redis-rb 5.x), very probably related to this issue. As mentioned by @ajvondrak this seems like a serious problem indeed.

Fixing this issue would require patching both resque and resque-scheduler to pass down the transaction object, so quite invasive and complex... This would not solve issues for enqueue hooks that use redis though.
Perhaps a revert of #767 should be considered?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants