Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(perf): improve delete_from_sorted_set speed #837

Closed
wants to merge 2 commits into from

Conversation

mhenrixon
Copy link
Owner

@mhenrixon mhenrixon commented Feb 17, 2024

Closes #835

@mhenrixon mhenrixon self-assigned this Feb 17, 2024
@mhenrixon mhenrixon force-pushed the fix/perf-delete_from_sorted_set branch 3 times, most recently from fa49dbd to d9c0115 Compare February 17, 2024 06:49
@ezekg
Copy link
Contributor

ezekg commented Feb 17, 2024

For delete_from_sorted_set, using ZSCAN still has the same time complexity of O(N) as the previous usage of ZRANGE (i.e. at worst case, when adding to the end of the queue — a typical workload when scheduling jobs into the future — it iterates the entire queue), which is where the problem lies. If the sorted set were, for example, 1,000,000 items instead of 100,000, the performance test would fail even worse. The entire purpose of my PR was to use ZRANGE BYSCORE to limit that search space so that we never iterate the entire queue, because iterating the entire queue won't ever scale. Right now, your solution to use ZSCAN still has the same fundamental problem: it doesn't scale; you just pushed the work to iterate the entire queue onto Redis.

Also, your tests are passing because you aren't running performance tests in CI. It fails locally for me:

bin/rspec --require spec_helper --tag perf --fail-fast spec/performance/unique_job_on_conflict_replace_spec.rb
# => Run options: include {:focus=>true, :perf=>true}
# 
#    Randomized with seed 34288
#    
#    UniqueJobOnConflictReplace
#      when schedule queue is large
#        locks and replaces quickly (FAILED - 1)
#    
#    Failures:
#    
#      1) UniqueJobOnConflictReplace when schedule queue is large locks and replaces quickly
#         Failure/Error:
#           expect do
#             Timeout.timeout(0.1) do
#               described_class.perform_in(2_592_000, 100_000, { "type" => "extremely unique" })
#             end
#           end.not_to raise_error
#    
#           expected no Exception, got #<Timeout::Error: execution expired> with backtrace:
#             # /home/zeke/.rvm/gems/ruby-3.2.2/gems/redis-client-0.20.0/lib/redis_client/ruby_connection/buffered_io.rb:139:in `wait_readable'

@ezekg
Copy link
Contributor

ezekg commented Feb 17, 2024

More thoughts: for the worst case — where the string we're searching for doesn't exist in the set — ZSCAN performs better, since it's non-blocking vs an iterative ZRANGE over the set. But when the string does exist — and we have its score — using ZRANGE BYSCORE will be orders of magitude faster because it divides the search space.

So, I guess I have 2 major questions that need answering to fully understand the problem:

  1. Do we know if the digest (a) MAY exist or that it (b) WILL exist in the sorted set?
  2. Is the score you mentioned already storing always accurate?

If the answer to 1 is (b) it MAY exist, then maybe we can try to find the item by score using ZRANGE BYSCORE at first, but fall back to ZSCAN for the worst case where the ZRANGE BYSCORE returned no results (indicating our score is stale, or the digest doesn't exist). If the answer to 1 is (a) it WILL exist, then ZRANGE BYSCORE will always be faster, given 2.

@mhenrixon mhenrixon force-pushed the fix/perf-delete_from_sorted_set branch from d9c0115 to a25420b Compare February 18, 2024 07:06
@mhenrixon mhenrixon closed this Feb 21, 2024
@mhenrixon mhenrixon deleted the fix/perf-delete_from_sorted_set branch February 21, 2024 18:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants