
Hanging rake task due to lock contention when using multiple redis failover client connections in a threadpool #38

Closed
maxjustus opened this issue Oct 5, 2012 · 4 comments

Comments

@maxjustus

Here's another one for ya :)

Today we started noticing that Rake tasks were hanging after completion. We did a thread dump on one that was hung and saw this:
https://gist.github.com/b3852a8ea44cc1cf36e1

The code that ended up fixing it was changing this:

https://gist.github.com/95316d829d24627b6937
to this:
https://gist.github.com/9d70da788596c8192572

For Sidekiq server instances we are still using a connection pool of worker_thread_count + 2 and haven't seen this same issue. I did notice, though, that closing a Rails console with a lot of open redis failover client connections will occasionally hang. Perhaps it relates to connection cleanup?
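
For reference, the Sidekiq-side pool looks roughly like this (a sketch, not the exact gist contents; worker_thread_count here just stands in for our Sidekiq concurrency setting):

  require 'connection_pool'
  require 'redis_failover'

  # One failover client per pool slot; sized to Sidekiq concurrency plus a little headroom.
  pool = ConnectionPool.new(:size => worker_thread_count + 2) do
    RedisFailover::Client.new(:zk_servers => 'localhost:2181,localhost:2182,localhost:2183')
  end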

@ryanlecompte
Owner

That's strange. I just tested this locally by creating 50
RedisFailover::Client instances in irb, and then closing the irb session.
There was no hanging at all. The RedisFailover::Client does provide a
#shutdown method which will gracefully shut down the client (and the
underlying ZK connection) which you can try invoking, but it really
shouldn't be necessary.
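
For what it's worth, calling it explicitly looks something like this (a minimal sketch; the :zk_servers value is a placeholder for your ZooKeeper hosts):

  require 'redis_failover'

  client = RedisFailover::Client.new(:zk_servers => 'localhost:2181,localhost:2182,localhost:2183')
  # ... use the client ...
  client.shutdown  # gracefully closes the client and its underlying ZK connection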

@slyphon, do you notice anything obvious in the gist that's linked here?
The thread dump points to something in the threadpool.rb of the ZK gem.

I also noticed that you're using ZK 1.7. I have redis_failover locked down
to ~1.6. I haven't tested redis_failover with ZK 1.7 yet.


@tsilen
Contributor

tsilen commented Oct 8, 2012

This is the same issue I reported here: zk-ruby/zk#50
The workaround I've successfully been using:

at_exit do
  # Gracefully shut down the failover client (and its ZK connection) before the process exits.
  $redis.shutdown
end

@ryanlecompte
Owner

Thanks, @tsilen.

Yes, that's a good approach. You can use the RedisFailover::Client#shutdown
method in an #at_exit hook.

Ryan


@ryanlecompte
Owner

Root issue is logged here: zk-ruby/zk#50

According to that issue, it looks like a bug in the Ruby interpreter and how it handles ConditionVariable cleanup. For now, I would suggest upgrading to redis_failover 1.0 (to be released tomorrow) and using a single ZK client instance when you set up your Sidekiq threadpool, e.g.:

  # Share one ZK connection across all pooled failover clients.
  zk = ZK.new('localhost:2181,localhost:2182,localhost:2183')
  cp = ConnectionPool.new(:size => 20) { RedisFailover::Client.new(:zk => zk) }
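
Workers then check a client out of the pool for each unit of work, roughly like this (a sketch, assuming the connection_pool gem's #with block API and that the failover client delegates plain Redis commands):

  cp.with do |client|
    client.set('foo', 'bar')   # writes go to the current master
    client.get('foo')          # reads can be served by slaves
  end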
