Improve robustness around redis connection issues #26

Merged
merged 1 commit into from
Dec 4, 2012

Conversation

michaelcameron
Contributor

We've had periodic Redis connection issues where the workers appear to reconnect successfully but then fail with an odd cast exception as soon as they pick up a new job. They stay alive and keep polling until the next job shows up, which may be much later, and then fail with:

java.lang.ClassCastException: java.lang.Long cannot be cast to [B
        at redis.clients.jedis.Connection.getBinaryBulkReply(Connection.java:182)
        at redis.clients.jedis.Connection.getBulkReply(Connection.java:171)
        at redis.clients.jedis.Jedis.lpop(Jedis.java:1090)
        at net.greghaines.jesque.worker.WorkerImpl.poll(WorkerImpl.java:487)
        at net.greghaines.jesque.worker.WorkerImpl.run(WorkerImpl.java:230)
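
For readers unfamiliar with the message: `[B` is the JVM's name for `byte[]`, so the trace says a reply object that was actually a `Long` got cast to `byte[]`. The sketch below reproduces only that mechanism. It assumes the client reads each reply as a plain `Object` and casts it to the type the command expects; `ReplyCastSketch` and `asBulkReply` are hypothetical names, not Jedis code, and the sketch says nothing about why the wrong reply type arrived in the first place.

```java
/** Hypothetical model of a client method that expects a bulk (byte[]) reply. */
final class ReplyCastSketch {
    private ReplyCastSketch() {}

    /**
     * Casts a reply that was read as Object to the expected bulk type.
     * If the reply is really an integer reply (a Long), the cast throws
     * ClassCastException, as in the stack trace above.
     */
    static byte[] asBulkReply(Object reply) {
        return (byte[]) reply;
    }
}
```

Calling `ReplyCastSketch.asBulkReply(Long.valueOf(1L))` throws a `ClassCastException`, while passing a real `byte[]` returns it unchanged.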

While trying to troubleshoot, there were a few changes I wanted to make to better find the root cause:

  1. There appears to be a code path where an exception can occur on reconnect without ever being logged. In WorkerImpl.recoverFromException, if the reconnect throws anything other than a JedisConnectionException, nothing handles the exception until it reaches run, which only has a try/finally.
  2. The code assumes that a connected jedis object is healthy, but the JedisPool implementation in Jedis itself uses a stronger condition: jedis.isConnected() && jedis.ping().equals("PONG"). The ping goes one step further and verifies that the connection can actually exchange data with the server.
  3. I wanted to tweak the recoverFromException implementation in grails-jesque first, since I already have a GrailsWorkerImpl subclass, but I could not access some of the private variables needed to make it work. I changed some of those to protected so I can try more things, if needed, before making another pull request.
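
Points 1 and 2 can be sketched together as a reconnect helper that logs every failure, whatever its type, and treats a connection as healthy only after a successful ping. This is a rough illustration rather than the code in this pull request: `RedisConn` is a hypothetical stand-in for the handful of Jedis methods involved (`connect()`, `disconnect()`, `isConnected()`, `ping()`), and `ReconnectSketch`, `isHealthy`, and `reconnect` are made-up names.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

/** Hypothetical stand-in for the few Jedis methods this sketch relies on. */
interface RedisConn {
    void connect();
    void disconnect();
    boolean isConnected();
    String ping();
}

final class ReconnectSketch {
    private static final Logger LOG = Logger.getLogger(ReconnectSketch.class.getName());

    private ReconnectSketch() {}

    /** Healthy only if the socket is up AND the server answers the ping with PONG. */
    static boolean isHealthy(RedisConn conn) {
        return conn.isConnected() && "PONG".equals(conn.ping());
    }

    /**
     * Tries up to maxAttempts times to re-establish a healthy connection.
     * Any RuntimeException is logged instead of escaping silently.
     * Returns true once a health check passes, false if all attempts fail.
     */
    static boolean reconnect(RedisConn conn, int maxAttempts, long sleepMillis) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                conn.disconnect();
                conn.connect();
                if (isHealthy(conn)) {
                    return true;
                }
                LOG.warning("Reconnect attempt " + attempt + ": ping check failed");
            } catch (RuntimeException e) {
                LOG.log(Level.WARNING, "Reconnect attempt " + attempt + " failed", e);
            }
            if (attempt < maxAttempts) {
                try {
                    Thread.sleep(sleepMillis);
                } catch (InterruptedException ie) {
                    // Preserve the interrupt flag and give up early.
                    Thread.currentThread().interrupt();
                    return false;
                }
            }
        }
        return false;
    }
}
```

Writing the comparison as `"PONG".equals(conn.ping())` is equivalent to the JedisPool condition quoted in point 2 but also tolerates a null reply. Catching `RuntimeException` broadly is a deliberate choice for this sketch, so that an unexpected failure during reconnect shows up in the log instead of surfacing only at the try/finally in run.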

@ghost ghost assigned gresrun Dec 4, 2012
gresrun added a commit that referenced this pull request Dec 4, 2012
Improve robustness around redis connection issues
@gresrun gresrun merged commit 063e7cb into gresrun:master Dec 4, 2012
@michaelcameron
Contributor Author

Can you release a snapshot version based on this change and the other changes you merged in?
