Problem with RDY locking to zero #6
Comments
Thanks for the report Nick. This is strange as it seems gnsq thinks the connection is closed but nsqd sees the connection as open. When gnsq sees an exception while requeueing, it should close the connection and then reopen it, resetting the
Thanks for hunting it down this far. I will see if I can reproduce the behaviour on my end.
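To make the close-then-reopen behaviour described above concrete, here is a minimal toy sketch. Everything in it is hypothetical: `ToyConnection` and `requeue_or_reconnect` are illustrative names, not gnsq's classes or API, and re-sending a RDY count on the fresh connection is an assumption about the intent, not a description of gnsq's code.

```python
class ToyConnection:
    """Hypothetical stand-in for a client connection; not gnsq's class."""

    def __init__(self, fail_requeue=False):
        self.is_open = True
        self.ready = 0  # RDY count last sent on this connection
        self.fail_requeue = fail_requeue

    def send_ready(self, count):
        if not self.is_open:
            raise IOError('connection closed')
        self.ready = count

    def requeue(self, message_id):
        # Simulate a network blip while requeueing.
        if self.fail_requeue:
            raise IOError('socket error during requeue')

    def close(self):
        self.is_open = False
        self.ready = 0


def requeue_or_reconnect(conn, message_id, connect, max_in_flight=1):
    """Requeue a message; on failure, close and return a fresh connection.

    Assumption: the fresh connection must be sent a non-zero RDY count,
    otherwise the server would never deliver messages on it.
    """
    try:
        conn.requeue(message_id)
        return conn
    except IOError:
        conn.close()
        fresh = connect()
        fresh.send_ready(max_in_flight)
        return fresh
```

In this toy model, the reader would only stay stuck at RDY 0 if the `send_ready` call on the fresh connection were skipped or lost.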
FWIW as of NSQ |
@mreiferson That's good news, it was always a bit tricky to get that correct. What is recommended for clients in terms of support for older versions of NSQ?
Well, one approach would be to not support older |
Small bit of progress made. I'm able to consistently reproduce the behaviour you're seeing with this snippet:

    import logging
    import gnsq

    logging.basicConfig(level=logging.DEBUG)
    reader = gnsq.Reader('test', 'test', nsqd_tcp_addresses=['localhost:4150'])

    @reader.on_message.connect
    def handle_message(reader, message):
        for conn in reader.conns:
            conn.stream.socket.send('badcmd\n')
        raise Exception('test')

    reader.start()

It seems there is some unexpected behaviour when the socket is closed on the server's side. gnsq reconnects, but an initial |
Seems to have been a bug in how gnsq handles connection failures while starting to back off. I've pushed a new version of gnsq to PyPI (version 0.3.0) that includes a fix. Let me know if this resolves the issue for you. Thanks again for reporting!
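For anyone checking whether a deployment already includes that release, here is a small sketch of a version check. The helper names `parse_version` and `has_backoff_fix` are made up for illustration, and the parser assumes a plain dotted numeric version such as `0.3.0` (no pre-release suffixes).

```python
def parse_version(text):
    """Turn a simple dotted version such as '0.3.0' into a tuple of ints."""
    return tuple(int(part) for part in text.split('.'))


def has_backoff_fix(installed, fixed_in='0.3.0'):
    """Return True if the installed version is at least the fixed release."""
    return parse_version(installed) >= parse_version(fixed_in)


if __name__ == '__main__':
    # importlib.metadata is in the standard library from Python 3.8.
    from importlib.metadata import PackageNotFoundError, version

    try:
        print(has_backoff_fix(version('gnsq')))
    except PackageNotFoundError:
        print('gnsq is not installed')
```

Comparing tuples of integers avoids the string-comparison trap where `'0.10.0'` would sort before `'0.3.0'`.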
Thank you so much for this! I won't be able to tell you immediately whether this resolves our issue, as it shows up in production only rarely. I will deploy 0.3.0 and see how we get on.
This hopefully addresses the issues we've seen with RDY locking to zero (issue #2304). wtolson/gnsq#6
Thanks, I'm going to close the issue. Feel free to reopen if the issue occurs again.
Just a note to say that we've seen the networking issues in production again, and no evidence that |
Sweet, thanks for the update.
Hi Trevor! First of all -- thank you so much for all your hard work on gnsq. It's a lovely little library and for the most part has been absolutely flawless for us.
Unfortunately we're having an issue at the moment with gnsq v0.2.3 (and nsqd v0.3.0) in which it seems like a network blip causes a Reader to lock itself in a ready state of zero and never recover.
We're getting the following error message in the logs:
This appears to come from line 699 of reader.py, and seems to me to imply that we're having a network blip while attempting a requeue. Thereafter, it seems like RDY never increments again. You can see the issue where we're dealing with this internally at hypothesis/h#2304.
I've tried to recreate the conditions under which this might occur using blockade (see this repo), but as yet haven't succeeded.
I've perused the source code, and I'm guessing this might be the issue you fixed in d193160 (and 6905a14). Does that sound right?
If not, is there any way I can help track this issue down?