reader: max_in_flight check in _send_rdy() doesn't take in-flight msgs into account #177
Comments
The same issue affects the Go client. An appropriate fix for the Python client would be:
And analogously for the Go client.
@alpaker thanks for another detailed bug report 💯 I would expect I'm curious if removing that line addresses the issue and doesn't break any other tests?
@mreiferson For consistency you'll also want to remove https://github.com/nsqio/pynsq/blob/master/nsq/async.py#L489. I don't think the change you suggest is enough on its own. I was a little cryptic with the above diff and didn't describe the way this problem interacts with the The second issue is the need for redistribution of
If you do want to guarantee progress on all connections then you need to redistribute
Yep.
To your point, if we fixed the inconsistency w/r/t maintaining
It isn't going to be possible to reasonably prevent this race condition entirely, nor do I think it's worth it. The changes we're talking about here should improve behavior without any tradeoffs, so I'm 👍.
redistribution logic accordingly. This brings reader behavior into agreement with nsqd behavior (compare nsqio/nsq#404) and removes an opportunity for max_in_flight violations (nsqio#177).
Fixed in #179
`Reader._send_rdy()` contains logic to avoid setting a `RDY` count on a connection that would allow exceeding the reader's `max_in_flight`. The code in question is:

The condition `(self.total_rdy + value) > self.max_in_flight` doesn't take into account messages that are currently in flight. This creates an opportunity for `max_in_flight` violations. The recipe in this report will produce one, usually within a minute.

I reported this offline to @jonmorehouse, who pointed out that it was surprising the issue hasn't surfaced before. I think the reason is that it can only hit when `max_in_flight` is less than the connection count, and this use case must be rare in practice:

- `_send_rdy()` is only called with a `value` parameter of 0, 1 or `_connection_max_in_flight()`.
- If `len(conns) * _connection_max_in_flight() <= max_in_flight`, then we'll never violate the in-flight constraint, because we're implicitly restricting ourselves to a per-connection limit that guarantees safety.
- If `len(conns) > max_in_flight` and `_connection_max_in_flight() == 1`, then respecting the per-connection limit isn't enough. In this case deciding whether a non-zero `RDY` count is permissible on a connection requires taking into account the in-flight count on other connections.
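To make the failure mode concrete, here is a minimal toy model of the check described in this report. It is a sketch, not pynsq code: `ToyConn`, `rdy_only_allows` and `in_flight_aware_allows` are hypothetical names, and the model assumes the client's tracked `RDY` count drops by one when a message arrives. With two connections and `max_in_flight = 1`, the `RDY`-only comparison permits a second `RDY 1` while a message is still in flight.

```python
from dataclasses import dataclass


@dataclass
class ToyConn:
    """Hypothetical stand-in for a reader connection (not the pynsq class)."""
    rdy: int = 0        # RDY count the client currently tracks for this conn
    in_flight: int = 0  # messages received but not yet finished

    def receive_message(self):
        # Assumption of this sketch: the tracked RDY count drops on receipt.
        self.rdy = max(self.rdy - 1, 0)
        self.in_flight += 1


def rdy_only_allows(conns, value, max_in_flight):
    """Negation of the quoted condition: only RDY counts are compared."""
    total_rdy = sum(c.rdy for c in conns)
    return (total_rdy + value) <= max_in_flight


def in_flight_aware_allows(conns, value, max_in_flight):
    """Hypothetical variant that also counts messages already in flight."""
    total_rdy = sum(c.rdy for c in conns)
    total_in_flight = sum(c.in_flight for c in conns)
    return (total_rdy + total_in_flight + value) <= max_in_flight


# More connections than max_in_flight, so the per-connection limit is 1.
max_in_flight = 1
a, b = ToyConn(), ToyConn()
conns = [a, b]

a.rdy = 1            # RDY 1 sent on connection a
a.receive_message()  # a delivers a message; it is now in flight

# Is RDY 1 on connection b permissible while a's message is in flight?
print(rdy_only_allows(conns, 1, max_in_flight))         # True
print(in_flight_aware_allows(conns, 1, max_in_flight))  # False
```

If `b` is granted `RDY 1` and then delivers a message, two messages are in flight against a limit of one. The in-flight-aware variant is only one possible way to close the gap in this toy model; the thread discusses the fix actually adopted for the client.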