[FireAndForget, Timeouts] Redis won't time out requests if there is a connection issue and a FireAndForget request is queued #2392
Comments
This seems odd; I agree that this isn't intentional but a quick glance at the code doesn't show any obvious smoking guns - the timeout detection code doesn't check for F+F, it just checks the write time - and the write time is written regardless. Will need to look. |
I forgot to mention that it hangs on .NET 6 and .NET 7. Any luck identifying the underlying issue? |
@mgravell have you looked at this issue? It seems like a serious problem if someone relies on F+F mode and timeouts. For F+F requests there is still an issue with this |
Not sure if that is any option :P but removing |
Introduced within: 7dda23a#diff-c64610826746e4cc2aeb0edf12469d2ea64583486a9246f7493d197bc33c6af1R856 Perhaps you need a better way to determine completed messages. |
Briefly, as per my previous comment - when I looked at the other place timeout can happen, in the sent queue. Ultimately, this isn't my day job, and not everything can be done immediately. I do agree with your analysis about the pending queue and the |
I would assume that someone explicitly using the F+F option doesn't care about being informed whether the operation succeeded or failed, and accepts the "best effort" nature of this operation, as in the pattern below:
Probably the worst thing we can do here is hang in an operation that is assumed to be "best effort" anyway. In this context it makes sense to remove F+F messages from the pending queue as soon as they have timed out.
This solution can result in operations being executed on Redis well past their timeout, which breaks the contract with the developer who specified the timeout. It can potentially result in consistent data (set by a microservice that has no connectivity issues) being overwritten by stale data from an instance that did have connectivity issues. Of course, we may argue that this can happen regardless, but a mechanism that resets the message timeout definitely increases the risk. |
I also agree that when using F+F you aim for performance, not for guarantees of delivery or success. |
I agree that re-enqueueing them would have side effects, especially taking into account that the timeout set by the developer has already been reached, so they may have taken compensating actions, such as reading again from the source of truth and caching the result, which would lead to discrepancies. To avoid breaking the timeout contract, requests may be dropped as soon as they have timed out as per that contract. |
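To make the two behaviours discussed above concrete, here is a toy model in Python. It is only a sketch under stated assumptions, not StackExchange.Redis code: `Message`, `sweep_stops_at_fire_and_forget`, `sweep_drops_expired` and `make_backlog` are hypothetical names, and the assumption that a timeout sweep stops at a head message with no caller to notify is made for illustration, not confirmed against the library source.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Message:
    """Hypothetical queued command; not a StackExchange.Redis type."""
    name: str
    enqueued_at: float
    fire_and_forget: bool = False
    timed_out: bool = False  # set when the waiting caller is told about the timeout


def sweep_stops_at_fire_and_forget(backlog, now, timeout):
    """Assumed flawed sweep: gives up when the head has no caller to notify."""
    notified = []
    while backlog:
        head = backlog[0]
        if now - head.enqueued_at < timeout or head.fire_and_forget:
            break  # an expired F+F head blocks everything queued behind it
        backlog.popleft()
        head.timed_out = True
        notified.append(head.name)
    return notified


def sweep_drops_expired(backlog, now, timeout):
    """Proposed behaviour: drop every expired message, notify only real waiters."""
    notified = []
    while backlog:
        head = backlog[0]
        if now - head.enqueued_at < timeout:
            break  # queue is in arrival order, so the rest have not expired yet
        backlog.popleft()
        if not head.fire_and_forget:
            head.timed_out = True
            notified.append(head.name)
    return notified


def make_backlog():
    """One F+F invalidation queued ahead of two ordinary requests."""
    return deque([
        Message("invalidate-cache", 0.0, fire_and_forget=True),
        Message("get-user", 0.1),
        Message("get-order", 0.2),
    ])
```

With a 5 second timeout and the clock at 10 seconds, the first sweep notifies nobody and leaves all three messages queued, while the second empties the queue and reports timeouts for `get-user` and `get-order` only. The dropped F+F message is never sent late, which is the contract-preserving behaviour argued for above.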
…og and avoid sending messages after the sync wait timed out.
…og when not connected, even when no completion is needed, to be able to dequeue and complete other timed out messages.
Hi, recently we observed that when Redis is not accessible to our microservices, they stop responding to requests. This was quite a shock, as we have always treated Redis as a service that is not guaranteed to be available. We therefore rely heavily on short timeouts, and if Redis is unavailable we always forward requests upstream to the source of truth.
However, some time ago we started to use the FireAndForget flag for requests that do not need to complete (best effort). This was ideal for invalidating the cache and similar use cases.
It seems that when we queue such a request while there is a network partition to the Redis instance, or while a failover procedure is in progress, all subsequent requests to Redis on every thread using the same connection multiplexer hang forever (until the Redis instance is back and accessible).
Check this simple test:
We are on the latest version of the StackExchange.Redis assembly, 2.6.96.
Is it the right approach to use FireAndForget mode to speed up such invalidations/updates when we hope they will succeed but do not want to block the parent request waiting for the Redis response? As I said, this is a best-effort attempt to set something in the cache in the background.
Is it expected that this disrupts timeouts for every subsequent request?