EventLoop thread blocked by EmitterProcessor.onNext(…) causes timeouts #1086
Comments
mp911de changed the title from "CommandTimeoutExceptions with Lettuce Core 5.1.7-RELEASE" to "EventLoop thread blocked by EmitterProcessor.onNext(…) causes timeouts" on Jul 29, 2019
mp911de added a commit that referenced this issue on Jul 29, 2019:
Lettuce now uses DirectProcessor as a non-blocking event bus. DirectProcessor no longer blocks calls to .next(…). Previously, EmitterProcessor used a blocking queue which blocked the caller if downstream consumers did not consume events in time.
mp911de added another commit that referenced this issue on Jul 29, 2019, with the same commit message.
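For readers who want to see the difference in isolation, here is a minimal Reactor sketch of the behavior the commit message describes. It assumes the Reactor 3.2-era API that Lettuce 5.1.x uses; the class name, event strings, and the deliberately slow consumer are made up for illustration, and this is not Lettuce's actual EventBus code.

```java
import java.time.Duration;

import reactor.core.publisher.DirectProcessor;
import reactor.core.publisher.EmitterProcessor;
import reactor.core.publisher.FluxSink;
import reactor.core.scheduler.Schedulers;

public class EventBusBlockingSketch {

    public static void main(String[] args) {
        // DirectProcessor (the new event bus type) has no internal queue:
        // next(...) dispatches to whoever is currently subscribed and returns
        // immediately, so the publishing thread is never blocked. A slow
        // subscriber sees an overflow instead of slowing the publisher down.
        DirectProcessor<String> direct = DirectProcessor.create();
        FluxSink<String> sink = direct.sink(FluxSink.OverflowStrategy.DROP);
        sink.next("event"); // returns immediately, with or without subscribers

        // EmitterProcessor (the previous event bus type) keeps an internal queue
        // (256 elements by default). With a slow downstream consumer the queue
        // fills up and onNext(...) parks the publishing thread until space frees
        // up -- in Lettuce that thread is a Netty EventLoop, which matches the
        // TIMED_WAITING stacks seen in the thread dump.
        EmitterProcessor<String> emitter = EmitterProcessor.create();
        emitter.publishOn(Schedulers.single())
                .subscribe(event -> sleep(Duration.ofSeconds(1))); // deliberately slow consumer

        for (int i = 0; i < 1_000; i++) {
            emitter.onNext("event-" + i); // stalls here once the queue is full
        }
    }

    private static void sleep(Duration duration) {
        try {
            Thread.sleep(duration.toMillis());
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```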
That's fixed now.
Hi, is this issue fixed? We have a similar AWS cluster set up in our environment and are seeing a lot of lettuce-timer and lettuce-eventExecutorLoop threads in WAITING status, causing extended slowness in our Redis operations. We are using RedisClusterClient to connect to our cluster in AWS.
Bug Report
Current Behavior
We recently started using Lettuce with Amazon ElastiCache. We're using the latest version, 5.1.7.RELEASE, with the default configuration. We have a 3-shard ElastiCache cluster in a master/slave configuration. One of our services deployed this Lettuce version in production and, within about 2-3 hours, started observing CommandTimeoutExceptions on PING against the ElastiCache nodes, accompanied by a spike in CPU load. Other than PING, there is hardly any other Redis operation on the service nodes.
We took thread dumps from healthy and unhealthy instances; the thread dump from the unhealthy node is attached. Based on our findings, we decided to try disabling the periodic topology refresh, and that seems to fix the command timeouts. We also observed errors while loading partitions.
We see all the event loop threads stuck in the TIMED_WAITING state on the EmitterProcessor flow.
Thread dump from the unhealthy instance
Environment
Lettuce Core 5.1.7.RELEASE with default client options, Amazon ElastiCache cluster with 3 shards in a master/slave configuration.
Possible Solution
Disabling the Periodic Topology Refresh has helped us eliminate the errors.
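For anyone else hitting this, below is a minimal sketch of the workaround, assuming the Lettuce 5.x ClusterTopologyRefreshOptions API. The endpoint URI is a placeholder, and enabling the adaptive refresh triggers is optional; this mirrors what worked for us rather than an official recommendation.

```java
import io.lettuce.core.cluster.ClusterClientOptions;
import io.lettuce.core.cluster.ClusterTopologyRefreshOptions;
import io.lettuce.core.cluster.RedisClusterClient;

public class ClusterClientWithoutPeriodicRefresh {

    public static void main(String[] args) {
        // Placeholder endpoint; replace with the ElastiCache configuration endpoint.
        RedisClusterClient clusterClient = RedisClusterClient.create("redis://cluster.example.com:6379");

        // Leave the scheduled (periodic) topology refresh disabled -- the
        // workaround that eliminated the timeouts for us -- while adaptive
        // triggers still refresh the topology on MOVED/ASK redirects and
        // persistent reconnects.
        ClusterTopologyRefreshOptions refreshOptions = ClusterTopologyRefreshOptions.builder()
                .enablePeriodicRefresh(false)
                .enableAllAdaptiveRefreshTriggers()
                .build();

        clusterClient.setOptions(ClusterClientOptions.builder()
                .topologyRefreshOptions(refreshOptions)
                .build());

        // Obtain connections via clusterClient.connect() as usual and call
        // clusterClient.shutdown() on application shutdown.
    }
}
```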