Lettuce not able to reconnect automatically to SSL+authenticated ElastiCache node #1201
Comments
Thanks for the excellent report; it contains all the details needed to trace down the issue. The connected server behaves weirdly. During the reconnect, it first rejects AUTH with an error saying that no password is set, and a few seconds later it answers with NOAUTH. Below are both log events that indicate what's going on. Please ask your AWS/Redis team to fix the issue on the Redis side.
This issue is hard to trace, especially if you didn't have debug logging enabled. It makes sense to log command failures that result from reconnection; otherwise, these issues aren't visible.
{"ts":"2020-01-05T22:57:25.036+08:00","level":"TRACE","thread":"lettuce-epollEventLoop-4-2","logger":"io.lettuce.core.protocol.CommandHandler","message":"[channel=0x3f9c9f7b, /172.23.8.187:41932 -> master.redis.oiwzdu.apse1.cache.amazonaws.com/172.23.11.219:6379, chid=0x2] Buffer: -ERR Client sent AUTH, but no password is set","throwable":{}}
{"ts":"2020-01-05T22:57:30.658+08:00","level":"TRACE","thread":"lettuce-epollEventLoop-4-2","logger":"io.lettuce.core.protocol.CommandHandler","message":"[channel=0x3f9c9f7b, /172.23.8.187:41932 -> master.redis.oiwzdu.apse1.cache.amazonaws.com/172.23.11.219:6379, chid=0x2] Buffer: -NOAUTH Authentication required.","throwable":{}}
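For readers less familiar with the wire format: both `Buffer:` payloads above are RESP error replies, which start with `-` and by convention carry an error prefix as the first word. The following is a minimal sketch of how such a line can be split into prefix and message. The class `RespErrorSketch` and its method names are hypothetical and are not part of Lettuce; it only illustrates how the two replies differ.

```java
// Hypothetical helper for illustration only; this is not Lettuce code.
final class RespErrorSketch {
    final String prefix;   // error prefix, e.g. "ERR" or "NOAUTH"
    final String message;  // remainder of the line after the prefix

    private RespErrorSketch(String prefix, String message) {
        this.prefix = prefix;
        this.message = message;
    }

    // Returns null when the line is not a RESP error reply (does not start with '-').
    static RespErrorSketch parse(String line) {
        if (line == null || !line.startsWith("-")) {
            return null;
        }
        String body = line.substring(1).trim();
        int space = body.indexOf(' ');
        if (space < 0) {
            return new RespErrorSketch(body, "");
        }
        return new RespErrorSketch(body.substring(0, space), body.substring(space + 1));
    }

    public static void main(String[] args) {
        // The two payloads taken from the log events in this thread.
        String[] buffers = {
            "-ERR Client sent AUTH, but no password is set",
            "-NOAUTH Authentication required."
        };
        for (String buffer : buffers) {
            RespErrorSketch error = parse(buffer);
            System.out.println(error.prefix + " | " + error.message);
        }
    }
}
```

The first reply is a generic `ERR` answering the AUTH command itself, while the second uses the dedicated `NOAUTH` prefix and is returned for commands sent on a connection that never authenticated.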
Lettuce now logs failures of asynchronously fired commands during connection activation. Previously, failures during, e.g., a reconnect went unnoticed and could result in subsequent NOAUTH Authentication required errors although the password was provided.
Logging on AUTH failure (and a few other commands) during reconnect is now in place.
Hi @mp911de - thanks for looking at this, and improving the logging. I will try and raise this with AWS Support. I suspect it relates to some kind of race condition where the AWS side starts the server and accepts connections before it has fully enabled/enforced the authentication token, leading to the messed-up state. I certainly understand your perspective. However, a difficulty here from a user perspective is that we have instructed the driver to authenticate using a token/password, yet the driver essentially overrides the user request based on the confusing response from the server, rather than treating the connection as invalid, reconnecting (with authentication enabled) and honouring the user configuration.
This leaves us in a bit of a difficult position in terms of workarounds to force a reconnect. Is there any approach you would suggest to deal with this in the short term?
Thanks for the additional details. By default, Lettuce does not have the notion of an invalid connection when reconnecting. Instead, you can enable PING on connection activation: if the activation PING fails, Lettuce disconnects and re-attempts connecting. Other than that, we have no means to catch command failures during command activation as we assume that activation commands (of which
Can you check whether enabling PING on connection activation can help in your setup?
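To illustrate why a PING on activation helps, here is a toy model of the reconnect sequence described in this thread. It is a sketch under stated assumptions, not Lettuce's implementation: `ReconnectSketch`, `FakeServer`, `FakeConnection` and `connect` are all hypothetical names, the "server" is an in-memory stand-in, and the model assumes the server already requires authentication by the time the next command after AUTH arrives.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model for illustration only; none of these types exist in Lettuce.
final class ReconnectSketch {

    // Stand-in for a server that mishandles AUTH on its first few connections.
    static final class FakeServer {
        private int confusedConnectionsLeft;

        FakeServer(int confusedConnections) {
            this.confusedConnectionsLeft = confusedConnections;
        }

        FakeConnection accept() {
            boolean confused = confusedConnectionsLeft > 0;
            if (confused) {
                confusedConnectionsLeft--;
            }
            return new FakeConnection(confused);
        }
    }

    static final class FakeConnection {
        private final boolean confused;
        private boolean authenticated;

        FakeConnection(boolean confused) {
            this.confused = confused;
        }

        String send(String command) {
            if (command.startsWith("AUTH ")) {
                if (confused) {
                    // Reply seen in the first log event of this thread.
                    return "-ERR Client sent AUTH, but no password is set";
                }
                authenticated = true;
                return "+OK";
            }
            if (!authenticated) {
                // Reply seen in the second log event of this thread.
                return "-NOAUTH Authentication required.";
            }
            return command.equals("PING") ? "+PONG" : "+OK";
        }
    }

    // Connects and authenticates; optionally verifies the connection with PING
    // and retries on a fresh connection when the PING fails.
    // Returns null when no usable connection was obtained within maxAttempts.
    static FakeConnection connect(FakeServer server, String password,
                                  boolean pingBeforeActivate, int maxAttempts,
                                  List<String> log) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            FakeConnection connection = server.accept();
            log.add("attempt " + attempt + ": AUTH -> " + connection.send("AUTH " + password));
            if (!pingBeforeActivate) {
                return connection; // AUTH reply is not checked in this mode
            }
            String pingReply = connection.send("PING");
            log.add("attempt " + attempt + ": PING -> " + pingReply);
            if (pingReply.equals("+PONG")) {
                return connection;
            }
            // PING failed: drop this connection and try again.
        }
        return null;
    }

    public static void main(String[] args) {
        for (boolean ping : new boolean[] {false, true}) {
            List<String> log = new ArrayList<>();
            FakeConnection connection = connect(new FakeServer(1), "secret", ping, 3, log);
            log.add("GET key -> " + connection.send("GET key"));
            System.out.println("pingBeforeActivate=" + ping + " " + log);
        }
    }
}
```

In this model, the connection activated without the PING check keeps answering NOAUTH, whereas the PING check discards the broken connection and the second attempt authenticates normally.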
Thanks a lot for the suggestion! I have tried that out and it seems to work around the problem, which is great. The log for this working is at https://gist.github.com/chadlwilson/bcbf2f13964dd31e0117b16f9de6f073

    @Bean
    @ConditionalOnProperty("spring.redis.ping-before-activate-connection")
    @SuppressWarnings("deprecation") // Suggested in https://github.com/lettuce-io/lettuce-core/issues/1201
    public LettuceClientConfigurationBuilderCustomizer pingBuilderCustomizer() {
        log.info("Redis Connection ping-before-activate enabled");
        return builder -> builder.clientOptions(
            // Default ClientOptions values here were propagated from LettuceClientConfigurationBuilder since
            // it doesn't really allow the ClientOptions builder itself to be injected.
            ClientOptions
                .builder()
                .timeoutOptions(TimeoutOptions.enabled())
                .pingBeforeActivateConnection(true)
                .build()
        );
    }

In the meantime, AWS Support have got back to me and said that they can replicate the issue and have raised it with the ElastiCache team for further discussion/investigation.
So what do you think about the value of un-deprecating this command given this information? :-)
Yeah, makes sense. I filed a ticket for it.
pingBeforeActivateConnection turns out to be useful when using RESP2 in cases where the connection authentication should be delayed. Another case is when the PING should be used as auth PING to test whether a reconnect was successful. Related ticket: #1201.
Bug Report
I raised this initially at spring-projects/spring-boot#19436 however I have managed to get trace logs now, and it seems more suitable to raise here.
Current Behavior
When our AWS ElastiCache primary Redis node is restarted, Lettuce's automatic reconnection doesn't seem to leave connections in an authenticated, usable state. We just get repeated NOAUTH Authentication Required errors on the connections, which don't appear to be recoverable.
It seems I can replicate this reliably by restarting the master node from the AWS Console, but I have not been able to replicate it with a local test using a local Redis within Docker Compose.
Relevant snippet is below - full logs with trace information at https://gist.github.com/chadlwilson/a35bd624775c278dc4bdfe7d2347b8c5
Then repeatedly logs the below on attempts to use the connection -
Expected behavior/code
I would expect the reconnect to be handled cleanly and be able to authenticate properly.
From looking at the logs, it seems that during reconnection, Lettuce receives a response from Redis/AWS ElastiCache that indicates it should no longer send the password/auth token for future attempts.
Environment
5.2.1.RELEASE
5.0.5 (AWS ElastiCache)
Being used within a Spring Boot/Spring Data Redis project
2.2.2.RELEASE
2.2.3.RELEASE
5.2.2.RELEASE
Settings
Possible Solution
None known.