Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Emit Connection Error events #885

Closed
squishypenguin opened this issue Oct 12, 2018 · 9 comments
Closed

Emit Connection Error events #885

squishypenguin opened this issue Oct 12, 2018 · 9 comments
Labels
type: feature A new feature
Milestone

Comments

@squishypenguin
Copy link

squishypenguin commented Oct 12, 2018

This is a feature request, I believe. With 5.1.0 (and using spring-session-data-redis 2.0.5.RELEASE), I'll receive the ConnectionDeactivatedEvent (which isn't at all useful since there is no reason associated with the event) but nothing ever publishes to tell me that there is connection refused, or any other error from redis. We are in the situation where roughly every 90 days, our redis password will be rotated. I need to catch this situation and force a reload of the password cache. However, since I can't get that event from lettuce, I can't handle it. It looks like the watchdog isn't publishing events at all or just not at the lowest point I need.

Example of what I'll see:

tce_1  | 2018-10-12 16:26:52,513 [] [lettuce-nioEventLoop-4-4] WARN  io.lettuce.core.protocol.ConnectionWatchdog  - Cannot reconnect: io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection refused: TRG524396.local/10.0.1.15:6379
tce_1  | 2018-10-12 16:26:52,513 [] [lettuce-nioEventLoop-4-2] WARN  io.lettuce.core.protocol.ConnectionWatchdog  - Cannot reconnect: io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection refused: TRG524396.local/10.0.1.15:6379
tce_1  | 2018-10-12 16:26:52,515 [] [lettuce-nioEventLoop-4-1] WARN  io.lettuce.core.protocol.ConnectionWatchdog  - Cannot reconnect: io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection refused: TRG524396.local/10.0.1.15:6379
tce_1  | 2018-10-12 16:26:59,963 [] [pool-7-thread-1] ERROR org.springframework.scheduling.support.TaskUtils$LoggingErrorHandler  - Unexpected error occurred in scheduled task.
tce_1  | org.springframework.data.redis.RedisSystemException: Redis exception; nested exception is io.lettuce.core.RedisException: Currently not connected. Commands are rejected.
tce_1  | 	at org.springframework.data.redis.connection.lettuce.LettuceExceptionConverter.convert(LettuceExceptionConverter.java:74)
tce_1  | 	at org.springframework.data.redis.connection.lettuce.LettuceExceptionConverter.convert(LettuceExceptionConverter.java:41)
tce_1  | 	at org.springframework.data.redis.PassThroughExceptionTranslationStrategy.translate(PassThroughExceptionTranslationStrategy.java:44)
tce_1  | 	at org.springframework.data.redis.FallbackExceptionTranslationStrategy.translate(FallbackExceptionTranslationStrategy.java:42)
tce_1  | 	at org.springframework.data.redis.connection.lettuce.LettuceConnection.convertLettuceAccessException(LettuceConnection.java:257)
tce_1  | 	at org.springframework.data.redis.connection.lettuce.LettuceSetCommands.convertLettuceAccessException(LettuceSetCommands.java:520)
tce_1  | 	at org.springframework.data.redis.connection.lettuce.LettuceSetCommands.sMembers(LettuceSetCommands.java:245)
tce_1  | 	at org.springframework.data.redis.connection.DefaultedRedisConnection.sMembers(DefaultedRedisConnection.java:550)
tce_1  | 	at org.springframework.data.redis.core.DefaultSetOperations.lambda$members$6(DefaultSetOperations.java:158)
tce_1  | 	at org.springframework.data.redis.core.RedisTemplate.execute(RedisTemplate.java:224)
tce_1  | 	at org.springframework.data.redis.core.RedisTemplate.execute(RedisTemplate.java:184)
tce_1  | 	at org.springframework.data.redis.core.AbstractOperations.execute(AbstractOperations.java:95)
tce_1  | 	at org.springframework.data.redis.core.DefaultSetOperations.members(DefaultSetOperations.java:158)
tce_1  | 	at org.springframework.data.redis.core.DefaultBoundSetOperations.members(DefaultBoundSetOperations.java:152)
tce_1  | 	at org.springframework.session.data.redis.RedisSessionExpirationPolicy.cleanExpiredSessions(RedisSessionExpirationPolicy.java:132)
@mp911de mp911de changed the title Event not published on connection refused or auth error Emit Connection Error events Oct 12, 2018
@mp911de mp911de added the type: feature A new feature label Oct 12, 2018
@mp911de
Copy link
Collaborator

mp911de commented Oct 12, 2018

That's an interesting use case. I turned this ticket into an enhancement to emit an event whenever a connection operation fails (initial connect, reconnect). ConnectionDeactivatedEvent is a signal that is triggered after a transport channel was disconnected and after pending commands were moved to the retry queue.

@squishypenguin
Copy link
Author

When I first implemented the consumption of events, I had thought the ConnectionDeactivatedEvent would give me a reason as to why the disconnect happens. It seems like the that information should be available so maybe that's another enhancement, add a reason to the events, if there is one?

@mp911de
Copy link
Collaborator

mp911de commented Oct 12, 2018

There's not always a reason associated with a disconnect, or at least, things are not directly related to each other. Typically, an IOException occurs on either reading from or writing to the channel. We can receive exceptions while writing write through the write future. This exception is propagated to the code that invoked a particular Redis command.

When reading, we potentially can get hold of an error notification via exceptionCaught(…). After that, we get a notification about the disconnect. Thins run asynchronously and we can mostly guess about the sequence of events. That's different compared to blocking I/O where we can directly get hold of I/O exceptions.

@mp911de
Copy link
Collaborator

mp911de commented Oct 12, 2018

PS: You can register in any case an own ChannelInboundHandler and register it with netty's pipeline to participate in what happens in the I/O layer. This would allow your code to correlate events if that works for you. Check out ClientResources and NettyCustomizer (both are also accessible when using Spring Data Redis).

@squishypenguin
Copy link
Author

I forgot to add that after the redis password is reset and a restart since redis requires that, when lettuce reconnects, instead of getting the "ERR invalid password" exception as I'd expect, I get the "NOAUTH Authentication required". So not only am I not able to detect the connection error, but when the connection is retried (I have the auto reconnect on), the error isn't the one I should be getting, since I should actually get invalid password in this case because I haven't triggered a refresh of password cache. I just tried again with 5.1.1 to make sure this wasn't already addressed.

@mp911de
Copy link
Collaborator

mp911de commented Nov 13, 2018

What wanders on my mind right now is introducing a new event (ReconnectFailedEvent). Reiterating your requirement is to catch reconnect problems. A simple disconnect with a successful reconnect isn't a problem, but persistent reconnect issues is something you want to catch?

ConnectionWatchdog initiates reconnect and there we can get hold of potential I/O exceptions and protocol failures such as an invalid password.

@mp911de mp911de added this to the Backlog milestone Nov 13, 2018
@squishypenguin
Copy link
Author

Yes, a persistent connection issue is what we're looking to detect and so that new event would give us that information.

@mp911de mp911de modified the milestones: Backlog, 5.2.0 Nov 24, 2018
mp911de added a commit that referenced this issue Nov 24, 2018
Lettuce now emits ReconnectFailedEvent if a reconnect attempt fails along the cause and the retry counter. This event allows detection of persistent reconnect failures.
@mp911de
Copy link
Collaborator

mp911de commented Nov 24, 2018

ReconnectFailedEvent is in place now. Care to give 5.2.0.BUILD-SNAPSHOT a spin?

@mp911de mp911de closed this as completed Nov 24, 2018
@squishypenguin
Copy link
Author

Looks good, I was able to see the new ReconnectFailedEvent when I kept my redis down and then it transferred to a ConnectionActivatedEvent when I started redis back up. All as I would expect.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type: feature A new feature
Projects
None yet
Development

No branches or pull requests

2 participants