
Consider topology updates for default Cluster connections #1317

Closed
be-hase opened this issue Jun 22, 2020 · 4 comments
Labels
type: feature A new feature
Comments

be-hase commented Jun 22, 2020

Bug Report

Note: this is similar to the issue I reported before, #1245.

Current Behavior & Input Code

The other day, a Redis cluster node (VM) stopped responding due to a hypervisor failure, and timeout errors began to occur. I expected Lettuce to fail over, but failover did not happen. Only when the node's VM shut down completely did failover succeed.

This Redis cluster is used exclusively for Pub/Sub.

Just like in the earlier issue (#1245), I was able to reproduce it with code like this:

package com.example.demo;

import java.time.Duration;
import java.util.List;

import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RestController;

import io.lettuce.core.RedisURI;
import io.lettuce.core.TimeoutOptions;
import io.lettuce.core.cluster.ClusterClientOptions;
import io.lettuce.core.cluster.ClusterTopologyRefreshOptions;
import io.lettuce.core.cluster.RedisClusterClient;
import io.lettuce.core.cluster.pubsub.StatefulRedisClusterPubSubConnection;
import io.lettuce.core.cluster.pubsub.api.sync.RedisClusterPubSubCommands;

@SpringBootApplication
public class DemoApplication {

    public static void main(String[] args) {
        SpringApplication.run(DemoApplication.class, args);
    }

    @Configuration
    public static class RedisConfiguration {
        private static final Duration TIMEOUT = Duration.ofSeconds(5);

        @Bean(destroyMethod = "shutdown")
        public RedisClusterClient redisClusterClient() {
            final var topologyRefreshOptions =
                    ClusterTopologyRefreshOptions.builder()
                                                 .enablePeriodicRefresh(Duration.ofSeconds(30))
                                                 .enableAllAdaptiveRefreshTriggers()
                                                 .build();

            final var clusterClientOptions =
                    ClusterClientOptions.builder()
                                        .pingBeforeActivateConnection(true)
                                        .validateClusterNodeMembership(false)
                                        .topologyRefreshOptions(topologyRefreshOptions)
                                        .timeoutOptions(TimeoutOptions.enabled(TIMEOUT))
                                        .build();

            final var redisURIs = List.of(
                    RedisURI.builder()
                            ...
                            .build(),
                    RedisURI.builder()
                            ...
                            .build(),
                    RedisURI.builder()
                            ...
                            .build()
            );

            final var redisClusterClient = RedisClusterClient.create(redisURIs);
            redisClusterClient.setOptions(clusterClientOptions);
            redisClusterClient.setDefaultTimeout(TIMEOUT);
            return redisClusterClient;
        }

        @Bean(destroyMethod = "close")
        public StatefulRedisClusterPubSubConnection<String, String> redisClusterConnection(
                RedisClusterClient redisClusterClient
        ) {
            return redisClusterClient.connectPubSub();
        }

        @Bean
        public RedisClusterPubSubCommands<String, String> redisClusterCommands(
                StatefulRedisClusterPubSubConnection<String, String> redisClusterConnection
        ) {
            return redisClusterConnection.sync();
        }
    }

    @RestController
    public static class SampleController {
        private final RedisClusterPubSubCommands<String, String> redisClusterCommands;

        public SampleController(RedisClusterPubSubCommands<String, String> redisClusterCommands) {
            this.redisClusterCommands = redisClusterCommands;
        }

        @GetMapping("/publish")
        public Object publish() {
            return redisClusterCommands.publish("channel", "message:" + System.currentTimeMillis());
        }
    }
}

RedisClusterClient manages the nodes it connects to via the CLUSTER NODES command, so it was difficult to use Toxiproxy as before. Instead, I reproduced the timeout situation by running a heavy Lua script that blocks the node:

local tempKey = "temp-key"

redis.call("SET", tempKey, "1")
redis.call("PEXPIRE", tempKey, 300000)

while true do
    local pttl = redis.call("PTTL", tempKey)
    if pttl == 0 then
        break;
    end
end

In this situation, there is evidence that topology refresh is being performed:

2020-06-22 23:00:00.638  WARN 96843 --- [ecutorLoop-1-10] i.l.c.c.topology.ClusterTopologyRefresh  : Unable to connect to [10.231.171.186:7000]: Cannot initialize channel (PING before activate). Command timed out after 1 minute(s)

However, the PUBLISH command keeps failing with a timeout error; no failover occurs:

io.lettuce.core.RedisCommandTimeoutException: Cannot initialize channel (PING before activate). Command timed out after 1 minute(s)
	at io.lettuce.core.ExceptionFactory.createTimeoutException(ExceptionFactory.java:63) ~[lettuce-core-5.2.2.RELEASE.jar:5.2.2.RELEASE]
	at io.lettuce.core.PlainChannelInitializer.lambda$pingBeforeActivate$1(PlainChannelInitializer.java:141) ~[lettuce-core-5.2.2.RELEASE.jar:5.2.2.RELEASE]
	at io.netty.util.concurrent.PromiseTask.runTask(PromiseTask.java:98) ~[netty-common-4.1.50.Final.jar:4.1.50.Final]
	at io.netty.util.concurrent.PromiseTask.run(PromiseTask.java:106) ~[netty-common-4.1.50.Final.jar:4.1.50.Final]
	at io.netty.util.concurrent.DefaultEventExecutor.run(DefaultEventExecutor.java:66) ~[netty-common-4.1.50.Final.jar:4.1.50.Final]
	at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989) ~[netty-common-4.1.50.Final.jar:4.1.50.Final]
	at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) ~[netty-common-4.1.50.Final.jar:4.1.50.Final]
	at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) ~[netty-common-4.1.50.Final.jar:4.1.50.Final]
	at java.base/java.lang.Thread.run(Thread.java:834) ~[na:na]

Expected behavior/code

Proper failover should be performed even in such a situation.
Alternatively, please suggest a good workaround; pingBeforeActivateConnection had no effect.

...By the way, if I execute the PUBLISH command using StatefulRedisClusterConnection instead of StatefulRedisClusterPubSubConnection, failover works properly.
Is it wrong to publish using StatefulRedisClusterPubSubConnection? If so, my apologies.

Environment

  • Lettuce version(s): 5.2.2.RELEASE
  • Redis version: 4.0.10

Possible Solution

...

Additional context

...

@be-hase be-hase added the type: bug A general bug label Jun 22, 2020
mp911de commented Jun 23, 2020

Let me rephrase your report to see whether I understood it correctly. With no failover you mean that sending PUBLISH commands using the Pub/Sub connection ran into timeouts because the remote peer didn't respond in time.

There should be no difference between StatefulRedisClusterConnection and StatefulRedisClusterPubSubConnection regarding connection infrastructure, however the command routing between StatefulRedisClusterConnection and StatefulRedisClusterPubSubConnection is different. The regular cluster connection uses the channel name as routing key and sends the PUBLISH message towards the cluster node that serves the key/slot hash. While the channel name isn't exactly a key, this is the cause why you see a different behavior.

The Cluster Pub/Sub connection does not consider key routing at all and therefore it sends PUBLISH commands to the default connection.
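The slot routing described above can be illustrated with the CRC16-based hash-slot computation from the Redis Cluster specification. This is a self-contained sketch, not Lettuce's actual routing code (Lettuce ships an equivalent in `io.lettuce.core.cluster.SlotHash`); a regular cluster connection effectively treats the channel name as a key and routes PUBLISH to the node serving that slot:

```java
import java.nio.charset.StandardCharsets;

// Sketch of the Redis Cluster hash-slot computation:
// slot = CRC16(key) mod 16384, per the Redis Cluster specification.
public class ClusterSlot {

    // CRC16-CCITT (XModem variant), the polynomial Redis Cluster uses
    static int crc16(byte[] bytes) {
        int crc = 0x0000;
        for (byte b : bytes) {
            crc ^= (b & 0xFF) << 8;
            for (int i = 0; i < 8; i++) {
                crc = ((crc & 0x8000) != 0) ? ((crc << 1) ^ 0x1021) : (crc << 1);
                crc &= 0xFFFF; // keep the register at 16 bits
            }
        }
        return crc;
    }

    static int slotOf(String key) {
        return crc16(key.getBytes(StandardCharsets.US_ASCII)) % 16384;
    }

    public static void main(String[] args) {
        // "123456789" is the worked example from the Redis Cluster spec:
        // CRC16 = 0x31C3, so the slot is 12739.
        System.out.println(slotOf("123456789")); // prints 12739
    }
}
```

With key routing, two different channel names can land on two different nodes; the Pub/Sub connection skips this computation entirely and always uses the default connection.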

When establishing cluster connections, we attempt to connect to one of the bootstrap nodes configured through RedisURI. RedisClusterClient accepts multiple bootstrap nodes to obtain the initial cluster topology view. The client remains connected to the first successfully connected bootstrap node. This connection is used to send key-less commands (such as INFO). To understand why no failover happened, we need to dig a bit further.

The client refreshes its topology view according to the configuration. Once this happens, any connections to nodes that are no longer part of the cluster are closed, independent from whether the node connection was healthy or not. So in case of key routing, the client might be reconfigured with a new target server which is then used for subsequent commands.

The default connection always remains active until the client sees a connection reset or a FIN packet. Since the hypervisor crashed and the connection never saw a termination, the client kept sending packets without receiving a response.

So in terms of resiliency, the default connection is a bottleneck as it does not participate in topology refresh until the connection sees a disconnect.

I'm not sure we can really do something about it. While we could enable key routing for PUBLISH, we would also need to consider all other cases. An idea could be to use a random node for key-less commands. For subscriptions, we need to stick to a single node when subscribing so that we can unsubscribe later, and also fail over and re-subscribe.

Happy to discuss ideas.

be-hase commented Jun 23, 2020

Let me rephrase your report to see whether I understood it correctly. With no failover you mean that sending PUBLISH commands using the Pub/Sub connection ran into timeouts because the remote peer didn't respond in time.

Yes.
Even after the cluster topology is refreshed, the connection to the crashed node remains active, and timeout errors persist for a long time.

however the command routing between StatefulRedisClusterConnection and StatefulRedisClusterPubSubConnection is different

So in terms of resiliency, the default connection is a bottleneck as it does not participate in topology refresh until the connection sees a disconnect.

After I reported this issue, I read the Lettuce code thoroughly and came to understand this part.
(Thank you for explaining; it deepened my understanding further.)

Hmm.
In my opinion, it would be good to use StatefulRedisClusterConnection for the PUBLISH command.
(Of course, using a random node in StatefulRedisClusterPubSubConnection might also be a good idea, but that seems to require more consideration.)

But SUBSCRIBE is the tricky part.
How about one of the following options?

When a topology refresh finds a connection to a node that is flagged FAIL:

  1. Lettuce re-subscribes the connection automatically.
  2. Lettuce throws an exception from that connection. (If necessary, the user can catch the exception and re-subscribe.)
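Option 2 could look roughly like this from the application's side. This is a plain-JDK sketch, not Lettuce API: `TimeoutException` stands in for `io.lettuce.core.RedisCommandTimeoutException`, and the supplier stands in for the real subscribe call on a Pub/Sub connection:

```java
import java.util.function.Supplier;

// Sketch of option 2 from the caller's side: if the client surfaced an
// exception from a stale Pub/Sub connection, the application could catch
// it and retry the subscription (names here are hypothetical stand-ins).
public class ResubscribeSketch {

    static class TimeoutException extends RuntimeException {}

    // Run 'subscribe' and retry up to maxAttempts times when it times out.
    static <T> T retryOnTimeout(Supplier<T> subscribe, int maxAttempts) {
        TimeoutException last = null;
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            try {
                return subscribe.get();
            } catch (TimeoutException e) {
                last = e; // stale connection: a real caller might reconnect here
            }
        }
        throw last;
    }

    public static void main(String[] args) {
        int[] calls = {0};
        String result = retryOnTimeout(() -> {
            if (++calls[0] < 3) throw new TimeoutException(); // first two attempts "time out"
            return "subscribed";
        }, 5);
        System.out.println(result + " after " + calls[0] + " attempts"); // prints "subscribed after 3 attempts"
    }
}
```

Option 1 would move this retry loop inside the client, which is less burden on users but needs a well-defined policy for when to give up.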

mp911de commented Jun 23, 2020

Right now, the default connection isn't aware of its nodeId. I'm considering changing that so that the default connection knows which cluster node it is connected to. Once the cluster topology is refreshed and nodes get updated, the default connection checks whether it is still part of the topology. If not, we would issue a reconnect and the default connection would connect to a random node.
That way we always guarantee there is a node to subscribe to, and with the default connection handle we have a place that allows re-subscription upon disconnect.
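The proposed membership check boils down to simple logic, sketched below with hypothetical names (the real implementation would live inside Lettuce's cluster connection facade):

```java
import java.util.Set;

// Sketch of the proposed check: after a topology refresh, a default
// connection that knows its nodeId can decide whether it must reconnect.
// All names here are hypothetical, not Lettuce API.
public class DefaultConnectionCheck {

    static boolean mustReconnect(String connectedNodeId, Set<String> topologyNodeIds) {
        // Reconnect when the node we are pinned to is no longer a cluster member.
        return connectedNodeId == null || !topologyNodeIds.contains(connectedNodeId);
    }

    public static void main(String[] args) {
        Set<String> topology = Set.of("node-a", "node-b");
        System.out.println(mustReconnect("node-a", topology)); // false: still a member
        System.out.println(mustReconnect("node-c", topology)); // true: removed, reconnect to a random node
    }
}
```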

be-hase commented Jun 23, 2020

I'm considering changing that so that the default connection knows which cluster node it is connected to

Oh, that sounds good. I hope it works, and that it helps other people with similar problems.

I had also tested by shutting down the Redis node, but that could not reproduce the situation where Redis fails to return a response.
It was an interesting opportunity to get to know Lettuce's internal implementation.

@mp911de mp911de changed the title Cannot failover when timeout error occurs for redis pubsub(StatefulRedisClusterPubSubConnection) Consider topology updates for default Cluster connections Jun 24, 2020
@mp911de mp911de added target: Lettuce 6 type: feature A new feature and removed type: bug A general bug labels Jun 24, 2020
@mp911de mp911de added this to the 6.0 RC1 milestone Jun 24, 2020
mp911de added a commit that referenced this issue Jul 30, 2020
… of the cluster #1317

If the default cluster connection points to a node that is no longer part of the cluster, then the connection is reset to point to a cluster member again. Cluster connection facades are therefore aware of their nodeId, and once the Partitions get updated, the facade verifies cluster membership. The check does not consider failure flags, only cluster membership.

The connection reset is tied to ClusterClientOptions.isCloseStaleConnections which can be disabled on demand.
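If that is undesirable, the behavior could presumably be switched off on the client options builder. A hedged configuration fragment (assuming the builder method corresponding to `isCloseStaleConnections` is named `closeStaleConnections`):

```java
// Hypothetical sketch: opting out of the stale-connection reset,
// which would keep the pre-fix behavior for the default connection.
ClusterClientOptions options = ClusterClientOptions.builder()
        .closeStaleConnections(false) // assumed default: true (stale connections are closed)
        .build();
```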
@mp911de mp911de closed this as completed Jul 30, 2020