-
Notifications
You must be signed in to change notification settings - Fork 986
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Cluster topology lookup should not replaces self-node details with host and port from RedisURI when RedisURI is load balancer #712
Comments
Setting the own ( I'd need to investigate more in-depth whether we still require this behavior. Did you try removing the line and run the tests? |
You can check the CI results after remove that line. If removing this breaks other features, what do you think introducing a field to indicate one node is actually load balancer? Then we can skip replacing URI. |
Sorry, that's not how open source contributions work. If you change/break something, it's your responsibility to find out that something broke and report it rather than employ the project maintainer to monitor PR/CI builds. However, maybe we can introduce node aliasing for a smoother resolution instead of having a single URI. |
Sorry that I haven't made myself clear. I put the broken test results here to help you understand the effects of just removing this since you said that you would investigate whether we need this feature. That doesn't mean that's the final pull request I want to make. Before I draft the final fix, I want to hear from you about investigation result. Then I can make sure that I go in the right direction. Thanks. |
Alright, I see. I think introducing an optional node alias (or aliases) might be the way to go. The tests which broke perform a lookup with the RedisURI they used to connect to the node. Redis reports within the tests A lookup by host/port would consider the alias as well and we share the view of the Redis Cluster as well as from the seed-node perspective. |
I suppose you mean something like this: class RedisClusterNode{
private RedisURI uri; // for ip and port got from redis
private Set<RedisURI> aliases; // for public domain host and port
} Then in node topology views of type But I don't think this could distinguish a normal node from load balance node, i suggest to add one bool to indicate whether a RedisURI is load balancer, then we skip replacing self. What do you think? |
Please take a look at my PR. Automatic triggered CI failed some tests, but when I tried to rebuild it, it passed. I'm confused with build difference between pr and push. |
TravisCI sometimes runs into issues. The changes you introduced are not related to adding an alias/aliases within the nodes of a partition but rather you're changing the URI. |
Yes, please check my below comment. I think introducing alias won't solve this issue. But I could help do this if this is necessary for Docker/NATted use |
Lettuce now provides a cluster node alias to better cope with load balancers or alternate cluster names. We are using the Redis reported hostname as URI (was previously the seed node URI) and add the seed node URI as alias. This way we make sure to connect the appropriate host directly without connecting the balancer and receiving redirects. Redirects can happen in an environment where the load balancer forwards connections to random cluster master nodes.
Lettuce now provides a cluster node alias to better cope with load balancers or alternate cluster names. We are using the Redis reported hostname as URI (was previously the seed node URI) and add the seed node URI as alias. This way we make sure to connect the appropriate host directly without connecting the balancer and receiving redirects. Redirects can happen in an environment where the load balancer forwards connections to random cluster master nodes.
That's in place now, snapshot builds are available. Care to give it a spin? |
Worked perfectly. Thanks. |
Today it's common that managed Redis Cluster expose a load balancer as main connection point that randomly forward requests to real Redis node. Popular ones are Azure Redis Cache and AWS ElasticCache. When Lettuce load topology view by calling
CLUSTER NODES
, the node with tagSELF
will be replaced with load balancer URL.Lettuce Version: 4.4.0.Final
This leads to 2 problems:
Load Balancer is considered as a normal node with slots assigned, but it forward requests randomly. So we will get randomly 'MOVED' response when request is forwarded into certain node.
When we try to connect the node being replaced and cluster node membership verification enabled(default), RedisException with message
This connection point is not known in the cluster view
will be thrown.The symptom cannot be stably reproduced since we rely on
PartitionsConsensus.HEALTHY_MAJORITY
to randomly return topology view. Only the view containing replaced self node the issue could be reproduced.I believe this is same as #312, but the fix is not working. The root cause is inside
NodeTopologyView# Constructor()
@mp911de I don't think simply removing this line is a good fix. I want to understand the motivation behind replacing ip and port with RedisURL. After that, I'm happy to provide a pull request to fix this.
The text was updated successfully, but these errors were encountered: