ZOOKEEPER-3758: Leader reachability check fails with single address #1288
…er Quorum address
Since ZooKeeper 3.6.0 we can specify multiple addresses for each ZooKeeper server instance (this can increase availability when multiple physical network interfaces can be used in parallel in the cluster). ZooKeeper will perform ICMP ECHO requests or try to establish a TCP connection on port 7 (Echo) of the destination host in order to find the reachable addresses. This should happen only if the user provides multiple addresses in the configuration; if a single address is used, ZooKeeper shouldn't send any ICMP requests.
This works as expected for the leader election connections, but in this Jira issue we found a bug: the reachability check was performed even with a single address when the Follower tries to connect to the newly elected Leader.
The fix follows the same approach we discussed for the election protocol before (see ZOOKEEPER-3698): we avoid the reachability check for single addresses. Also, when we have multiple addresses and none of them can be reached, we still start to connect to all addresses in parallel.
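For readers unfamiliar with the probe mentioned in the description: the "ICMP ECHO, or a TCP connection on port 7" behavior is what the JDK documents for `InetAddress.isReachable(int)`. Below is a minimal sketch of such a probe. The class name `ReachabilityProbe`, the method name, and the choice to treat unresolved hosts and I/O errors as "not reachable" are assumptions made for illustration, not the code in this patch.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;

// Hypothetical helper, for illustration only.
final class ReachabilityProbe {

    // Returns true if the host behind the address answers the JDK reachability probe
    // within timeoutMs. The port of the socket address is not used by the probe.
    static boolean isReachable(InetSocketAddress address, int timeoutMs) {
        // getAddress() is null when the hostname has not been resolved.
        InetAddress ip = address.getAddress();
        if (ip == null) {
            return false;
        }
        try {
            // The JDK typically sends an ICMP ECHO request if it has the privilege,
            // otherwise it tries a TCP connection to port 7 (Echo).
            return ip.isReachable(timeoutMs);
        } catch (IOException e) {
            // Assumption of this sketch: a network error counts as "not reachable".
            return false;
        }
    }
}
```

A call could look like `ReachabilityProbe.isReachable(new InetSocketAddress("zk1.example.com", 2888), 500)`, where the host name and timeout are placeholders.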
@@ -303,7 +303,9 @@ protected void connectToLeader(MultipleAddresses multiAddr, String hostname) thr
this.leaderAddr = multiAddr;
Set<InetSocketAddress> addresses;
if (self.isMultiAddressReachabilityCheckEnabled()) {
Isn't this disabled by default?
No, unfortunately not. There are two separate parameters:
- multiAddress.enabled: enables or disables the whole multi-address feature; it is set to false by default
- multiAddress.reachabilityCheckEnabled: can be used to disable the ICMP messages when multiple addresses are used. It is enabled by default, because disabling it makes the multiAddress recovery unreliable (ZK will not really be able to select among the addresses)
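To make the two defaults concrete, here is a small sketch that reads both flags from a `Properties` object. The full property names with the `zookeeper.` prefix are my understanding of how these options are exposed as Java system properties and should be treated as an assumption; the class `MultiAddressFlags` is hypothetical and only mirrors the defaults stated above.

```java
import java.util.Properties;

// Hypothetical helper, for illustration only.
final class MultiAddressFlags {

    // Assumed system property names; verify against the admin guide of your version.
    static final String ENABLED_KEY = "zookeeper.multiAddress.enabled";
    static final String REACHABILITY_CHECK_KEY = "zookeeper.multiAddress.reachabilityCheckEnabled";

    // The multi-address feature as a whole is off unless explicitly enabled.
    static boolean multiAddressEnabled(Properties props) {
        return Boolean.parseBoolean(props.getProperty(ENABLED_KEY, "false"));
    }

    // The reachability check is on unless explicitly disabled.
    static boolean reachabilityCheckEnabled(Properties props) {
        return Boolean.parseBoolean(props.getProperty(REACHABILITY_CHECK_KEY, "true"));
    }
}
```

With no properties set, the feature is off but the check is on, which is why the guard `self.isMultiAddressReachabilityCheckEnabled()` alone does not protect single-address setups. Passing `System.getProperties()` would read the flags of the running JVM.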
There is already logic in the MultipleAddresses.getReachableOrOne() method that skips the ICMP check if only a single address is provided:
zookeeper/zookeeper-server/src/main/java/org/apache/zookeeper/server/quorum/MultipleAddresses.java
Lines 151 to 158 in a5a4743
public InetSocketAddress getReachableOrOne() {
    InetSocketAddress address;
    // if there is only a single address provided then we don't do any reachability check
    if (addresses.size() == 1) {
        return getOne();
    }
Basically the same logic was missing from the MultipleAddresses.getAllReachableAddresses method.
I should have thought of it, but forgot when I fixed ZOOKEEPER-3698.
@nkalmar this is a good candidate for 3.6.1, please take a look
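The selection rule discussed in this thread (skip the check for a single address, otherwise keep the reachable ones, and fall back to all addresses if none respond) can be sketched as follows. This is a simplified illustration, not the patched `MultipleAddresses` code: the class `ReachableAddressSelection`, the method name, and the pluggable `Predicate` standing in for the network probe are all assumptions.

```java
import java.net.InetSocketAddress;
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.function.Predicate;

// Hypothetical helper, for illustration only.
final class ReachableAddressSelection {

    // Chooses the addresses a follower should try when connecting to the leader.
    static Set<InetSocketAddress> selectAddresses(
            Set<InetSocketAddress> configured,
            Predicate<InetSocketAddress> isReachable) {
        // A single configured address is returned as-is: no probe is sent at all.
        if (configured.size() == 1) {
            return new LinkedHashSet<>(configured);
        }
        // With several addresses, keep only those that pass the probe.
        Set<InetSocketAddress> reachable = new LinkedHashSet<>();
        for (InetSocketAddress address : configured) {
            if (isReachable.test(address)) {
                reachable.add(address);
            }
        }
        // If nothing answered, still try every address rather than giving up.
        return reachable.isEmpty() ? new LinkedHashSet<>(configured) : reachable;
    }
}
```

Taking the probe as a parameter keeps the rule testable without network access; in a real server the predicate would wrap the actual reachability check with its configured timeout.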
LGTM
LGTM
@nkalmar can you please merge?
I messed up my python env, and the script is not running. I'll try to fix it; I can merge after.
@eolivelli @nkalmar thanks for checking this so quickly!
Since ZooKeeper 3.6.0 we can specify multiple addresses for each ZooKeeper server instance (this can increase availability when multiple physical network interfaces can be used in parallel in the cluster). ZooKeeper will perform ICMP ECHO requests or try to establish a TCP connection on port 7 (Echo) of the destination host in order to find the reachable addresses. This should happen only if the user provides multiple addresses in the configuration; if a single address is used, ZooKeeper shouldn't send any ICMP requests. This works as expected for the leader election connections, but in this Jira issue we found a bug: the reachability check was performed even with a single address when the Follower tries to connect to the newly elected Leader. The fix follows the same approach we discussed for the election protocol before (see ZOOKEEPER-3698): we avoid the reachability check for single addresses. Also, when we have multiple addresses and none of them can be reached, we still start to connect to all addresses in parallel.
Author: Mate Szalay-Beko <[email protected]>
Reviewers: Enrico Olivelli <[email protected]>, Norbert Kalmar <[email protected]>
Closes #1288 from symat/ZOOKEEPER-3758
(cherry picked from commit b1105cc)
Signed-off-by: Norbert Kalmar <[email protected]>
Pushed to master and branch-3.6, thanks @symat