-
Notifications
You must be signed in to change notification settings - Fork 7.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[ZOOKEEPER-3095] Connect string fix for non-existent hosts #579
[ZOOKEEPER-3095] Connect string fix for non-existent hosts #579
Conversation
ZKPatch: eda58d9970c76831046ddc45251c9b110856836e (extract)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks good @mjeelanimsft just minor formatting
host = strtok_r(0, ",", &strtok_last); | ||
} | ||
#endif | ||
} | ||
if (avec->count == 0) { | ||
rc = ZSYSTEMERROR; // not a single host resolved |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
4 space indent ;)
/* Checks that a non-existent host will not block the connection*/ | ||
void testNonexistentHost() { | ||
char hosts[] = "jimmy:5555,127.0.0.1:22181"; | ||
watchctx_t ctx; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
4 space indent
@@ -77,7 +77,7 @@ fi | |||
|
|||
if [ "x${base_dir}" == "x" ] | |||
then | |||
zk_base="../../" | |||
zk_base="../../../" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this correct? how did this work before?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
and is it related to this change? ;)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, seems the old code is not working if the base_dir is not specified.
We were using this to test the change manually, so it's related from testing purpose.
hey @mjeelanimsft when i did the commit, your email address came up wrong. i think i fixed it, but you might want to check your git config. thanx for the submission! |
Thanks for the commit @breed ! My email was indeed not set properly; thanks for letting me know - I've fixed it now. |
Any chance this can be backported to 3.5.x? We're hitting this in production (ZK runs in Kubernetes, and DNS doesn't resolve for pods that are down.) cc: @igorcanadi |
@tudor No problem. Would you mind cherry picking to branch-3.5 and submitting a pull request? |
The change is backported from Jeelani Mohamed Abdul Khader <[email protected]>'s apache#579 change. Code changes to not fail the session initiation if the DNS is not able to resolve the hostname of one of the servers in the Zookeeper ensemble. The Zookeeper client resolves all the hostnames in the ensemble while establishing the session. In Kubernetes environment with coreDNS, the hostname entry gets removed from coreDNS during the POD restarts. Though we can manipulate the coreDNS settings to delay the removal of the hostname entry from DNS, we don't want to leave any race where Zookeeper client is trying to establish a session and it fails because the DNS temporarily is not able to resolve the hostname. So as long as one of the servers in the ensemble is able to be DNS resolvable, should we not fail the session establishment with hard error and instead try to establish the connection with one of the other servers
… DNS does not resolve one of the servers in the zk ensemble The change is backported from Jeelani Mohamed Abdul Khader <mjeelanidevvm3360.prn2.facebook.com>'s #579 change done in "ZOOKEEPER-3095: Connect string fix for non-existent hosts" Code changes to not fail the session initiation if the DNS is not able to resolve the hostname of one of the servers in the Zookeeper ensemble. Some Background: The Zookeeper client resolves all the hostnames in the ensemble while establishing the session. In Kubernetes environment with coreDNS, the hostname entry gets removed from coreDNS during the POD restarts. Though we can manipulate the coreDNS settings to delay the removal of the hostname entry from DNS, we don't want to leave any race where Zookeeper client is trying to establish a session and it fails because the DNS temporarily is not able to resolve the hostname. So as long as one of the servers in the ensemble is able to be DNS resolvable, should we not fail the session establishment with hard error and instead try to establish the connection with one of the other servers Author: Suhas Dantkale <[email protected]> Reviewers: Enrico Olivelli <[email protected]>, Mate Szalay-Beko <[email protected]>, Damien Diederen <[email protected]> Closes #1259 from suhasdantkale/suhas/ZOOKEEPER-3723
ZKPatch: eda58d9970c76831046ddc45251c9b110856836e (extract) Author: Jeelani Mohamed Abdul Khader <[email protected]> Reviewers: Benjamin Reed <[email protected]> Closes apache#579 from mjeelanimsft/connect-string-fix-for-non-existent-hosts
ZKPatch: eda58d9970c76831046ddc45251c9b110856836e (extract)