All proxies leave the etcd2 cluster upon quorum loss ?? (4 nodes, version 2.0.12) #3580
Comments
From the line that you pointed to me Moreover, could you check the member list in your etcd cluster?
@kayrus Kindly ping
@yichengq, interesting: it looks like the gist cut off the "to" part of that line, which read something like [http://127.0.0.1:2380].
The bug described here happened in CoreOS 723.3.
The story here is that the proxy always refreshes its target endpoints to the advertised client URLs listed at /v2/members on one random etcd member. Considering you started some faulty server, and it may have In conclusion, I think this was caused by misoperation.
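To make the refresh behaviour described above easier to check by hand, here is a minimal sketch that pulls the advertised client URLs out of a /v2/members response and flags any that point at a loopback address (like the 127.0.0.1 URL mentioned earlier in the thread). The sample payload, the member names and the helper names `client_urls` and `loopback_urls` are made up for illustration; this is not how the proxy itself is implemented.

```python
import json
from urllib.parse import urlparse

# Hypothetical body of a GET /v2/members response. The member names and
# addresses are invented for this example.
SAMPLE_MEMBERS = json.dumps({
    "members": [
        {"id": "a1", "name": "node1",
         "peerURLs": ["http://10.0.0.1:2380"],
         "clientURLs": ["http://10.0.0.1:2379"]},
        {"id": "b2", "name": "faulty",
         "peerURLs": ["http://127.0.0.1:2380"],
         "clientURLs": ["http://127.0.0.1:2379"]},
    ]
})

LOOPBACK_HOSTS = {"localhost", "127.0.0.1", "::1"}


def client_urls(members_json):
    """Collect every advertised client URL from a /v2/members payload."""
    payload = json.loads(members_json)
    urls = []
    for member in payload.get("members", []):
        urls.extend(member.get("clientURLs") or [])
    return urls


def loopback_urls(urls):
    """Return the URLs whose host is a loopback address."""
    return [u for u in urls if urlparse(u).hostname in LOOPBACK_HOSTS]


if __name__ == "__main__":
    urls = client_urls(SAMPLE_MEMBERS)
    print("advertised client URLs:", urls)
    print("loopback (unreachable from other hosts):", loopback_urls(urls))
```

A proxy on another machine that picked up a loopback client URL would end up talking to itself, so any entry reported by `loopback_urls` is worth investigating in the member list.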
@xiang90 asked the customer for detailed logs: https://groups.google.com/forum/#!topic/coreos-user/OuqvJIRAtho
@domq Any update?
I am closing this due to low activity. I think @yichengq has an answer.
@domq
I've copied this from Google Groups:
Hello fellow CoreOS users, we[1] are busy setting up a CoreOS cluster on bare metal (60 nodes or so) for a university in Switzerland.
We ran into a somewhat worrying failure scenario with etcd2 version 2.0.12, and a cluster of 4 nodes. An excerpt of the logs of node #6 (which was configured as a proxy) is at https://gist.github.com/domq/e23e08fab098d915f88f.
From what I can figure out, the following happened:
When we restored the quorum today[2], node #6 stayed in that same state. Restarting it after purging /var/lib/etcd2 kicked it back into shape, I think. All other proxy nodes met the same fate, i.e. fleetctl only shows them back in the cluster after we run "systemctl stop etcd2; rm -rf /var/lib/etcd2/proxy; systemctl start etcd2" on them.
Could someone please explain whether event 4. above is a bug?
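For anyone who has to repeat the workaround quoted above on many proxy nodes, the sequence can be sketched as a small script. The three commands are taken verbatim from the post; the function name `reset_proxy` and the dry-run default are assumptions added here so the steps can be reviewed before anything is deleted.

```python
import subprocess

# The three steps quoted in the post, in order.
RECOVERY_COMMANDS = [
    ["systemctl", "stop", "etcd2"],
    ["rm", "-rf", "/var/lib/etcd2/proxy"],
    ["systemctl", "start", "etcd2"],
]


def reset_proxy(dry_run=True):
    """Return the recovery commands; run them only when dry_run is False."""
    for cmd in RECOVERY_COMMANDS:
        if not dry_run:
            # check=True stops the sequence at the first failing step.
            subprocess.run(cmd, check=True)
    return [" ".join(cmd) for cmd in RECOVERY_COMMANDS]


if __name__ == "__main__":
    for line in reset_proxy(dry_run=True):
        print(line)
```

Calling `reset_proxy(dry_run=False)` needs root on the proxy node and discards the proxy's cached state, so it only makes sense once the cluster has quorum again.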
@yichengq @xiang90 probably relates to #3215