🐛 Prevents reconcileEtcdMember to remove etcd members when etcd starts slowly #3919
/milestone v0.3.11
/test pull-cluster-api-e2e-full-release-0-3
/lgtm
verified that this also fixes the issue with the Azure provider
/lgtm
/approve
/milestone v0.3.11
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: vincepri The full list of commands accepted by this bot can be found here. The pull request process is described here
What this PR does / why we need it:
This PR prevents reconcileEtcdMember from removing etcd members when etcd starts slowly.
A slow etcd member start, which happens as the number of etcd members in the cluster grows, lengthens the time between the member Add and the actual start of the etcd pod. If reconcileEtcdMember ran during this window, the member was removed, because the name of a member that has been added but not yet started is empty and therefore matches no node.
This PR prevents the problem by ignoring etcd members without a name in reconcileEtcdMember. As a further safeguard, reconcileEtcdMember is now a no-op while any machines are still provisioning (without a noderef).
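
The two guards described above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the code in this PR: the types etcdMember and machine and the function membersToRemove are hypothetical stand-ins, and the real reconcileEtcdMember works against the cluster API and etcd client types instead.

```go
package main

import "fmt"

// etcdMember is a hypothetical, simplified view of an etcd member.
// Name stays empty between member Add and the actual start of the etcd pod.
type etcdMember struct {
	ID   uint64
	Name string
}

// machine is a hypothetical stand-in for a control plane machine.
// NodeName is empty while the machine is still provisioning (no noderef).
type machine struct {
	Name     string
	NodeName string
}

// membersToRemove returns the IDs of etcd members whose name matches no node.
// It applies the two guards from the description:
//  1. if any machine is still provisioning, do nothing;
//  2. members without a name (added but not yet started) are ignored.
func membersToRemove(members []etcdMember, machines []machine) []uint64 {
	nodeNames := make(map[string]struct{}, len(machines))
	for _, m := range machines {
		if m.NodeName == "" {
			// Safeguard: a machine is still provisioning, so treat
			// the whole reconciliation as a no-op.
			return nil
		}
		nodeNames[m.NodeName] = struct{}{}
	}

	var remove []uint64
	for _, member := range members {
		if member.Name == "" {
			// Added but not yet started: cannot be matched to a node yet.
			continue
		}
		if _, ok := nodeNames[member.Name]; !ok {
			remove = append(remove, member.ID)
		}
	}
	return remove
}

func main() {
	members := []etcdMember{
		{ID: 1, Name: "node-a"},
		{ID: 2, Name: ""},          // slow start, name not yet set
		{ID: 3, Name: "node-gone"}, // no matching node
	}
	ready := []machine{{Name: "m-a", NodeName: "node-a"}}

	fmt.Println(membersToRemove(members, ready)) // [3]

	// Same members, but one machine has no noderef yet: nothing is removed.
	provisioning := append(ready, machine{Name: "m-b"})
	fmt.Println(membersToRemove(members, provisioning)) // []
}
```

In this sketch the unnamed member (ID 2) survives in both calls, which is the behavior the PR is after; only a member with a non-empty name that matches no node is a removal candidate, and only when every machine already has a node.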