address commits
fabriziopandini committed Jan 8, 2023
1 parent feadb62 commit 8f52b5f
Showing 1 changed file with 4 additions and 4 deletions.
8 changes: 4 additions & 4 deletions docs/proposals/20191017-kubeadm-based-control-plane.md
@@ -473,17 +473,17 @@ When `MaxSurge` is set to 0 the rollout algorithm is as follows:

  - Following rules should be satisfied in order to start remediation
  - One of the following apply:
- - The cluster MUST one machine not yet initialized (the failure happens before KCP reaches the initialized state)
+ - The cluster MUST not be initialized yet (the failure happens before KCP reaches the initialized state)
  - The cluster MUST have at least two control plane machines, because this is the smallest cluster size that can be remediated.
- - Previous remediation (delete and re-create) MUST have been completed. This rule prevents KCP to remediate more machine while the
+ - Previous remediation (delete and re-create) MUST have been completed. This rule prevents KCP to remediate more machines while the
  replacement for the previous machine is not yet created.
  - The cluster MUST have no machines with a deletion timestamp. This rule prevents KCP taking actions while the cluster is in a transitional state.
  - Remediation MUST preserve etcd quorum. This rule ensures that we will not remove a member that would result in etcd
  losing a majority of members and thus become unable to field new requests.

- - Additionally following opt-in safeguard will be put in place:
+ - Additionally following opt-in safeguards will be put in place:
  - If we are remediating the same machine (delete, re-create, replacement machine gets unhealthy), it will be possible
- to define a maximum number of retries, thus preventing un-necessary load on infrastructure provider e.g. in case of quota problems.
+ to define a maximum number of retries, thus preventing unnecessary load on infrastructure provider e.g. in case of quota problems.
  - If we are remediating the same machine (delete, re-create, replacement machine gets unhealthy), it will be possible
  to define a delay between each retry, thus allowing the infrastructure provider to stabilize in case of temporary problems.
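
For readers following the updated wording, the remediation rules and the two opt-in safeguards can be read as a chain of guard checks. The sketch below is illustrative only and is not the KCP implementation: `RemediationRequest`, `preserves_etcd_quorum`, `can_remediate`, `max_retries`, and `retry_delay_seconds` are hypothetical names chosen for this example. It also assumes that the quorum rule is only evaluated once the cluster is initialized, and that "preserving quorum" means the members left after removal still contain a healthy majority; the proposal text does not spell out either detail.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class RemediationRequest:
    """Hypothetical snapshot of the state KCP would inspect (not a real API type)."""
    initialized: bool                       # has KCP reached the initialized state?
    machines: int                           # current control plane machines
    previous_remediation_completed: bool    # delete and re-create finished?
    machines_with_deletion_timestamp: int   # machines currently being deleted
    other_etcd_members: int                 # etcd members left after removing the target
    other_healthy_etcd_members: int         # how many of those are healthy
    retries_so_far: int = 0                 # earlier remediations of this same machine
    seconds_since_last_retry: float = 0.0


def preserves_etcd_quorum(other_members: int, other_healthy: int) -> bool:
    """Assumed reading of the quorum rule: the remaining members keep a healthy majority."""
    quorum = other_members // 2 + 1
    return other_healthy >= quorum


def can_remediate(
    req: RemediationRequest,
    max_retries: Optional[int] = None,   # opt-in safeguard; None means unlimited
    retry_delay_seconds: float = 0.0,    # opt-in safeguard; 0 means no delay
) -> Tuple[bool, str]:
    # Either the cluster is not initialized yet, or it has at least two machines.
    if req.initialized and req.machines < 2:
        return False, "initialized cluster needs at least two control plane machines"
    if not req.previous_remediation_completed:
        return False, "previous remediation has not completed"
    if req.machines_with_deletion_timestamp > 0:
        return False, "cluster is in a transitional state (machine being deleted)"
    # Assumption: quorum is only checked for initialized clusters.
    if req.initialized and not preserves_etcd_quorum(
        req.other_etcd_members, req.other_healthy_etcd_members
    ):
        return False, "remediation would lose etcd quorum"
    if max_retries is not None and req.retries_so_far >= max_retries:
        return False, "maximum number of retries reached"
    if req.retries_so_far > 0 and req.seconds_since_last_retry < retry_delay_seconds:
        return False, "retry delay has not elapsed"
    return True, "ok"
```

With three control plane machines, remediating one member is allowed only if both remaining etcd members are healthy; with a single machine it is allowed only before the cluster is initialized.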
