[release-1.4] 🐛 requeue KCP object if ControlPlaneComponentsHealthyCondition is not yet true #9036

Merged
18 changes: 14 additions & 4 deletions controlplane/kubeadm/internal/controllers/controller.go
@@ -207,13 +207,23 @@ func (r *KubeadmControlPlaneReconciler) Reconcile(ctx context.Context, req ctrl.
 			reterr = kerrors.NewAggregate([]error{reterr, err})
 		}
 
-		// TODO: remove this as soon as we have a proper remote cluster cache in place.
-		// Make KCP to requeue in case status is not ready, so we can check for node status without waiting for a full resync (by default 10 minutes).
-		// Only requeue if we are not going in exponential backoff due to error, or if we are not already re-queueing, or if the object has a deletion timestamp.
-		if reterr == nil && !res.Requeue && res.RequeueAfter <= 0 && kcp.ObjectMeta.DeletionTimestamp.IsZero() {
+		// Only requeue if there is no error, Requeue or RequeueAfter and the object does not have a deletion timestamp.
+		if reterr == nil && res.IsZero() && kcp.ObjectMeta.DeletionTimestamp.IsZero() {
+			// Make KCP requeue in case node status is not ready, so we can check for node status without waiting for a full
+			// resync (by default 10 minutes).
+			// The alternative solution would be to watch the control plane nodes in the Cluster - similar to how the
+			// MachineSet and MachineHealthCheck controllers watch the nodes under their control.
 			if !kcp.Status.Ready {
 				res = ctrl.Result{RequeueAfter: 20 * time.Second}
 			}
+
+			// Make KCP requeue if ControlPlaneComponentsHealthyCondition is false so we can check for control plane component
+			// status without waiting for a full resync (by default 10 minutes).
+			// Otherwise this condition can lead to a delay in provisioning MachineDeployments when MachineSet preflight checks are enabled.
+			// The alternative solution to this requeue would be watching the relevant pods inside each workload cluster which would be very expensive.
+			if conditions.IsFalse(kcp, controlplanev1.ControlPlaneComponentsHealthyCondition) {
+				res = ctrl.Result{RequeueAfter: 20 * time.Second}
+			}
 		}
 	}()
