
Optimize CAPD machine controller to reduce noisy error messages #8086

Closed
sbueringer opened this issue Feb 8, 2023 · 3 comments · Fixed by #8090
Labels
kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. triage/accepted Indicates an issue or PR is ready to be actively worked on.

Comments

@sbueringer
Member

sbueringer commented Feb 8, 2023

When implementing #7963 we discovered that the CAPD machine reconciler produces a lot of error messages when the API server of the workload cluster is not reachable yet. Roughly here: https://github.com/kubernetes-sigs/cluster-api/blob/main/test/infrastructure/docker/internal/controllers/dockermachine_controller.go#L337-L340

We tried to add the following statement above, but this breaks the standalone machine case (covered by the KCP adoption e2e test):

	// If the control plane is not yet initialized, there is no API server to contact to get the ProviderID for the Node
	// hosted on this machine, so return early.
	// NOTE: we are using RequeueAfter with a short interval in order to make test execution time more stable.
	if !conditions.IsTrue(cluster, clusterv1.ControlPlaneInitializedCondition) {
		return ctrl.Result{RequeueAfter: 15 * time.Second}, nil
	}

The problem is that in the standalone case we have to set the providerID first, and only afterwards is the ControlPlaneInitialized condition set to true. So with this statement we end up in a deadlock.

Options:

  • only requeue if the cluster is using a control plane provider

/kind cleanup

@k8s-ci-robot k8s-ci-robot added kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Feb 8, 2023
@sbueringer
Member Author

cc @fabriziopandini
Hope I captured it correctly

@fabriziopandini
Member

/triage accepted

only requeue if the cluster is using a control plane provider

Makes sense to me

@k8s-ci-robot k8s-ci-robot added triage/accepted Indicates an issue or PR is ready to be actively worked on. and removed needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Feb 8, 2023
@sbueringer
Member Author

/assign

Should have a few minutes to open a PR relatively soon
