Add available condition to control plane provider contract #3779
Comments
/milestone v0.4.0 |
Thinking a bit more, I wonder if what I really want is a Cluster condition for ControlPlaneAvailable, which the ClusterReconciler can set when it's able to talk to the workload cluster apiserver? I don't know that this necessarily or strictly needs to be in the control plane provider. |
+1 to making this a condition on the Cluster |
That's what KubeadmControlPlane actually does today (more or less), we could write a fallback to do this generically by pinging the control plane? |
Basically what I think I want/need is to make cluster.status.controlPlaneInitialized a condition (can change the name if needed), and have it be set by the ClusterReconciler. No need to do anything in a control plane provider here. Right? |
Initialized was a bit different than availability when we originally discussed it. If our goal is to provide something generic for saying "the control plane is available", consider something like:
What do you think? |
Hmm what does "available" mean? 😄 Let me come back later with some more detailed thoughts... |
Going to move some of this discussion over to #3026 and will circle back here later |
Yes, it was initially intended mostly as a stop gap for the purposes of allowing migration and continuation of support for v1alpha2 Machine-based control planes to v1alpha3 and KubeadmControlPlane.
My initial thoughts around this were to provide a signal that the ControlPlane (API Server) should be available for requests, so that it could block/unblock operations that require the control plane being available (such as creating worker Machines) in a way that is more accurate than I think
Eventually it might also be good to determine if the apiserver is in a state where resources can be created/modified, in which case it might be good to bubble up a signal from a control plane provider (if used). For KCP, this could simply bubble up that etcd has quorum. This could fall back to either assuming things are good or creating/deleting an arbitrary resource in the case that a control plane provider is not used. |
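The fallback order described in the comment above could be sketched roughly as follows. This is only an illustration under stated assumptions: `controlPlaneWritable`, `probeFunc`, and `errProbeFailed` are hypothetical names, not part of Cluster API, and the probe stands in for "creating/deleting an arbitrary resource".

```go
package main

import "errors"

// errProbeFailed is a hypothetical sentinel a probe can return when the
// test write against the workload cluster does not succeed.
var errProbeFailed = errors.New("probe failed")

// probeFunc is a hypothetical hook that attempts a harmless write
// (for example, creating and deleting an arbitrary resource).
type probeFunc func() error

// controlPlaneWritable decides whether resources can be created or modified.
// providerSignal is nil when no control plane provider is in use; otherwise it
// carries the signal bubbled up by the provider (for KCP, e.g. etcd quorum).
func controlPlaneWritable(providerSignal *bool, probe probeFunc) bool {
	// Prefer the provider's own signal when one exists.
	if providerSignal != nil {
		return *providerSignal
	}
	// No provider and no probe: fall back to assuming things are good.
	if probe == nil {
		return true
	}
	// No provider: fall back to the generic probe.
	return probe() == nil
}
```

Keeping the provider signal first means a generic probe is only needed for clusters that do not use a control plane provider.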
/lifecycle frozen |
/assign @fabriziopandini |
/unassign |
/triage accepted |
/priority important-longterm |
/assign
I think we made some progress here, but we forgot to update the issue. |
The original ask
Has already been addressed in cluster-api/internal/controllers/machinehealthcheck/machinehealthcheck_targets.go, lines 149 to 158 at c19ca28.
With respect to Initialized vs Available:
Considered that IMO the most practical way forwards is
@sbueringer @chrischdi opinions? |
Sounds good
Sounds good to me, with the next apiVersion obviously. The only thing that is not ideal is that this basically goes slightly backwards on our transition from status fields to conditions (the transition doesn't actually happen at the moment though). |
/close
Agreed, added a comment on #10532:
|
@fabriziopandini: Closing this issue. |
User Story
As a developer, I would like to know when the control plane is first reachable for a new cluster, so that I can make decisions about MachineHealthCheck.spec.nodeStartupTimeout for joining control plane machines.
Detailed Description
When trying to determine if a joining control plane machine has exceeded a MachineHealthCheck's nodeStartupTimeout, we need to delay checking for this condition until after the control plane is first reachable. In other words, a joining control plane machine must not be marked unhealthy by the MachineHealthCheck if the apiserver is unreachable or still being initialized. After the apiserver is reachable, the nodeStartupTimeout should most likely be measured from when the machine's infrastructure was marked ready.
This proposal is to add a new required status condition to all control plane providers, Available, that indicates the first control plane node has completed bootstrapping and the apiserver is reachable. This is currently present in the KubeadmControlPlane provider, but it's not part of the control plane provider contract.
See also #3026
/kind feature
/area control-plane
/kind api-change