
ControlPlaneInitialized condition is set before a control-plane Node is ready #4936

Closed
dkoshkin opened this issue Jul 13, 2021 · 14 comments · Fixed by #8005
Labels: kind/bug, lifecycle/frozen, triage/accepted

Comments

@dkoshkin (Contributor) commented Jul 13, 2021

What steps did you take and what happened:

  1. Follow https://cluster-api.sigs.k8s.io/user/quick-start.html for Docker with:
clusterctl generate cluster capi-quickstart --flavor development \
  --kubernetes-version v1.19.11 \
  --control-plane-machine-count=1 \
  --worker-machine-count=1 \
  > capi-quickstart.yaml
  2. Modify the KubeadmControlPlane to force a failure, something like:
apiVersion: controlplane.cluster.x-k8s.io/v1alpha4
kind: KubeadmControlPlane
metadata:
  name: capi-quickstart-control-plane
  namespace: default
spec:
  kubeadmConfigSpec:
    initConfiguration:
      nodeRegistration:
        kubeletExtraArgs:
          node-labels: "{{INVALID_LABELS}}"

The worker Machine is already Provisioning before any control-plane Node is up:

kubectl get machines
NAME                                    PROVIDERID   PHASE          VERSION
capi-quickstart-control-plane-gbcvx                  Provisioning   v1.19.11
capi-quickstart-md-0-644755b685-spjwc                Provisioning   v1.19.11

That's because the Cluster already has the ControlPlaneInitialized condition set to True:

  status:
    conditions:
    - lastTransitionTime: "2021-07-13T17:59:21Z"
      message: Scaling up control plane to 1 replicas (actual 0)
      reason: ScalingUp
      severity: Warning
      status: "False"
      type: Ready
    - lastTransitionTime: "2021-07-13T18:00:03Z"
      status: "True"
      type: ControlPlaneInitialized

This happens even though no control-plane Machine has a nodeRef yet:

kubectl get machines capi-quickstart-control-plane-gbcvx -o yaml | grep nodeRef | wc -l
       0

What did you expect to happen:

According to the API comment, the ControlPlaneInitializedCondition should be marked True when the:

cluster's apiserver is reachable and at least one control-plane Machine has a node reference
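
For reference, the condition is declared in the condition constants of the v1alpha4 API package roughly as follows (paraphrased; the doc comment is the definition quoted above):

// ControlPlaneInitializedCondition reports if the Cluster's control plane has been
// initialized such that the cluster's apiserver is reachable and at least one
// control-plane Machine has a node reference. Once marked True it is never changed.
ControlPlaneInitializedCondition ConditionType = "ControlPlaneInitialized"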

One controller correctly checks for the NodeRef:

for _, m := range machines {
	if util.IsControlPlaneMachine(m) && m.Status.NodeRef != nil {
		conditions.MarkTrue(cluster, clusterv1.ControlPlaneInitializedCondition)
		return ctrl.Result{}, nil
	}
}

But another controller will set it to True based only on the KubeadmControlPlane reporting initialized = true:

// Update cluster.Status.ControlPlaneInitialized if it hasn't already been set
// Determine if the control plane provider is initialized.
if !conditions.IsTrue(cluster, clusterv1.ControlPlaneInitializedCondition) {
	initialized, err := external.IsInitialized(controlPlaneConfig)
	if err != nil {
		return ctrl.Result{}, err
	}
	if initialized {
		conditions.MarkTrue(cluster, clusterv1.ControlPlaneInitializedCondition)
	} else {
I think both should check for a control-plane Machine to have a NodeRef.
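
A minimal sketch of what that shared check could look like, reusing the identifiers from the two snippets above (the helper name is hypothetical; the actual fix landed in #8005):

import (
	clusterv1 "sigs.k8s.io/cluster-api/api/v1alpha4"
	"sigs.k8s.io/cluster-api/util"
)

// controlPlaneInitialized is a hypothetical helper: it returns true only when
// at least one control-plane Machine has been linked to a Node, matching the
// API comment's definition of ControlPlaneInitialized.
func controlPlaneInitialized(machines []*clusterv1.Machine) bool {
	for _, m := range machines {
		if util.IsControlPlaneMachine(m) && m.Status.NodeRef != nil {
			return true
		}
	}
	return false
}

Both code paths could then gate conditions.MarkTrue on this helper instead of trusting the control plane provider's initialized flag alone.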

Anything else you would like to add:

In most cases it is undesirable to start bootstrapping worker machines before the control plane (and API server) is up, since doing so causes a lot of churn as the workers repeatedly fail to start.
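
For illustration, gating a reconciler on the condition with the same conditions package used above would look something like this (a sketch only, not the actual MachineSet controller logic; assumes the standard time and ctrl imports):

// Sketch: do not start bootstrapping worker Machines until the condition is met.
if !conditions.IsTrue(cluster, clusterv1.ControlPlaneInitializedCondition) {
	// Requeue rather than creating bootstrap data that will fail and churn.
	return ctrl.Result{RequeueAfter: 30 * time.Second}, nil
}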

Environment:

  • Cluster-api version: v0.4.0
  • Minikube/KIND version:
  • Kubernetes version: (use kubectl version):
  • OS (e.g. from /etc/os-release):

/kind bug

@k8s-ci-robot added the kind/bug label Jul 13, 2021
@vincepri (Member) commented:

/assign @fabriziopandini
/milestone v0.4

Just a note that the two codepaths linked above are mutually exclusive, although I do agree that they have different definitions and this should be fixed.

@k8s-triage-robot commented:

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot added the lifecycle/stale label Oct 17, 2021
@vincepri modified the milestones: v0.4, v1.1 Oct 22, 2021
@k8s-triage-robot commented:

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

@k8s-ci-robot added the lifecycle/rotten label and removed the lifecycle/stale label Nov 21, 2021
@k8s-triage-robot commented:

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue or PR with /reopen
  • Mark this issue or PR as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close

@k8s-ci-robot (Contributor) commented:

@k8s-triage-robot: Closing this issue.

In response to the /close in the comment above.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@vincepri (Member) commented:

/reopen
/lifecycle frozen

@k8s-ci-robot (Contributor) commented:

@vincepri: Reopened this issue.

In response to this:

/reopen
/lifecycle frozen


@k8s-ci-robot reopened this Dec 21, 2021
@k8s-ci-robot added the lifecycle/frozen label and removed the lifecycle/rotten label Dec 21, 2021
@fabriziopandini modified the milestones: v1.1, v1.2 Feb 3, 2022
@fabriziopandini (Member) commented:

/unassign

@13164815445 commented:

I am very interested in this project. Can I try to solve this problem?

@killianmuldoon (Contributor) commented:

@13164815445 Welcome, and thanks for your interest! It might be a good idea to tackle one of our good first issues initially to get used to the workflow of contributing to Cluster API.

@13164815445 commented:

Thanks! Can I assign this issue to myself? @killianmuldoon

@13164815445 commented:

/assign @13164815445

@fabriziopandini added the triage/accepted label Jul 29, 2022
@fabriziopandini removed this from the v1.2 milestone Jul 29, 2022
@fabriziopandini removed the triage/accepted label Jul 29, 2022
@fabriziopandini (Member) commented:

/triage accepted

@k8s-ci-robot added the triage/accepted label Oct 3, 2022
@killianmuldoon (Contributor) commented:

/unassign @13164815445

/assign @killianmuldoon
