🏃 Wire up kubeadm control plane Ready status #2488

Merged
k8s-ci-robot merged 1 commit into kubernetes-sigs:master on Mar 4, 2020

Conversation

chuckha
Contributor

@chuckha chuckha commented Feb 28, 2020

Signed-off-by: Chuck Ha [email protected]

What this PR does / why we need it:
This PR aligns the KCP Initialized field with the comment on that field in the API type, and sets the Ready status when it is appropriate to do so.

Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Fixes #2480

I'm holding this because I might have to make changes to the capd e2es now that we're actually setting Ready based on NodeReady, which requires a CNI to exist on the cluster. CAPD may now have to install a CNI to pass, which it really should do anyway in order to scale machine deployments.
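
To illustrate why the missing CNI matters here, the following is a minimal sketch of deriving a Ready flag from node conditions. The `node` and `nodeCondition` types are hypothetical, simplified stand-ins for the Kubernetes Node API types, and the rule "Ready once at least one node is Ready" is an assumption for illustration, not necessarily the rule this PR implements.

```go
package main

import "fmt"

// nodeCondition and node are hypothetical, simplified stand-ins for the
// Kubernetes Node API types; only the fields needed here are modeled.
type nodeCondition struct {
	Type   string
	Status string
}

type node struct {
	Name       string
	Conditions []nodeCondition
}

// isNodeReady reports whether the node has a "Ready" condition with status "True".
func isNodeReady(n node) bool {
	for _, c := range n.Conditions {
		if c.Type == "Ready" {
			return c.Status == "True"
		}
	}
	return false
}

// countReady returns how many of the given nodes are Ready.
func countReady(nodes []node) int {
	ready := 0
	for _, n := range nodes {
		if isNodeReady(n) {
			ready++
		}
	}
	return ready
}

func main() {
	// Without a CNI, the kubelet keeps reporting the Ready condition as not "True".
	nodes := []node{
		{Name: "cp-0", Conditions: []nodeCondition{{Type: "Ready", Status: "False"}}},
	}
	// Assumption: the control plane counts as Ready once at least one node is Ready.
	fmt.Println("ready:", countReady(nodes) > 0)
}
```

Under that assumption, a cluster without a CNI would never flip to Ready, which is why the e2e tests may need to install one.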

In general, my experience with the tests for the reconciler is that we are testing many things throughout the whole Reconcile function. This is good in some ways and bad in others. There must be some way to make these tests more readable/understandable, because it takes me forever to understand what a test is doing and why an assertion is being made.

@k8s-ci-robot k8s-ci-robot added do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Feb 28, 2020
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: chuckha

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Feb 28, 2020
@CecileRobertMichon
Contributor

nit: there's a typo in the commit message should be control plane

@chuckha
Contributor Author

chuckha commented Feb 28, 2020

nit: there's a typo in the commit message should be control plane

🦅 👀

@chuckha chuckha changed the title 🏃 Wire up kubeadm control plnae Ready status 🏃 Wire up kubeadm control plane Ready status Feb 28, 2020
Contributor

@sethp-nr sethp-nr left a comment

Agreed about the CNI piece: In my experience MachineDeployments also get stuck in weird ways if there's no CNI (so there's no chance of a Ready Node).

/lgtm

}
}
r.Log.Info("unable to get workload cluster", "err", fmt.Sprintf("%s", err))
return nil
Contributor

Should this be a capierrors.RequeueAfterError, do you think?

Contributor Author

You are right that this change set would modify behavior in a subtle way. I think this should handle the error cases exactly as they were being handled before and so not change any behavior of this function.

Do you think a capierrors.RequeueAfterError is more appropriate than returning the errors.Wrap?

Member

My worry with not using RequeueAfterError here is that we could hit the 10 retries during the normal error path, before the remote APIServer is available. We would then have to wait for the resync interval before the status is updated again (or before any other operation that follows creation of the initial control plane instance), all while that initial machine is still booting/initializing.

Contributor Author

The way the defer is structured, there is no way to send up a requeue-after error from updateStatus. If you think this should definitely use requeueAfter, then we'll need a new issue / PR to refactor how the defer works.

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Feb 29, 2020
@vincepri vincepri added this to the v0.3.0-rc.3 milestone Mar 2, 2020
@chuckha
Contributor Author

chuckha commented Mar 2, 2020

Agreed about the CNI piece: In my experience MachineDeployments also get stuck in weird ways if there's no CNI (so there's no chance of a Ready Node).

yeah, machine deployments require CNI to get nodes into a ready state for scaling to work appropriately

@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Mar 2, 2020
@chuckha
Contributor Author

chuckha commented Mar 3, 2020

/hold cancel

This is good for review

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Mar 3, 2020
@chuckha chuckha force-pushed the kcp-read branch 2 times, most recently from 7aef971 to 4f0d8d7 Compare March 3, 2020 14:47
Eventually(func() (bool, error) {
machineList := &clusterv1.MachineList{}
if err := input.Lister.List(ctx, machineList, inClustersNamespaceListOption, matchClusterListOption); err != nil {
fmt.Println(err)
Member

Leftover?

Contributor Author

🙈

Contributor Author

Actually no, it's not leftover, this is the pattern used throughout. These fmt.Printlns should be replaced with a writer to ginkgo's log. I'm going to keep this and file an issue for improvement.

@chuckha
Contributor Author

chuckha commented Mar 3, 2020

with this issue, i believe this is good to merge #2514

@vincepri got time for one more 👁 👃 👁?

@rudoi
Contributor

rudoi commented Mar 4, 2020

/lgtm

just a lovely piece of code

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Mar 4, 2020
@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Mar 4, 2020
Member

@vincepri vincepri left a comment

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Mar 4, 2020
@k8s-ci-robot k8s-ci-robot merged commit 88506f8 into kubernetes-sigs:master Mar 4, 2020
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Development

Successfully merging this pull request may close these issues.

Wire up KubeadmControlPlane Ready
7 participants