Document MachineHealthChecks for control planes #4138

Closed
CecileRobertMichon opened this issue Feb 3, 2021 · 9 comments · Fixed by #4228
Labels
help wanted: Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines.
kind/documentation: Categorizes issue or PR as related to documentation.

@CecileRobertMichon
Contributor

Since #3956 merged, KCP supports remediation with machine health checks. However, the documentation on the topic is missing. In fact, the current MHC docs state:

Control Plane Machines are currently not supported and will not be remediated if they are unhealthy

https://cluster-api.sigs.k8s.io/tasks/healthcheck.html#limitations-and-caveats-of-a-machinehealthcheck

We should update the existing documentation for MachineHealthChecks to include KCP.

/kind documentation

@k8s-ci-robot added the kind/documentation label Feb 3, 2021
@CecileRobertMichon
Contributor Author

/help

@k8s-ci-robot
Contributor

@CecileRobertMichon:
This request has been marked as needing help from a contributor.

Please ensure the request meets the requirements listed here.

If this request no longer meets these requirements, the label can be removed
by commenting with the /remove-help command.


@k8s-ci-robot added the help wanted label Feb 3, 2021
@CecileRobertMichon
Contributor Author

/milestone v0.4.0

@k8s-ci-robot added this to the v0.4.0 milestone Feb 3, 2021
@CecileRobertMichon modified the milestones: v0.4.0, v0.4.x Feb 19, 2021
@scottslowe
Contributor

@CecileRobertMichon I'd like to work on this. Aside from removing the text indicating that MHC doesn't apply to KCP, what other changes do you feel are necessary?

@CecileRobertMichon
Contributor Author

@scottslowe awesome, thank you. I think in addition to removing the text of the limitation, it would be good to add an example of how to set it up for KCP in https://cluster-api.sigs.k8s.io/tasks/healthcheck.html#creating-a-machinehealthcheck. You can use https://github.com/kubernetes-sigs/cluster-api/blob/master/test/e2e/data/infrastructure-docker/v1alpha4/cluster-template-kcp-remediation/mhc.yaml as an example.
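
For illustration, a MachineHealthCheck aimed at control plane Machines might look roughly like the sketch below. This is not copied from the linked mhc.yaml: the resource name, cluster name, maxUnhealthy value, and timeouts are placeholders chosen for the example, and the API version is assumed from the v1alpha4 path in the link above. The key difference from a worker-node MHC is the selector, which matches the control plane label set on KCP-managed Machines.

```yaml
# Hypothetical example: names and values are illustrative assumptions.
apiVersion: cluster.x-k8s.io/v1alpha4
kind: MachineHealthCheck
metadata:
  name: example-kcp-unhealthy-5m   # placeholder name
spec:
  clusterName: example-cluster     # placeholder; must match the Cluster's name
  maxUnhealthy: 100%
  selector:
    matchLabels:
      # Selects control plane Machines rather than worker Machines.
      cluster.x-k8s.io/control-plane: ""
  unhealthyConditions:
    # Remediate when the Node's Ready condition is Unknown or False
    # for longer than the timeout.
    - type: Ready
      status: Unknown
      timeout: 300s
    - type: Ready
      status: "False"
      timeout: 300s
```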

@scottslowe
Contributor

@CecileRobertMichon In the "Limitations and Caveats" section of the current doc, there is a bullet that says "If no Node joins the cluster for a Node after the NodeStartupTimeout, the Machine will be remediated."

Should this be "If no Node joins the cluster for a Machine...", or "If no Machine joins the cluster for a Node..."? The current wording seems off.

@CecileRobertMichon
Contributor Author

yeah I think it should be "If no Node joins the cluster for a Machine..."

@scottslowe
Contributor

PR submitted and ready for review!

@CecileRobertMichon modified the milestones: v0.4.x, v0.4 Mar 22, 2021