Add MachineHealthCheck to our templates #144

joekr · 2022-09-09T17:43:45Z

What would you like to be added:
We should add a basic MachineHealthCheck (MHC) to our templates so that unhealthy nodes are detected and replaced.
MHC docs: https://cluster-api.sigs.k8s.io/tasks/automated-machine-management/healthchecking.html

We should provide a MachineHealthCheck definition for both the control plane and the worker machines (excluding machine pools). We will set sane defaults for now.

For more details, see the proposal: https://github.com/kubernetes-sigs/cluster-api/blob/main/docs/proposals/20191030-machine-health-checking.md and the feature PR: kubernetes-sigs/cluster-api#3830
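
As a rough sketch of what the two definitions could look like, based on the examples in the MHC docs linked above: the `${CLUSTER_NAME}` variable, the object names, the `-md-0` MachineDeployment name, and the timeout and `maxUnhealthy` values are all assumptions for illustration, not values agreed on in this issue.

```yaml
# Hypothetical control-plane MHC; names and values are placeholders.
apiVersion: cluster.x-k8s.io/v1beta1
kind: MachineHealthCheck
metadata:
  name: "${CLUSTER_NAME}-kcp-unhealthy-5m"
spec:
  clusterName: "${CLUSTER_NAME}"
  maxUnhealthy: 100%
  selector:
    matchLabels:
      # Label that Cluster API sets on control-plane Machines.
      cluster.x-k8s.io/control-plane: ""
  unhealthyConditions:
    # Remediate when the Node's Ready condition is Unknown or False for 5 minutes.
    - type: Ready
      status: Unknown
      timeout: 300s
    - type: Ready
      status: "False"
      timeout: 300s
---
# Hypothetical worker MHC, selecting Machines owned by one MachineDeployment.
apiVersion: cluster.x-k8s.io/v1beta1
kind: MachineHealthCheck
metadata:
  name: "${CLUSTER_NAME}-node-unhealthy-5m"
spec:
  clusterName: "${CLUSTER_NAME}"
  maxUnhealthy: 40%
  selector:
    matchLabels:
      # Assumes the template names its MachineDeployment "${CLUSTER_NAME}-md-0".
      cluster.x-k8s.io/deployment-name: "${CLUSTER_NAME}-md-0"
  unhealthyConditions:
    - type: Ready
      status: Unknown
      timeout: 300s
    - type: Ready
      status: "False"
      timeout: 300s
```

Selecting by MachineDeployment label keeps machine pools out of scope, since the selector only matches Machine objects.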

Why is this needed:
If a node fails to launch or is terminated for some reason, we want CAPI to bring up a new node.

joekr · 2022-09-09T18:30:26Z

In working with MHC I found that setting `nodeStartupTimeout: 10m` helps, as an instance can take a while to come up. `10m` might be too long, but we should set something higher than whatever the default is.
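
For reference, `nodeStartupTimeout` sits at the top level of the MHC `spec`. A minimal fragment, assuming the `v1beta1` API and a placeholder `${CLUSTER_NAME}` variable (the selector and unhealthy conditions are omitted here):

```yaml
apiVersion: cluster.x-k8s.io/v1beta1
kind: MachineHealthCheck
metadata:
  name: "${CLUSTER_NAME}-node-unhealthy-5m"  # hypothetical name
spec:
  clusterName: "${CLUSTER_NAME}"
  # How long a Machine may exist without its Node joining the cluster
  # before the Machine is considered unhealthy.
  nodeStartupTimeout: 10m
  # selector and unhealthyConditions omitted in this fragment
```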

joekr added the enhancement (New feature or request) label Sep 9, 2022

joekr self-assigned this Sep 13, 2022

joekr mentioned this issue Sep 29, 2022

feat: Add MachineHealthCheck example template #175

Merged

joekr closed this as completed in #175 Sep 30, 2022
