
feature: cluster health init-container similar to how prepare is leveraged for k3s-upgrade #169

Open
dweomer opened this issue Nov 24, 2021 · 2 comments
Labels
enhancement New feature or request

Comments


dweomer commented Nov 24, 2021

Is your feature request related to a problem? Please describe.
Some upgrade use-cases require that the cluster "be healthy" before incurring the disruption of a node upgrade. It would be nice to configure a Plan so that the cluster has had time to settle before it continues with the next node. This could be achieved by some sort of health measurement, e.g. ensuring that all ReplicaSets and DaemonSets have a minimum number of pods running.

Describe the solution you'd like
A parameter or two on the Plan spec indicating that some health measurement should pass before commencing with node upgrade(s), and which pre-canned strategy to use for making that determination. The mere presence of a strategy choice other than "none" might be enough (so, one parameter).
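For illustration, a hypothetical Plan might look like the sketch below. The `healthStrategy` field does not exist in the system-upgrade-controller today; it is purely an assumed name for the proposed parameter, and the strategy value is invented for the example:

```yaml
# Sketch only: healthStrategy is NOT part of the current
# upgrade.cattle.io/v1 Plan spec; it illustrates the proposal.
apiVersion: upgrade.cattle.io/v1
kind: Plan
metadata:
  name: k3s-server
  namespace: system-upgrade
spec:
  concurrency: 1
  serviceAccountName: system-upgrade
  nodeSelector:
    matchExpressions:
      - {key: node-role.kubernetes.io/master, operator: Exists}
  # Proposed (hypothetical): require cluster health to settle before
  # each node upgrade. "workloads-ready" might mean all ReplicaSets
  # and DaemonSets meet minimum availability; "none" (the default)
  # would preserve current behavior.
  healthStrategy: workloads-ready
  upgrade:
    image: rancher/k3s-upgrade
```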

Describe alternatives you've considered
Relying on the eviction algorithm that respects pod disruption budgets (i.e. NOT specifying .spec.disableEviction) will likely not be adequate for all upgrade needs, because eviction can hang indefinitely in resource-constrained clusters. Because of this we must assume that some disruptions can and will happen when upgrade plans are applied. Is this enough to warrant new logic in the controller? 🤷

Additional context

@dweomer dweomer added the enhancement New feature or request label Nov 24, 2021

psy-q commented Mar 28, 2023

We could use a lightweight version of this where we can at least specify a delay between node upgrades so that we avoid having multiple nodes rebooting before a StatefulSet with 3 pods is ready again. Is there a delay option already that we missed, e.g. 30 minutes between node upgrades?

The issue we have is that this workload is tightly coupled to specific nodes, so if SUC just goes ahead and reboots one after the other, even if the pods could be rescheduled to another node to meet their PDB, they won't be because they need to be scheduled on the exact same node again.

As it takes 15-20 minutes for a pod to become ready and reconnect to its cluster friends, SUC has cheerfully rebooted all three nodes by that time, destroying the application's clustering mode. It can't deal with more than one cluster member being unavailable at any one time.


dweomer commented Apr 10, 2023

> We could use a lightweight version of this where we can at least specify a delay between node upgrades so that we avoid having multiple nodes rebooting before a StatefulSet with 3 pods is ready again. Is there a delay option already that we missed, e.g. 30 minutes between node upgrades?
>
> The issue we have is that this workload is tightly coupled to specific nodes, so if SUC just goes ahead and reboots one after the other, even if the pods could be rescheduled to another node to meet their PDB, they won't be because they need to be scheduled on the exact same node again.
>
> As it takes 15-20 minutes for a pod to become ready and reconnect to its cluster friends, SUC has cheerfully rebooted all three nodes by that time, destroying the application's clustering mode. It can't deal with more than one cluster member being unavailable at any one time.

IIRC, SUC will honor an existing PDB if one exists.
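For reference, such a budget is an ordinary Kubernetes PodDisruptionBudget; a minimal sketch (names and labels are placeholders) that keeps voluntary evictions, including those issued during a drain, from taking down more than one member of a three-replica StatefulSet at a time:

```yaml
# Minimal sketch: allows at most one pod matching the selector to be
# unavailable due to voluntary eviction. Names/labels are placeholders.
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: my-clustered-app
spec:
  maxUnavailable: 1
  selector:
    matchLabels:
      app: my-clustered-app
```

Note that, as described above, a PDB only helps if the evicted pods can actually become ready again; pods pinned to a specific node can still defeat it.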
