Provide a way to hook into the rolling update #4229
Comments
/milestone v0.4.0
@sbueringer Are you interested in having this change for v1alpha4?
@vincepri Yes.
There are some good aspects that we could define as part of MachineHealthCheck (maybe?) or a similar struct. For example, for generic "pod readiness", we could require that a certain number of pods show up as ready before proceeding with an update. Then there is some customizable aspect, for things that aren't generic and would need custom code, which could be done with special annotations.
Nevertheless, this effort might require a small proposal to continue, and if we're aiming for v1alpha4, it should be made non-breaking and potentially delivered in a patch release. /milestone v0.4.x
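For illustration only, a pod-based readiness check along the lines suggested above could look roughly like the sketch below. None of these types or field names exist in Cluster API today; they are purely hypothetical.

```go
// Hypothetical API sketch only: neither this type nor these field names exist
// in Cluster API; they just illustrate the "pod readiness" idea from the
// comment above.
package v1alpha4

import (
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// PodReadinessCheck describes a set of pods that must be ready on a Node
// before a rolling update is allowed to continue.
type PodReadinessCheck struct {
	// Namespace to look for the pods in.
	Namespace string `json:"namespace"`

	// Selector matches the pods that are required to be ready.
	Selector metav1.LabelSelector `json:"selector"`

	// MinReadyCount is the number of matching pods that must report Ready.
	MinReadyCount int32 `json:"minReadyCount"`

	// Timeout after which the check is considered failed.
	Timeout metav1.Duration `json:"timeout,omitempty"`
}
```

Such a struct could, as suggested, live in MachineHealthCheck and be reused by MachineDeployment and KCP, but that is exactly the kind of detail a proposal would need to settle.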
@vincepri sounds good. I'll think about it and come back with a few ideas, just to limit the scope of the proposal a bit. Just so I get it right: you meant specifying it as part of
Yes, and if we have some data structure that allows us to define pod-based health checks, we could use the same one to define checks in MachineDeployment and KCP -- and possibly share the codebase too :)
Just a short update: I'm still interested in this. I'm just taking a bit of time to get more familiar with the current state of the project, to be able to better judge how this fits.
Currently not really working on it
The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs. Please send feedback to sig-contributor-experience at kubernetes/community. /lifecycle stale
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs. Please send feedback to sig-contributor-experience at kubernetes/community. /lifecycle rotten
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs. Please send feedback to sig-contributor-experience at kubernetes/community. /close
@k8s-triage-robot: Closing this issue.
/reopen
@vincepri: Reopened this issue.
Can we mark this as done as part of #6546? @sbueringer WDYT?
👍 for ControlPlane nodes. Regarding MachineDeployments: it would currently not be possible to do things on a per-MachineDeployment basis, only on an "after all MachineDeployments" basis. It also would not reflect state back to the MachineDeployments. But these points are up for discussion, I think :-)
I don't think that Runtime Hooks as they exist today work for the use case described above. The idea was to perform additional actions per Machine.
But I'm also fine with just closing this issue. I don't have that requirement anymore, and there has been no other demand for it in the last ~1.5 years.
Given the feedback above, let's leave it open; it seems like this might be a use case for an additional runtime hook if someone from the community is interested in it.
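To make the idea concrete, a per-machine runtime hook could look roughly like the sketch below. Cluster API does not define such a hook today; the hook name, request/response types, and endpoint are all hypothetical, loosely modelled on the blocking/retry pattern of the existing lifecycle hooks.

```go
// Sketch of a hypothetical "AfterMachineUpgrade" runtime hook handler.
// Cluster API does not define this hook; the types and endpoint below are
// illustrative only.
package main

import (
	"encoding/json"
	"log"
	"net/http"
)

type AfterMachineUpgradeRequest struct {
	ClusterName string `json:"clusterName"`
	MachineName string `json:"machineName"`
	NodeName    string `json:"nodeName"`
}

type AfterMachineUpgradeResponse struct {
	// RetryAfterSeconds > 0 would tell the controller to hold the rollout
	// and call the hook again later.
	RetryAfterSeconds int32  `json:"retryAfterSeconds"`
	Message           string `json:"message,omitempty"`
}

// nodeFullyFunctional is a placeholder for the provider-specific checks
// described in this issue (critical pods ready, smoke tests passing, ...).
func nodeFullyFunctional(nodeName string) bool { return true }

func handleAfterMachineUpgrade(w http.ResponseWriter, r *http.Request) {
	var req AfterMachineUpgradeRequest
	if err := json.NewDecoder(r.Body).Decode(&req); err != nil {
		http.Error(w, err.Error(), http.StatusBadRequest)
		return
	}

	resp := AfterMachineUpgradeResponse{}
	if !nodeFullyFunctional(req.NodeName) {
		resp.RetryAfterSeconds = 30
		resp.Message = "node not yet fully functional, blocking rollout"
	}
	_ = json.NewEncoder(w).Encode(resp)
}

func main() {
	http.HandleFunc("/hooks/after-machine-upgrade", handleAfterMachineUpgrade)
	log.Fatal(http.ListenAndServe(":8080", nil))
}
```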
/assign @faiq
I'll try my hand at it.
/triage accepted
@faiq are you still working on this? I'm very interested in a hook at the Machine level.
@bavarianbidi please, by all means, go ahead. I haven't had as much time to work on this as I'd like!
/unassign
/assign @bavarianbidi
It seems there is some overlap between what is discussed in this issue and #7647; it might be better to reconcile the two efforts before moving on with the implementation.
This issue has not been updated in over 1 year, and should be re-triaged. For more details on the triage process, see https://www.kubernetes.dev/docs/guide/issue-triage/ /remove-triage accepted
/unassign as I don't work with CAPI ATM 😞 I've already pitched the issue to some colleagues and shared a very rough implementation with them, so 🤞 they will jump into this 🙏
/priority backlog
The Cluster API project currently lacks enough active contributors to adequately respond to all issues and PRs. If there is concrete interest in moving this forward and more details to discuss about what we want to do, we can re-open. /close
@fabriziopandini: Closing this issue.
User Story
As a service provider, I would like to be able to hook into the rolling update of control plane
and worker nodes, to implement additional checks before the rolling update continues.
Detailed Description
Some context: we're providing Kubernetes as a Service and are currently reimplementing our
existing solution. To provide a smooth update experience, we have several checks which verify that
a Node is ready to be used by customers. In our current solution we update Nodes sequentially:
after we have updated a Node, we verify that it's completely functional and only then continue the
update by deleting and re-creating the next Node, and so on.
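For illustration only, the sequential flow described above amounts to something like the following loop; the helper functions are hypothetical placeholders for our provider-specific logic.

```go
// Illustrative sketch of the sequential update flow described above.
// replaceNode and waitForNodeFullyFunctional are hypothetical placeholders
// for provider-specific logic; they are not part of any existing API.
package rollout

import (
	"context"
	"fmt"
	"time"
)

func rollNodes(ctx context.Context, nodes []string) error {
	for _, name := range nodes {
		// Delete the old Node and create its replacement.
		if err := replaceNode(ctx, name); err != nil {
			return fmt.Errorf("replacing node %s: %w", name, err)
		}
		// Block until the new Node passes all custom readiness checks
		// before touching the next one.
		if err := waitForNodeFullyFunctional(ctx, name, 15*time.Minute); err != nil {
			return fmt.Errorf("node %s did not become fully functional: %w", name, err)
		}
	}
	return nil
}

func replaceNode(ctx context.Context, name string) error { return nil }

func waitForNodeFullyFunctional(ctx context.Context, name string, timeout time.Duration) error {
	return nil
}
```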
As far as I'm aware, there is currently no way to do this in CAPI. To me it looks like, e.g., the
MachineSet controller only evaluates the Ready condition of the Node in the workload cluster when
deciding if a Machine is ready (code). I assume
the Machine readiness then also informs the decision about when the next Machine will be updated, etc.
Some examples of what we're considering right now before we declare a Node ready:
An important point is that we always want to ensure we have a minimum number of "completely" ready Nodes.
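As an example of what such a "completely ready" check could look like (purely illustrative; the concrete checks are provider-specific), this sketch requires that, in addition to the Node Ready condition, every kube-system pod scheduled to the Node reports Ready:

```go
// Illustrative only: one possible "completely ready" check, going beyond the
// Node Ready condition by also requiring all kube-system pods on the Node to
// be Ready. The real checks would be provider-specific.
package readiness

import (
	"context"
	"fmt"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
)

func nodeCompletelyReady(ctx context.Context, c kubernetes.Interface, nodeName string) (bool, error) {
	node, err := c.CoreV1().Nodes().Get(ctx, nodeName, metav1.GetOptions{})
	if err != nil {
		return false, err
	}
	if !conditionTrue(node.Status.Conditions, corev1.NodeReady) {
		return false, nil
	}

	// Additionally require every kube-system pod scheduled to this Node to be Ready.
	pods, err := c.CoreV1().Pods("kube-system").List(ctx, metav1.ListOptions{
		FieldSelector: fmt.Sprintf("spec.nodeName=%s", nodeName),
	})
	if err != nil {
		return false, err
	}
	for i := range pods.Items {
		if !podReady(&pods.Items[i]) {
			return false, nil
		}
	}
	return true, nil
}

func conditionTrue(conditions []corev1.NodeCondition, t corev1.NodeConditionType) bool {
	for _, c := range conditions {
		if c.Type == t {
			return c.Status == corev1.ConditionTrue
		}
	}
	return false
}

func podReady(pod *corev1.Pod) bool {
	for _, c := range pod.Status.Conditions {
		if c.Type == corev1.PodReady {
			return c.Status == corev1.ConditionTrue
		}
	}
	return false
}
```

A minimum-ready count could then be enforced by running this over all Nodes and blocking the rollout while fewer than the required number pass.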
Anything else you would like to add:
References:
/kind feature