Add KEP skeleton & initial proposal for in-place update of Pod resources #2908

kgolab · 2018-11-06T14:25:47Z

This proposal aims at allowing Pod resource requests & limits to be updated in-place, without a need to restart the Pod or its Containers.

This commit is aimed at starting the KEP process.

k8s-ci-robot · 2018-11-06T14:25:55Z

Hi @kgolab. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

k8s-ci-robot · 2018-11-06T14:26:04Z

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
To fully approve this pull request, please assign additional approvers.
We suggest the following additional approver: calebamiles

If they are not already assigned, you can assign the PR to them by writing /assign @calebamiles in a comment when ready.

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

keps/OWNERS

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

thockin · 2018-11-06T18:55:05Z

Can you detail here the relationship with #1719 which has a lot of feedback already? The doc says it builds upon it, but does that mean people need to digest that one first? Or that this replaces that one?

kgolab · 2018-11-07T16:23:11Z

Regarding #1719 - this KEP partially supersedes it:

"partially", since #1719 (proposal for live and in-place vertical scaling) was also concerned with StatefulSet-specific scenarios, which are not covered by this KEP;
"supersedes", as we'd like to suggest using the in-place update described in this KEP instead of the option presented in #1719, while leaving StatefulSet specifics to that PR.

The goal of this KEP is to provide building blocks for Controllers wanting to use in-place resource update but not concentrate on any specific Controller.

jdumars · 2018-11-15T08:12:04Z

/ok-to-test

davidopp · 2018-11-15T10:29:20Z

@bsalamat

bsalamat · 2018-11-16T02:12:52Z

/sig scheduling

bsalamat · 2018-11-16T02:13:22Z

/assign

justaugustus · 2018-11-20T04:35:34Z

REMINDER: KEPs are moving to k/enhancements on November 30. Please attempt to merge this KEP before then to signal consensus.
For more details on this change, review this thread.

Any questions regarding this move should be directed to that thread and not asked on GitHub.

justaugustus · 2018-12-01T08:03:41Z

KEPs have moved to k/enhancements.
This PR will be closed and any additional changes to this KEP should be submitted to k/enhancements.
For more details on this change, review this thread.

Any questions regarding this move should be directed to that thread and not asked on GitHub.
/close

k8s-ci-robot · 2018-12-01T08:03:43Z

@justaugustus: Closed this PR.

vinaykul · 2018-12-03T05:09:44Z

keps/sig-architecture/draft-20181106-in-place-update-of-pod-resources.md

+  but might be possible if some conditions change,
+* Rejected - resource update was rejected by any of the components involved.
+
+To provide some fine-grained control to the user,


Please review our early thoughts below:

Policy controls: After going through pull 1719, we feel it may be a good idea to have two distinct levels of policy control:

Pod level resize policy (from Slawomir’s feedback in our document), where the scheduler determines the resize action at pod level. Proposed policy values:

InPlacePreferred (default) - Resize the pod on current node if possible, reschedule if not.

InPlaceOnly - Resize the pod on current node, fail the request if not possible.

Reschedule - Always reschedule the pod (For potential use by VPA ‘Recreate’ mode)

Container level restart policy discussed in pull 1719 - cpu/memory restart/live-resize. To make it simpler, it may be sufficient to have ‘LiveResize’ and ‘Restart’ (default) policy options for cpu/memory that dictate whether a particular container will be restarted or resource-updated, depending on the policy and the resource type affected. If the UpdateContainerResources CRI API call fails, restart the container as a fallback. This should cover cases of JVM / legacy apps that cannot handle UpdateContainerResources. This policy is orthogonal to the pod level resize policy above.
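To make the container-level policy concrete, here is a minimal Go sketch of how the decision could look. Only the policy names LiveResize and Restart come from the comment above; the types, the function names, and the `update` callback (a stand-in for the UpdateContainerResources CRI call) are assumptions made for illustration, not part of any existing API.

```go
package main

import (
	"errors"
	"fmt"
)

// ContainerResizePolicy is a hypothetical per-resource policy type. The two
// values are the ones proposed in the comment; the Go type is an assumption.
type ContainerResizePolicy string

const (
	LiveResize ContainerResizePolicy = "LiveResize"
	Restart    ContainerResizePolicy = "Restart" // proposed default
)

// errUpdateFailed simulates a runtime that cannot apply a live update.
var errUpdateFailed = errors.New("runtime rejected resource update")

// needsRestart reports whether any changed resource (e.g. "cpu", "memory")
// is governed by the Restart policy. A resource without an explicit policy
// falls back to the default, Restart.
func needsRestart(policies map[string]ContainerResizePolicy, changed []string) bool {
	for _, res := range changed {
		if policies[res] != LiveResize {
			return true
		}
	}
	return false
}

// resizeContainer returns the action taken: "live-resize" or "restart".
// update stands in for the UpdateContainerResources CRI call; if it fails,
// the container is restarted as a fallback.
func resizeContainer(policies map[string]ContainerResizePolicy, changed []string, update func() error) string {
	if needsRestart(policies, changed) {
		return "restart"
	}
	if err := update(); err != nil {
		return "restart"
	}
	return "live-resize"
}

func main() {
	policies := map[string]ContainerResizePolicy{"cpu": LiveResize, "memory": Restart}
	succeed := func() error { return nil }
	fail := func() error { return errUpdateFailed }

	fmt.Println(resizeContainer(policies, []string{"cpu"}, succeed))
	fmt.Println(resizeContainer(policies, []string{"cpu", "memory"}, succeed))
	fmt.Println(resizeContainer(policies, []string{"cpu"}, fail))
}
```

Keeping the policy per resource type means a CPU-only change can be applied live even when memory changes for the same container require a restart.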

KEP design: I need to go over this more thoroughly. At first glance, we see the following flow:

A resources update to a controller’s Template.PodSpec will be propagated by the controller into PodSpec resource updates for its running pod instances, and the controller sets PodStatus.Conditions[type=PodResizeResources] to ResizeRequested.

Scheduler will use sum(PodSpec.Containers[].Resources) to perform pod resources accounting in updatePod (removePod / addPod) and decide the action. If it fits on the current node, it sets PodStatus.Conditions[type=PodResizeResources] to ActionUpdate.

Kubelet acts on the ActionUpdate, and applies the declarative values in UpdateContainerResources (or restart container per policy). Kubelet sets PodStatus.ContainerStatuses[].ResourcesAllocated to the declarative value, and sets PodResizeResources condition to Complete/Done, or Failed on any errors.

Handling multiple scheduler race-condition: Kubelet reruns pod admission predicates during HandlePodUpdates (perhaps just running PodFitsResources might suffice). If fit == false, kubelet reschedules the pod if pod resize policy == InPlacePreferred, and fails the operation if InPlaceOnly.
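The fit check and the policy-driven outcome described in the flow above can be sketched as follows. This is an illustration under stated assumptions: `Resources`, `sumResources`, `fits`, and `decide` are hypothetical names, the resource model is reduced to CPU and memory, and the outcome strings other than ActionUpdate are placeholders rather than proposed condition values.

```go
package main

import "fmt"

// PodResizePolicy holds the pod-level policy values proposed above.
type PodResizePolicy string

const (
	InPlacePreferred PodResizePolicy = "InPlacePreferred"
	InPlaceOnly      PodResizePolicy = "InPlaceOnly"
	Reschedule       PodResizePolicy = "Reschedule"
)

// Resources is a simplified stand-in for container resource requests.
type Resources struct {
	MilliCPU    int64
	MemoryBytes int64
}

// sumResources mirrors sum(PodSpec.Containers[].Resources).
func sumResources(containers []Resources) Resources {
	var total Resources
	for _, c := range containers {
		total.MilliCPU += c.MilliCPU
		total.MemoryBytes += c.MemoryBytes
	}
	return total
}

// fits reports whether the resized pod fits into what the node has left
// once all other pods are accounted for.
func fits(want, free Resources) bool {
	return want.MilliCPU <= free.MilliCPU && want.MemoryBytes <= free.MemoryBytes
}

// decide maps policy and fit result to an outcome. "ActionUpdate" is the
// condition value from the flow; "Reschedule" and "Failed" are placeholders.
func decide(policy PodResizePolicy, fit bool) string {
	switch {
	case policy == Reschedule:
		return "Reschedule" // always reschedule, regardless of fit
	case fit:
		return "ActionUpdate"
	case policy == InPlaceOnly:
		return "Failed"
	default: // InPlacePreferred and the pod does not fit
		return "Reschedule"
	}
}

func main() {
	pod := sumResources([]Resources{
		{MilliCPU: 500, MemoryBytes: 256 << 20},
		{MilliCPU: 250, MemoryBytes: 128 << 20},
	})
	free := Resources{MilliCPU: 600, MemoryBytes: 1 << 30}
	fit := fits(pod, free)
	fmt.Println(fit, decide(InPlacePreferred, fit), decide(InPlaceOnly, fit))
}
```

The same `decide` logic could be applied twice, once by the scheduler and again when the kubelet reruns its admission check, which is how the race between multiple schedulers would surface as a reschedule or a failure.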

Handling failure with roll-back & retry rather than letting the user handle failures: Resizing may fail at the scheduler due to a pod disruption budget or insufficient node resources gated by policy, or at the kubelet due to the multiple-scheduler race condition with the InPlaceOnly pod resize policy. We feel it may be worth doing a smart retry as the default mode of operation on resize failure at pod level. On failure, the controller queues the failed pod for resize retry. The retries are triggered by events such as pods leaving a node (InPlaceOnly - node insufficient resources case) and a PDBUpdate (PDB violation failure case). The retry approach seems to fit with the k8s paradigm.
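One way to picture the event-triggered retry is a queue keyed by the event that could unblock each failed resize. The sketch below is hypothetical throughout: the reason and event names, `RetryQueue`, and its methods are invented for illustration, and it ignores which node the event occurred on, which a real controller would need to match.

```go
package main

import "fmt"

// FailureReason and Event are hypothetical names for the two failure cases
// and their triggering events mentioned in the comment.
type FailureReason string
type Event string

const (
	InsufficientResources FailureReason = "InsufficientResources"
	PDBViolation          FailureReason = "PDBViolation"

	PodLeftNode Event = "PodLeftNode"
	PDBUpdate   Event = "PDBUpdate"
)

// retryTrigger maps each failure reason to the event that should retry it.
var retryTrigger = map[FailureReason]Event{
	InsufficientResources: PodLeftNode,
	PDBViolation:          PDBUpdate,
}

// RetryQueue holds failed pods until their triggering event arrives.
type RetryQueue struct {
	waiting map[Event][]string // pod names, in failure order
}

func NewRetryQueue() *RetryQueue {
	return &RetryQueue{waiting: make(map[Event][]string)}
}

// Fail records a failed resize. Reasons without a known trigger are not
// retried in this sketch.
func (q *RetryQueue) Fail(pod string, reason FailureReason) {
	ev, ok := retryTrigger[reason]
	if !ok {
		return
	}
	q.waiting[ev] = append(q.waiting[ev], pod)
}

// OnEvent returns the pods whose resize should be retried now and removes
// them from the queue.
func (q *RetryQueue) OnEvent(ev Event) []string {
	pods := q.waiting[ev]
	delete(q.waiting, ev)
	return pods
}

func main() {
	q := NewRetryQueue()
	q.Fail("web-0", InsufficientResources)
	q.Fail("db-0", PDBViolation)
	q.Fail("web-1", InsufficientResources)

	fmt.Println(q.OnEvent(PodLeftNode))
	fmt.Println(q.OnEvent(PDBUpdate))
}
```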

Handling resize requests when a resize operation is pending: queue requests and apply the discrete requests from queue one-by-one upon completion of inflight operation (success or failure)
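The one-at-a-time handling of pending resize requests could look roughly like the FIFO below. `ResizeRequest`, `ResizeQueue`, `Submit`, and `Complete` are hypothetical names used only for this sketch; it is single-goroutine and would need locking in a real controller.

```go
package main

import "fmt"

// ResizeRequest is a simplified, hypothetical resize request for one pod.
type ResizeRequest struct {
	Pod      string
	MilliCPU int64
}

// ResizeQueue allows one in-flight resize and queues the rest in FIFO order.
type ResizeQueue struct {
	inflight *ResizeRequest
	pending  []ResizeRequest
}

// Submit starts req immediately if nothing is in flight and returns true;
// otherwise it enqueues req and returns false.
func (q *ResizeQueue) Submit(req ResizeRequest) bool {
	if q.inflight == nil {
		q.inflight = &req
		return true
	}
	q.pending = append(q.pending, req)
	return false
}

// Complete marks the in-flight operation as finished (success or failure)
// and starts the next queued request, if any.
func (q *ResizeQueue) Complete() (ResizeRequest, bool) {
	q.inflight = nil
	if len(q.pending) == 0 {
		return ResizeRequest{}, false
	}
	next := q.pending[0]
	q.pending = q.pending[1:]
	q.inflight = &next
	return next, true
}

func main() {
	q := &ResizeQueue{}
	fmt.Println(q.Submit(ResizeRequest{Pod: "web-0", MilliCPU: 500}))
	fmt.Println(q.Submit(ResizeRequest{Pod: "web-0", MilliCPU: 750}))
	fmt.Println(q.Submit(ResizeRequest{Pod: "web-0", MilliCPU: 1000}))

	for {
		next, ok := q.Complete()
		if !ok {
			break
		}
		fmt.Println("now applying", next.MilliCPU)
	}
}
```

Applying every queued request in order is what the comment proposes; an alternative design would collapse the queue to the latest request, at the cost of skipping intermediate values.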

Please let me know how this sounds, very likely I’m missing some details.

@DirectXMan12 @bskiba @derekwaynecarr

bsalamat · 2019-01-04T00:47:26Z

@kgolab Do you plan to move this KEP to the new repo? Also, as mentioned above, there are several parallel efforts/proposals trying to address the same problem. It would be best if you folks could join efforts and merge the ideas into a single KEP.

resouer · 2019-01-08T18:46:35Z

@bsalamat Yes, for instance we do have a patch of in-place update for the latest Kubernetes. I will try to ping @kgolab to see if a joint-effort KEP is possible.

kgolab · 2019-01-10T09:13:16Z

@bsalamat, @resouer, @vinaykul - yes I do plan to pick up this topic again, most likely early next week.
Sorry it's taking so long :(

vinaykul · 2019-01-10T21:52:04Z

> @bsalamat, @resouer, @vinaykul - yes I do plan to pick up this topic again, most likely early next week.
> Sorry it's taking so long :(

@kgolab I joined the SIG-node weekly meeting this Tuesday and brought this topic up. I mentioned, based on my chat with @bskiba and @DirectXMan12 during last KubeCon, that we were looking for sig-node to review and comment on the proposal. @bsalamat and I talked about this as well at KubeCon, and he has been very helpful and supportive of this effort.

@dchen1107 seemed open to it, and suggested that we move the draft KEP into the new process and create a starting point for them.

If you don't mind, I can move your draft-KEP over to k/enhancements/kep/sig-autoscaling. I have blocked out some time tomorrow to closely review the new KEP process and get this moved over, and send a PR.

Or would you prefer to drive it next week?

resouer · 2019-01-10T22:36:57Z

@vinaykul I think Karol is planning to start the new KEP considering all his efforts in this old one.

At the same time, not sure if it's possible for you and other folks interested in this feature to draft a simple requirements/use cases/design doc from your own perspective?

vinaykul · 2019-01-10T23:48:47Z

> @vinaykul I think Karol is planning to start the new KEP considering all his efforts in this old one.
>
> At the same time, not sure if it's possible for you and other folks interested in this feature draft a simple requirements/use cases/design doc from your own perspective?

@resouer The above merged draft KEP builds on our design and the IBM proposal. Please review the following thread: https://groups.google.com/forum/#!topic/kubernetes-sig-scheduling/UnIhGOKpohI

The design doc linked in that thread documents our requirements and use case, and there are links to the implementation to try out as well. It would be great if you could review it and provide any feedback to help evolve the merged design.

resouer · 2019-01-11T05:41:31Z

@vinaykul That would be great! I am drafting a requirements doc from our side to help with the joint-effort KEP as well. Will share it next week.

kgolab · 2019-01-11T08:27:13Z

@vinaykul, if you have time to move the draft to new KEP process, that would be great. If not I'll pick it up as written earlier.

@resouer, please let us know once you have your requirements ready, it would be good to include them in the merged KEP so it's really a joint one.

vinaykul · 2019-01-12T02:41:14Z

> @vinaykul, if you have time to move the draft to new KEP process, that would be great. If not I'll pick it up as written earlier.
>
> @resouer, please let us know once you have your requirements ready, it would be good to include them in the merged KEP so it's really a joint one.

@kgolab I went through the KEP template & documentation in k/enhancements, and no changes were needed to your initial KEP besides changing the owning-sig to sig-autoscaling and adding an initial set of reviewers from sig-scheduling and sig-node, where I think the bulk of the code changes will occur. I'll fix it if sig-autoscaling is not the right owner. To get the ball rolling, I plan to join the weekly sig-node meeting next Tuesday and bring this up for review.

@resouer Please share your requirements and implementation as it can help evolve this to address all known use cases.

resouer · 2019-03-12T00:46:41Z

@vinaykul The requirements from my side have been included in our team member's comment here: kubernetes/enhancements#686 (comment)

vinaykul · 2019-03-12T05:10:06Z

> @vinaykul The requirements from my side has been included by our team member's comment here: kubernetes/enhancements#686 (comment)

@resouer Yes, I kept those requirements in mind while updating the KEP flow control. We can have a policy guide whether the initiating actor (e.g. job/deploy controller) takes no action (default), retries, or reschedules; I'm yet to add this part to the KEP. Resizing ephemeral storage (while I've not scoped for it in the KEP) should be a doable extension, and the currently defined policies may be valid choices for it as well.

As for not changing PodSpec, as discussed in that comment, it is a reasonable change. Annotations are a security concern, as they can allow a user to mess up the system.

Add KEP skeleton & initial proposal for in-place update of Pod resources

4ad6fa7

k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Nov 6, 2018

k8s-ci-robot requested review from bgrant0607 and jdumars November 6, 2018 14:25

k8s-ci-robot added kind/kep sig/architecture Categorizes an issue or PR as relevant to SIG Architecture. labels Nov 6, 2018

k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Nov 15, 2018

k8s-ci-robot added the sig/scheduling Categorizes an issue or PR as relevant to SIG Scheduling. label Nov 16, 2018

k8s-ci-robot assigned bsalamat Nov 16, 2018

kgolab mentioned this pull request Nov 26, 2018

proposal for live and in-place vertical scaling #1719

Closed

k8s-ci-robot closed this Dec 1, 2018

vinaykul reviewed Dec 3, 2018

vinaykul mentioned this pull request Jan 12, 2019

KEP: in-place update of pod resources kubernetes/enhancements#686

Merged
