From 29d95bc48585400cb0dbeaebe237ea030ae83a01 Mon Sep 17 00:00:00 2001
From: Vinay Kulkarni
Date: Mon, 28 Oct 2019 16:26:00 -0700
Subject: [PATCH] Address key open items and move KEP to implementable state

---
 ...181106-in-place-update-of-pod-resources.md | 21 +++++++++++++++++--
 1 file changed, 19 insertions(+), 2 deletions(-)

diff --git a/keps/sig-autoscaling/20181106-in-place-update-of-pod-resources.md b/keps/sig-autoscaling/20181106-in-place-update-of-pod-resources.md
index d50c42a734c8..8e9a17bdda9c 100644
--- a/keps/sig-autoscaling/20181106-in-place-update-of-pod-resources.md
+++ b/keps/sig-autoscaling/20181106-in-place-update-of-pod-resources.md
@@ -23,8 +23,8 @@ approvers:
   - "@mwielgus"
 editor: TBD
 creation-date: 2018-11-06
-last-updated: 2018-11-06
-status: provisional
+last-updated: 2019-10-25
+status: implementable
 see-also:
 replaces:
 superseded-by:
@@ -49,6 +49,7 @@ superseded-by:
 * [Scheduler and API Server interaction](#scheduler-and-api-server-interaction)
 * [Flow Control](#flow-control)
 * [Container resource limit update ordering](#container-resource-limit-update-ordering)
+ * [Container resource limit update failure handling](#container-resource-limit-update-failure-handling)
 * [Notes](#notes)
 * [Affected Components](#affected-components)
 * [Future Enhancements](#future-enhancements)
@@ -167,6 +168,12 @@ Kubelet calls UpdateContainerResources CRI API which currently takes
 but not for Windows. This parameter changes to *runtimeapi.ContainerResources*,
 that is runtime agnostic, and will contain platform-specific information.
 
+Additionally, GetContainerResources CRI API is introduced that allows Kubelet
+to query currently configured CPU and memory limits for a container.
+
+These CRI changes are a separate effort that does not affect the design
+proposed in this KEP.
+
 ### Kubelet and API Server Interaction
 
 When a new Pod is created, Scheduler is responsible for selecting a suitable
@@ -283,6 +290,16 @@ updates resource limit for the Pod and its Containers in the following manner:
 In all the above cases, Kubelet applies Container resource limit decreases
 before applying limit increases.
 
+#### Container resource limit update failure handling
+
+If multiple Containers in a Pod are being updated, and UpdateContainerResources
+CRI API fails for any of the containers, Kubelet will backoff and retry at a
+later time. Kubelet does not attempt to update limits for containers that are
+lined up for update after the failing container. This ensures that sum of the
+container limits does not exceed Pod-level cgroup limit at any point. Once all
+the container limits have been successfully updated, Kubelet updates the Pod's
+Status.ContainerStatuses[i].Resources to match the desired limit values.
+
 #### Notes
 
 * If CPU Manager policy for a Node is set to 'static', then only integral