Skip to content

Commit

Permalink
In-Place Pod Reszie Beta
Browse files Browse the repository at this point in the history
  • Loading branch information
tallclair committed Nov 19, 2024
1 parent e3f0368 commit a09eb44
Show file tree
Hide file tree
Showing 3 changed files with 37 additions and 26 deletions.
2 changes: 1 addition & 1 deletion content/en/docs/concepts/workloads/autoscaling.md
Original file line number Diff line number Diff line change
Expand Up @@ -79,7 +79,7 @@ Mode | Description

#### Requirements for in-place resizing

{{< feature-state for_k8s_version="v1.27" state="alpha" >}}
{{< feature-state feature_gate_name="InPlacePodVerticalScaling" >}}

Resizing a workload in-place **without** restarting the {{< glossary_tooltip text="Pods" term_id="pod" >}}
or its {{< glossary_tooltip text="Containers" term_id="container" >}} requires Kubernetes version 1.27 or later.
Expand Down
2 changes: 1 addition & 1 deletion content/en/docs/reference/node/kubelet-files.md
Original file line number Diff line number Diff line change
Expand Up @@ -94,7 +94,7 @@ The name of a checkpoint file is `kubelet_internal_checkpoint` for [Device Manag
If your cluster has
[in-place Pod vertical scaling](/docs/concepts/workloads/autoscaling/#in-place-resizing)
enabled ([feature gate](/docs/reference/command-line-tools-reference/feature-gates/)
name `InPlacePodVerticalScaling`), then the kubelet stores a local record of Pod status.
name `InPlacePodVerticalScaling`), then the kubelet stores a local record of allocated Pod resources.

The file name is `pod_status_manager_state` within the kubelet base directory
(`/var/lib/kubelet` by default on Linux; configurable using `--root-dir`).
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -24,27 +24,32 @@ to be enabled. The alternative is to delete the Pod and let the
[workload controller](/docs/concepts/workloads/controllers/) make a replacement Pod
that has a different resource requirement.

A resize request is made through the pod `/resize` subresource, which takes the full updated pod for
an update request, or a patch on the pod object for a patch request.

For in-place resize of pod resources:
- Container's resource `requests` and `limits` are _mutable_ for CPU
and memory resources.
- `allocatedResources` field in `containerStatuses` of the Pod's status reflects
the resources allocated to the pod's containers.
- `resources` field in `containerStatuses` of the Pod's status reflects the
actual resource `requests` and `limits` that are configured on the running
containers as reported by the container runtime.
- `resize` field in the Pod's status shows the status of the last requested
- A container's resource `requests` and `limits` are _mutable_ for CPU
and memory resources. These fields represent the _desired_ resources for the container.
- The `resources` field in `containerStatuses` of the Pod's status reflects the resources
_allocated_ to the pod's containers. For running containers, this reflects the actual resource
`requests` and `limits` that are configured as reported by the container runtime. For non-running
containers, these are the resources allocated for the container when it starts.
- The `resize` field in the Pod's status shows the status of the last requested
pending resize. It can have the following values:
- `Proposed`: This value indicates an acknowledgement of the requested resize
and that the request was validated and recorded.
- `Proposed`: This value indicates that a pod was resized, but the Kubelet has not yet processed
the resize.
- `InProgress`: This value indicates that the node has accepted the resize
request and is in the process of applying it to the pod's containers.
- `Deferred`: This value means that the requested resize cannot be granted at
this time, and the node will keep retrying. The resize may be granted when
other pods leave and free up node resources.
other pods are removed and free up node resources.
- `Infeasible`: is a signal that the node cannot accommodate the requested
resize. This can happen if the requested resize exceeds the maximum
resources the node can ever allocate for a pod.

If a node has pods with an incomplete resize, the scheduler will compute the pod requests from the
maximum of a container's desired resource requests, and it's actual requests reported in the status.


## {{% heading "prerequisites" %}}

Expand Down Expand Up @@ -107,6 +112,21 @@ have changed, the container will be restarted in order to resize its memory.
<!-- steps -->
## Limitations
In-place resize of pod resources currently has the following limitations:
- Only CPU and memory resources can be changed.
- Pod QoS Class cannot change. This means that requests must continue to equal limits for Guaranteed
pods, Burstable pods cannot set requests and limits to be equal for both CPU & memory, and you
cannot add resource requirements to Best Effort pods.
- Init containers and Ephemeral Containers cannot be resized.
- Resource requests and limits cannot be removed once set.
- A container's memory limit may not be reduced below its usage. If a request puts a container in
this state, the resize status will remain in `InProgress` until the desired memory limit becomes
feasible.
- Windows pods cannot be resized.


## Create a pod with resource requests and limits

Expand Down Expand Up @@ -159,9 +179,6 @@ spec:
name: qos-demo-ctr-5
ready: true
...
allocatedResources:
cpu: 700m
memory: 200Mi
resources:
limits:
cpu: 700m
Expand Down Expand Up @@ -190,7 +207,7 @@ resources, you cannot change the QoS class in which the Pod was created.
Now, patch the Pod's Container with CPU requests & limits both set to `800m`:

```shell
kubectl -n qos-example patch pod qos-demo-5 --patch '{"spec":{"containers":[{"name":"qos-demo-ctr-5", "resources":{"requests":{"cpu":"800m"}, "limits":{"cpu":"800m"}}}]}}'
kubectl -n qos-example patch pod qos-demo-5 --subresource resize --patch '{"spec":{"containers":[{"name":"qos-demo-ctr-5", "resources":{"requests":{"cpu":"800m"}, "limits":{"cpu":"800m"}}}]}}'
```

Query the Pod's detailed information after the Pod has been patched.
Expand All @@ -215,9 +232,6 @@ spec:
...
containerStatuses:
...
allocatedResources:
cpu: 800m
memory: 200Mi
resources:
limits:
cpu: 800m
Expand All @@ -229,12 +243,9 @@ spec:
started: true
```

Observe that the `allocatedResources` values have been updated to reflect the new
desired CPU requests. This indicates that node was able to accommodate the
increased CPU resource needs.

In the Container's status, updated CPU resource values shows that new CPU
resources have been applied. The Container's `restartCount` remains unchanged,
Observe that the `resources` in the `containerStatuses` have been updated to reflect the new desired
CPU requests. This indicates that node was able to accommodate the increased CPU resource needs.c,
and the new CPU resources have been applied. The Container's `restartCount` remains unchanged,
indicating that container's CPU resources were resized without restarting the container.


Expand Down

0 comments on commit a09eb44

Please sign in to comment.