Update In-Place Pod Resize docs for v1.32 #48503

Merged 2 commits on Nov 27, 2024
2 changes: 1 addition & 1 deletion content/en/docs/concepts/workloads/autoscaling.md
@@ -79,7 +79,7 @@ Mode | Description

#### Requirements for in-place resizing

{{< feature-state for_k8s_version="v1.27" state="alpha" >}}
{{< feature-state feature_gate_name="InPlacePodVerticalScaling" >}}
tallclair marked this conversation as resolved.

Resizing a workload in-place **without** restarting the {{< glossary_tooltip text="Pods" term_id="pod" >}}
or its {{< glossary_tooltip text="Containers" term_id="container" >}} requires Kubernetes version 1.27 or later.
2 changes: 1 addition & 1 deletion content/en/docs/reference/node/kubelet-files.md
@@ -94,7 +94,7 @@ The name of a checkpoint file is `kubelet_internal_checkpoint` for [Device Manag
If your cluster has
[in-place Pod vertical scaling](/docs/concepts/workloads/autoscaling/#in-place-resizing)
enabled ([feature gate](/docs/reference/command-line-tools-reference/feature-gates/)
name `InPlacePodVerticalScaling`), then the kubelet stores a local record of Pod status.
name `InPlacePodVerticalScaling`), then the kubelet stores a local record of allocated Pod resources.

The file name is `pod_status_manager_state` within the kubelet base directory
(`/var/lib/kubelet` by default on Linux; configurable using `--root-dir`).
@@ -24,26 +24,33 @@ to be enabled. The alternative is to delete the Pod and let the
[workload controller](/docs/concepts/workloads/controllers/) make a replacement Pod
that has a different resource requirement.

A resize request is made through the pod `/resize` subresource, which takes the full updated pod for
Review comment (Contributor): nit: use Pod, not pod throughout this diff.

an update request, or a patch on the pod object for a patch request.
Suggested change (Contributor):
an update request, or a patch on the `Pod` object for a patch request.


For in-place resize of pod resources:
- Container's resource `requests` and `limits` are _mutable_ for CPU
and memory resources.
- `allocatedResources` field in `containerStatuses` of the Pod's status reflects
the resources allocated to the pod's containers.
- `resources` field in `containerStatuses` of the Pod's status reflects the
actual resource `requests` and `limits` that are configured on the running
containers as reported by the container runtime.
- `resize` field in the Pod's status shows the status of the last requested
- A container's resource `requests` and `limits` are _mutable_ for CPU
and memory resources. These fields represent the _desired_ resources for the container.
- The `resources` field in `containerStatuses` of the Pod's status reflects the resources
_allocated_ to the pod's containers. For running containers, this reflects the actual resource
`requests` and `limits` that are configured as reported by the container runtime. For non-running
containers, these are the resources allocated for the container when it starts.
Suggested change (Contributor) on lines +33 to +36:
- The `containerStatuses.resources` field in the Pod `status` field reflects the resources
that are _allocated_ to the Pod's containers. For running containers, these are the actual values
of `requests` and `limits`, as reported by the container runtime. For non-running containers,
these are the resources that are allocated for when the container starts.

- The `resize` field in the Pod's status shows the status of the last requested
tallclair marked this conversation as resolved.
pending resize. It can have the following values:
- `Proposed`: This value indicates an acknowledgement of the requested resize
and that the request was validated and recorded.
- `Proposed`: This value indicates that a pod was resized, but the Kubelet has not yet processed
the resize.
Suggested change (Contributor) on lines +39 to +40:
- `Proposed`: this value indicates that a Pod resize request was received, but the
kubelet has not yet processed the resize.

- `InProgress`: This value indicates that the node has accepted the resize
request and is in the process of applying it to the pod's containers.
- `Deferred`: This value means that the requested resize cannot be granted at
this time, and the node will keep retrying. The resize may be granted when
other pods leave and free up node resources.
other pods are removed and free up node resources.
Suggested change (Contributor):
other pods are removed and the available node resource capacity increases.

- `Infeasible`: is a signal that the node cannot accommodate the requested
resize. This can happen if the requested resize exceeds the maximum
resources the node can ever allocate for a pod.
- `""`: An empty or unset value indicates that the last resize completed. This should only be the
case if the resources in the container spec match the resources in the container status.
Suggested change (Contributor) on lines +49 to +50:
- `""`: An empty or unset value indicates that the most recent pending resize was completed.
An empty or unset value only appears if the resources in the container specification match
the resources in the container status.
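
The `resize` values listed above lend themselves to a simple dispatch in a script that waits on a resize. The following is a minimal sketch only: the function name `resize_next_step` and the strings it prints are hypothetical and are not part of kubectl or the Kubernetes API. The commented `kubectl` line assumes the `qos-example` namespace and `qos-demo-5` Pod used later in this page.

```shell
# Hypothetical helper: map a Pod's status.resize value to what a caller
# might do next. The returned strings are illustrative only.
resize_next_step() {
  case "$1" in
    Proposed|InProgress) echo "wait" ;;
    Deferred)            echo "wait: the node keeps retrying" ;;
    Infeasible)          echo "stop: the node cannot accommodate the request" ;;
    "")                  echo "done" ;;
    *)                   echo "unknown status: $1" ;;
  esac
}

# With a cluster available, the value could be read with:
#   status=$(kubectl -n qos-example get pod qos-demo-5 -o jsonpath='{.status.resize}')
status="Deferred"   # sample value so the sketch runs without a cluster
resize_next_step "$status"
```

Treating the empty value as "done" follows the description above, which says an empty value should only appear when the container spec and container status agree.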


If a node has pods with an incomplete resize, the scheduler will compute the pod requests from the
maximum of a container's desired resource requests and its actual requests reported in the status.
Suggested change (Contributor) on lines +52 to +53:
If a node has Pods with an incomplete resize, the scheduler uses the largest of the following values to
calculate the Pod requests:
* The desired resource requests for all containers in the Pod, as specified in the
`spec.containers.resources.requests` field.
* The actual resource requests for all containers in the Pod, as reported in the
`status.containerStatuses.resources.requests` field.
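
The arithmetic behind this rule can be sketched in a few lines of shell. This is an illustration under stated assumptions, not the scheduler's implementation: it follows the PR's original wording (a per-container maximum of desired and actual requests), sums the result over containers, treats CPU values as plain millicore integers, and ignores init containers and Pod overhead. The function names `effective_request` and `pod_request` are hypothetical.

```shell
# Effective request for one container during an incomplete resize:
# the larger of the desired (spec) and actual (status) values.
# Values are CPU millicores as plain integers (700 means 700m).
effective_request() {
  desired=$1
  actual=$2
  if [ "$desired" -gt "$actual" ]; then
    echo "$desired"
  else
    echo "$actual"
  fi
}

# Sum the effective requests over "desired:actual" pairs, one per container.
pod_request() {
  total=0
  for pair in "$@"; do
    total=$((total + $(effective_request "${pair%%:*}" "${pair##*:}")))
  done
  echo "$total"
}

# Container 1 is growing from 700m to 800m; container 2 is shrinking
# from 500m to 300m. The larger value counts in both cases.
pod_request 800:700 300:500   # prints 1300
```

Taking the maximum in both directions means a Pod that is shrinking still counts at its old size, and a Pod that is growing already counts at its new size, until the resize completes.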



## {{% heading "prerequisites" %}}
@@ -107,6 +114,21 @@ have changed, the container will be restarted in order to resize its memory.

<!-- steps -->

## Limitations
Review comment (Contributor): This belongs further up. Make this either a subheading of the prerequisites heading or a main heading before the prerequisites heading.


In-place resize of pod resources currently has the following limitations:
Suggested change (Contributor):
In-place resize of Pod resources has the following limitations:


- Only CPU and memory resources can be changed.
- Pod QoS Class cannot change. This means that requests must continue to equal limits for Guaranteed
pods, Burstable pods cannot set requests and limits to be equal for both CPU & memory, and you
cannot add resource requirements to Best Effort pods.
Suggested change (Contributor) on lines +122 to +124:
- The Pod QoS Class cannot change. The following considerations continue to apply:
  - Guaranteed QoS Pods: requests must be equal to limits.
  - Burstable QoS Pods: requests can't be equal to limits for both CPU and memory
  - Best Effort QoS Pods: you can't set resource requirements

- Init containers and Ephemeral Containers cannot be resized.
Suggested change (Contributor):
- Init containers and ephemeral containers cannot be resized.

- Resource requests and limits cannot be removed once set.
- A container's memory limit may not be reduced below its usage. If a request puts a container in
this state, the resize status will remain in `InProgress` until the desired memory limit becomes
feasible.
Suggested change (Contributor) on lines +127 to +129:
- A container's memory limit can't be reduced below the memory usage. If a request puts a
container in this state, the resize will remain in the `InProgress` state until the desired
memory limit becomes feasible.

- Windows pods cannot be resized.
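
The QoS class limitation can be checked before sending a resize. The following is a rough sketch for a single-container Pod: the function name `qos_class` is hypothetical, and the check compares quantities as strings, so `700m` and `0.7` would be treated as different even though Kubernetes considers them equal.

```shell
# Hypothetical helper: classify a single-container Pod's QoS class from its
# CPU and memory requests and limits. Pass "" for a value that is not set.
# Quantities are compared as strings, which is a simplification.
qos_class() {
  cpu_req=$1; cpu_lim=$2; mem_req=$3; mem_lim=$4
  # An unset request defaults to the limit when the limit is set.
  : "${cpu_req:=$cpu_lim}"
  : "${mem_req:=$mem_lim}"
  if [ -z "$cpu_req$cpu_lim$mem_req$mem_lim" ]; then
    echo BestEffort
  elif [ -n "$cpu_lim" ] && [ -n "$mem_lim" ] &&
       [ "$cpu_req" = "$cpu_lim" ] && [ "$mem_req" = "$mem_lim" ]; then
    echo Guaranteed
  else
    echo Burstable
  fi
}

# Compare the class before and after a proposed CPU resize.
before=$(qos_class 700m 700m 200Mi 200Mi)
after=$(qos_class 800m 800m 200Mi 200Mi)
if [ "$before" = "$after" ]; then
  echo "QoS class unchanged: $after"
else
  echo "QoS class would change from $before to $after"
fi
```

A resize that raises only the CPU limit of this Pod (for example `qos_class 700m 800m 200Mi 200Mi`) would classify as Burstable, which is the kind of change the limitation rules out.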


## Create a pod with resource requests and limits

@@ -159,9 +181,6 @@ spec:
name: qos-demo-ctr-5
ready: true
...
allocatedResources:
cpu: 700m
memory: 200Mi
resources:
limits:
cpu: 700m
@@ -190,7 +209,7 @@ resources, you cannot change the QoS class in which the Pod was created.
Now, patch the Pod's Container with CPU requests & limits both set to `800m`:

```shell
kubectl -n qos-example patch pod qos-demo-5 --patch '{"spec":{"containers":[{"name":"qos-demo-ctr-5", "resources":{"requests":{"cpu":"800m"}, "limits":{"cpu":"800m"}}}]}}'
kubectl -n qos-example patch pod qos-demo-5 --subresource resize --patch '{"spec":{"containers":[{"name":"qos-demo-ctr-5", "resources":{"requests":{"cpu":"800m"}, "limits":{"cpu":"800m"}}}]}}'
```
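
Because the patch body is a single long JSON string, it can help to build it from parameters and inspect it before sending. This is a minimal sketch: the helper name `build_cpu_resize_patch` is hypothetical, and it covers only the case shown above, where the CPU request and limit are set to the same value.

```shell
# Hypothetical helper: build a JSON patch body that sets a container's CPU
# request and limit to the same value.
build_cpu_resize_patch() {
  container=$1
  cpu=$2
  printf '{"spec":{"containers":[{"name":"%s","resources":{"requests":{"cpu":"%s"},"limits":{"cpu":"%s"}}}]}}\n' \
    "$container" "$cpu" "$cpu"
}

# Print the patch so it can be inspected before it is sent.
build_cpu_resize_patch qos-demo-ctr-5 800m

# To apply it (needs a cluster with the InPlacePodVerticalScaling feature
# gate enabled):
#   kubectl -n qos-example patch pod qos-demo-5 --subresource resize \
#     --patch "$(build_cpu_resize_patch qos-demo-ctr-5 800m)"
```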

Query the Pod's detailed information after the Pod has been patched.
@@ -215,9 +234,6 @@ spec:
...
containerStatuses:
...
allocatedResources:
cpu: 800m
memory: 200Mi
resources:
limits:
cpu: 800m
@@ -229,12 +245,9 @@ spec:
started: true
```

Observe that the `allocatedResources` values have been updated to reflect the new
desired CPU requests. This indicates that node was able to accommodate the
increased CPU resource needs.

In the Container's status, updated CPU resource values shows that new CPU
resources have been applied. The Container's `restartCount` remains unchanged,
Observe that the `resources` in the `containerStatuses` have been updated to reflect the new desired
CPU requests. This indicates that the node was able to accommodate the increased CPU resource needs,
and the new CPU resources have been applied. The Container's `restartCount` remains unchanged,
indicating that the container's CPU resources were resized without restarting the container.

