[FG:InPlacePodVerticalScaling] Add UpdatePodSandboxResources CRI method #128123

Open
wants to merge 4 commits into
base: master

Conversation

felipeagger

@felipeagger felipeagger commented Oct 16, 2024

What type of PR is this?

/kind feature

What this PR does / why we need it:

As of Kubernetes v1.20, the CRI has included support for in-place resizing of containers via the UpdateContainerResources API, which is implemented by both containerd and CRI-O. Additionally, the ContainerStatus message includes a ContainerResources field, which reports the current resource configuration of the container.

Even though pod-level cgroups are currently managed by the Kubelet, runtimes may need to be notified when the resource configuration changes.

Doc: https://github.com/kubernetes/enhancements/blob/master/keps/sig-node/1287-in-place-update-pod-resources/README.md#cri-changes
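
As a rough illustration of the notification described above, here is a minimal sketch of a caller handing the updated pod-level resources to a runtime. The types LinuxResources, UpdateSandboxRequest, SandboxNotifier, and recordingRuntime, and the helper notifySandboxResize, are hypothetical stand-ins written for this example; they are not the generated CRI types, and the real message fields may differ.

```go
package main

import "fmt"

// LinuxResources is a simplified, hypothetical stand-in for the resource
// settings a runtime might receive; only a few fields are modeled.
type LinuxResources struct {
	CPUShares          int64
	CPUQuota           int64
	CPUPeriod          int64
	MemoryLimitInBytes int64
}

// UpdateSandboxRequest is a hypothetical stand-in for the request carried by
// an UpdatePodSandboxResources call. It is not the real CRI message.
type UpdateSandboxRequest struct {
	PodSandboxID string
	Resources    LinuxResources
}

// SandboxNotifier is the hypothetical slice of a runtime client that this
// sketch depends on.
type SandboxNotifier interface {
	UpdatePodSandboxResources(req UpdateSandboxRequest) error
}

// recordingRuntime is a fake runtime that remembers the last request it saw.
type recordingRuntime struct {
	last *UpdateSandboxRequest
}

func (r *recordingRuntime) UpdatePodSandboxResources(req UpdateSandboxRequest) error {
	if req.PodSandboxID == "" {
		return fmt.Errorf("empty pod sandbox ID")
	}
	r.last = &req
	return nil
}

// notifySandboxResize tells the runtime which pod-level resources are now in
// effect for the given sandbox.
func notifySandboxResize(rt SandboxNotifier, sandboxID string, res LinuxResources) error {
	return rt.UpdatePodSandboxResources(UpdateSandboxRequest{
		PodSandboxID: sandboxID,
		Resources:    res,
	})
}

func main() {
	rt := &recordingRuntime{}
	res := LinuxResources{
		CPUShares:          512,
		CPUQuota:           50000,
		CPUPeriod:          100000,
		MemoryLimitInBytes: 256 << 20,
	}
	if err := notifySandboxResize(rt, "sandbox-123", res); err != nil {
		fmt.Println("notify failed:", err)
		return
	}
	fmt.Printf("runtime saw sandbox %s with memory limit %d\n",
		rt.last.PodSandboxID, rt.last.Resources.MemoryLimitInBytes)
}
```

The fake runtime only records the request; a real runtime would decide for itself what, if anything, to do with the new values.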

Which issue(s) this PR fixes:

Fixes #128069

Special notes for your reviewer:

Does this PR introduce a user-facing change?

NONE

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:


@k8s-ci-robot k8s-ci-robot added release-note-none Denotes a PR that doesn't merit a release note. do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. kind/feature Categorizes issue or PR as related to a new feature. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Oct 16, 2024
@k8s-ci-robot
Contributor

This issue is currently awaiting triage.

If a SIG or subproject determines this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot k8s-ci-robot added needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Oct 16, 2024
@k8s-ci-robot
Contributor

Hi @felipeagger. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.


@k8s-ci-robot k8s-ci-robot added the needs-priority Indicates a PR lacks a `priority/foo` label and requires one. label Oct 16, 2024
@k8s-ci-robot k8s-ci-robot added area/kubelet sig/node Categorizes an issue or PR as relevant to SIG Node. and removed do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Oct 16, 2024
@felipeagger felipeagger force-pushed the feat/add-updatepodsandbox-cri-method branch 3 times, most recently from eb28f8b to 7fd7068 Compare October 16, 2024 12:18
@felipeagger
Author

@Karthik-K-N Where will the Kubelet call UpdatePodSandboxResources after it has reconfigured the pod-level cgroups?
Can you send me the function/file name?

@Karthik-K-N
Contributor

> @Karthik-K-N Where will the Kubelet call UpdatePodSandboxResources after it has reconfigured the pod-level cgroups? Can you send me the function/file name?

I am not sure either; I need to explore. If I find it, I will update here. Thank you.

@felipeagger felipeagger force-pushed the feat/add-updatepodsandbox-cri-method branch from 7fd7068 to 857c205 Compare October 16, 2024 16:09
@felipeagger
Author

felipeagger commented Oct 16, 2024

/cc @vinaykul
Can you help us, please?

Where will the Kubelet call UpdatePodSandboxResources after it has reconfigured the pod-level cgroups?

Inside kubeGenericRuntimeManager.createPodSandbox?

@k8s-ci-robot k8s-ci-robot requested a review from vinaykul October 16, 2024 16:15
@vinaykul
Copy link
Member

#128069

Please see:

// Memory and CPU are updated separately because memory resizes may be ordered differently than CPU resizes.
// If resize results in net pod resource increase, set pod cgroup config before resizing containers.
// If resize results in net pod resource decrease, set pod cgroup config after resizing containers.
// If an error occurs at any point, abort. Let future syncpod iterations retry the unfinished stuff.
resizeContainers := func(rName v1.ResourceName, currPodCgLimValue, newPodCgLimValue, currPodCgReqValue, newPodCgReqValue int64) error {
	var err error
	if newPodCgLimValue > currPodCgLimValue {
		if err = setPodCgroupConfig(rName, true); err != nil {
			return err
		}
	}
	if newPodCgReqValue > currPodCgReqValue {
		if err = setPodCgroupConfig(rName, false); err != nil {
			return err
		}
	}
	if len(podContainerChanges.ContainersToUpdate[rName]) > 0 {
		if err = m.updatePodContainerResources(pod, rName, podContainerChanges.ContainersToUpdate[rName]); err != nil {
			klog.ErrorS(err, "updatePodContainerResources failed", "pod", format.Pod(pod), "resource", rName)
			return err
		}
	}
	if newPodCgLimValue < currPodCgLimValue {
		err = setPodCgroupConfig(rName, true)
	}
	if newPodCgReqValue < currPodCgReqValue {
		if err = setPodCgroupConfig(rName, false); err != nil {
			return err
		}
	}
	return err
}

/cc @tallclair

@vinaykul
Member

/ok-to-test

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Oct 17, 2024
@k8s-ci-robot k8s-ci-robot added sig/windows Categorizes an issue or PR as relevant to SIG Windows. wg/device-management Categorizes an issue or PR as relevant to WG Device Management. do-not-merge/contains-merge-commits Indicates a PR which contains merge commits. and removed needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. labels Nov 6, 2024
@felipeagger felipeagger force-pushed the feat/add-updatepodsandbox-cri-method branch from 5698a36 to 8d66773 Compare November 6, 2024 11:32
@k8s-ci-robot k8s-ci-robot added needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. and removed do-not-merge/contains-merge-commits Indicates a PR which contains merge commits. labels Nov 6, 2024
@felipeagger felipeagger closed this Nov 6, 2024
@felipeagger felipeagger force-pushed the feat/add-updatepodsandbox-cri-method branch from 8d66773 to 50d0f92 Compare November 6, 2024 11:39
@k8s-ci-robot k8s-ci-robot added size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. and removed size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels Nov 6, 2024
@felipeagger felipeagger reopened this Nov 6, 2024
@k8s-ci-robot k8s-ci-robot added size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. and removed size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. labels Nov 6, 2024
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: felipeagger
Once this PR has been reviewed and has the lgtm label, please ask for approval from tallclair. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@felipeagger felipeagger force-pushed the feat/add-updatepodsandbox-cri-method branch from 6d9a16d to e9214cb Compare November 6, 2024 12:26
@felipeagger
Author

@tallclair I made the adjustments. Can you check, please?

@@ -210,6 +210,15 @@ func (m *kubeGenericRuntimeManager) generateContainerResources(pod *v1.Pod, cont
}
}

// generateUpdatePodSandboxResourcesRequest generates platform specific (linux) pod sandbox resources config for runtime
func (m *kubeGenericRuntimeManager) generateUpdatePodSandboxResourcesRequest(sandboxID string, pod *v1.Pod) *runtimeapi.UpdatePodSandboxResourcesRequest {

@tallclair This function cannot be inlined because convertOverheadToLinuxResources and calculateSandboxResources are platform-specific, and inlining them breaks the typechecker.

@k8s-ci-robot
Contributor

k8s-ci-robot commented Nov 6, 2024

@felipeagger: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| pull-kubernetes-e2e-gce-cos-alpha-features | 8d66773 | link | false | /test pull-kubernetes-e2e-gce-cos-alpha-features |
| pull-kubernetes-kind-dra-all | 8d66773 | link | false | /test pull-kubernetes-kind-dra-all |
| pull-kubernetes-node-e2e-crio-cgrpv1-dra | 8d66773 | link | false | /test pull-kubernetes-node-e2e-crio-cgrpv1-dra |
| pull-kubernetes-kind-dra | 8d66773 | link | false | /test pull-kubernetes-kind-dra |
| pull-kubernetes-e2e-gci-gce-ingress | 8d66773 | link | false | /test pull-kubernetes-e2e-gci-gce-ingress |
| pull-kubernetes-node-e2e-crio-cgrpv2-dra-kubetest2 | 8d66773 | link | false | /test pull-kubernetes-node-e2e-crio-cgrpv2-dra-kubetest2 |
| pull-kubernetes-e2e-kind-kms | 8d66773 | link | false | /test pull-kubernetes-e2e-kind-kms |
| pull-kubernetes-conformance-image-test | 8d66773 | link | false | /test pull-kubernetes-conformance-image-test |
| pull-kubernetes-e2e-kind-ipvs | 8d66773 | link | false | /test pull-kubernetes-e2e-kind-ipvs |
| pull-kubernetes-e2e-gce-network-policies | 8d66773 | link | false | /test pull-kubernetes-e2e-gce-network-policies |
| pull-kubernetes-local-e2e | 8d66773 | link | false | /test pull-kubernetes-local-e2e |
| pull-kubernetes-node-e2e-containerd-1-7-dra | 8d66773 | link | false | /test pull-kubernetes-node-e2e-containerd-1-7-dra |
| pull-kubernetes-node-e2e-crio-cgrpv1-dra-kubetest2 | 8d66773 | link | false | /test pull-kubernetes-node-e2e-crio-cgrpv1-dra-kubetest2 |
| pull-kubernetes-e2e-gci-gce-ipvs | 8d66773 | link | false | /test pull-kubernetes-e2e-gci-gce-ipvs |
| pull-kubernetes-e2e-kind-nftables | 8d66773 | link | false | /test pull-kubernetes-e2e-kind-nftables |
| pull-kubernetes-node-e2e-crio-cgrpv2-dra | 8d66773 | link | false | /test pull-kubernetes-node-e2e-crio-cgrpv2-dra |
| pull-kubernetes-e2e-gce-storage-slow | 8d66773 | link | false | /test pull-kubernetes-e2e-gce-storage-slow |
| pull-kubernetes-cross | 8d66773 | link | false | /test pull-kubernetes-cross |
| check-dependency-stats | 8d66773 | link | false | /test check-dependency-stats |
| pull-kubernetes-e2e-gce-storage-snapshot | 8d66773 | link | false | /test pull-kubernetes-e2e-gce-storage-snapshot |
| pull-publishing-bot-validate | 8d66773 | link | false | /test pull-publishing-bot-validate |
| pull-kubernetes-e2e-storage-kind-disruptive | 8d66773 | link | false | /test pull-kubernetes-e2e-storage-kind-disruptive |
| pull-kubernetes-e2e-gce-csi-serial | 8d66773 | link | false | /test pull-kubernetes-e2e-gce-csi-serial |
| pull-kubernetes-e2e-capz-windows-master | 27bfa7e | link | false | /test pull-kubernetes-e2e-capz-windows-master |

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.


@k8s-triage-robot

This PR may require API review.

If so, when the changes are ready, complete the pre-review checklist and request an API review.

Status of requested reviews is tracked in the API Review project.

@fedebongio
Contributor

/remove-sig api-machinery

@k8s-ci-robot k8s-ci-robot removed the sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. label Nov 7, 2024
@pohly
Contributor

pohly commented Nov 19, 2024

/remove-wg device-management

@k8s-ci-robot k8s-ci-robot removed the wg/device-management Categorizes an issue or PR as relevant to WG Device Management. label Nov 19, 2024
Labels
area/apiserver area/cloudprovider area/code-generation area/conformance Issues or PRs related to kubernetes conformance tests area/dependency Issues or PRs related to dependency changes area/e2e-test-framework Issues or PRs related to refactoring the kubernetes e2e test framework area/ipvs area/kube-proxy area/kubeadm area/kubectl area/kubelet area/provider/gcp Issues or PRs related to gcp provider area/release-eng Issues or PRs related to the Release Engineering subproject area/test cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/api-change Categorizes issue or PR as related to adding, removing, or otherwise changing an API kind/feature Categorizes issue or PR as related to a new feature. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. release-note-none Denotes a PR that doesn't merit a release note. sig/apps Categorizes an issue or PR as relevant to SIG Apps. sig/architecture Categorizes an issue or PR as relevant to SIG Architecture. sig/auth Categorizes an issue or PR as relevant to SIG Auth. sig/cli Categorizes an issue or PR as relevant to SIG CLI. sig/cloud-provider Categorizes an issue or PR as relevant to SIG Cloud Provider. sig/cluster-lifecycle Categorizes an issue or PR as relevant to SIG Cluster Lifecycle. sig/etcd Categorizes an issue or PR as relevant to SIG Etcd. sig/instrumentation Categorizes an issue or PR as relevant to SIG Instrumentation. sig/network Categorizes an issue or PR as relevant to SIG Network. sig/node Categorizes an issue or PR as relevant to SIG Node. sig/release Categorizes an issue or PR as relevant to SIG Release. sig/scheduling Categorizes an issue or PR as relevant to SIG Scheduling. sig/storage Categorizes an issue or PR as relevant to SIG Storage. 
sig/testing Categorizes an issue or PR as relevant to SIG Testing. sig/windows Categorizes an issue or PR as relevant to SIG Windows. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.
Projects
Status: In Progress
Status: !SIG Auth
Status: In Progress
Status: Archive-it
Status: No status
Development

Successfully merging this pull request may close these issues.

[FG:InPlacePodVerticalScaling] Add UpdatePodSandboxResources CRI method
10 participants