Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Carry #3205] libct/cg: add CFS bandwidth burst for CPU #3749

Merged
merged 1 commit into from
Sep 6, 2023

Conversation

Zheaoli
Copy link
Contributor

@Zheaoli Zheaoli commented Feb 23, 2023

Carry #3205

@Zheaoli Zheaoli marked this pull request as draft February 23, 2023 10:39
@Zheaoli
Copy link
Contributor Author

Zheaoli commented Mar 3, 2023

Seems the systemd is not support the CPU burst feature yet. systemd/systemd#26658

So should we keep pushing this PR or hold on it utill the feature has been supported in systemd @AkihiroSuda

@AkihiroSuda
Copy link
Member

Seems the systemd is not support the CPU burst feature yet. systemd/systemd#26658

So should we keep pushing this PR or hold on it utill the feature has been supported in systemd @AkihiroSuda

You can just let fs2 handle it

// Ignore the unknown resource here -- will still be
// applied in Set which calls fs2.Set.
logrus.Debugf("don't know how to convert unified resource %q=%q to systemd unit property; skipping (will still be applied to cgroupfs)", k, v)

@Zheaoli
Copy link
Contributor Author

Zheaoli commented Mar 4, 2023

Seems the systemd is not support the CPU burst feature yet. systemd/systemd#26658
So should we keep pushing this PR or hold on it utill the feature has been supported in systemd @AkihiroSuda

You can just let fs2 handle it

// Ignore the unknown resource here -- will still be
// applied in Set which calls fs2.Set.
logrus.Debugf("don't know how to convert unified resource %q=%q to systemd unit property; skipping (will still be applied to cgroupfs)", k, v)

Got it

@Zheaoli
Copy link
Contributor Author

Zheaoli commented Mar 10, 2023

Seems the systemd is not support the CPU burst feature yet. systemd/systemd#26658
So should we keep pushing this PR or hold on it utill the feature has been supported in systemd @AkihiroSuda

You can just let fs2 handle it

// Ignore the unknown resource here -- will still be
// applied in Set which calls fs2.Set.
logrus.Debugf("don't know how to convert unified resource %q=%q to systemd unit property; skipping (will still be applied to cgroupfs)", k, v)

But the burst feature supports both cv1 and cv2. hmmm I think maybe I can limit it should be only being used in non-systemd driver? @AkihiroSuda

@Zheaoli Zheaoli force-pushed the manjusaka/carry-burst branch 5 times, most recently from d3e5725 to 5c01766 Compare March 19, 2023 15:44
@Zheaoli
Copy link
Contributor Author

Zheaoli commented Mar 19, 2023

@AkihiroSuda Hi, I have written the test, but the rootless test failed. I have no idea what happened. would you mind take a look at for me ?

@Zheaoli
Copy link
Contributor Author

Zheaoli commented Mar 19, 2023

https://github.com/opencontainers/runc/actions/runs/4462135582/jobs/7836459757?pr=3749
https://github.com/opencontainers/runc/actions/runs/4462008272/jobs/7836249615?pr=3749

It seems strange why no directory specified for cpu.cfs_burst_us has been raised but not for other values like period and quota(both of them shared the same path

@Zheaoli Zheaoli marked this pull request as ready for review March 19, 2023 19:03
@Zheaoli
Copy link
Contributor Author

Zheaoli commented Mar 19, 2023

fine, I figured out the reason, the rootless CI has been passed

@Zheaoli Zheaoli force-pushed the manjusaka/carry-burst branch 2 times, most recently from 5f7c22c to b3833af Compare March 19, 2023 19:22
@Zheaoli
Copy link
Contributor Author

Zheaoli commented Mar 19, 2023

@Zheaoli
Copy link
Contributor Author

Zheaoli commented Mar 21, 2023

@AkihiroSuda ping(would you mind to give me some advise about test failed in Centos?

@AkihiroSuda
Copy link
Member

CI failing

@Zheaoli
Copy link
Contributor Author

Zheaoli commented Apr 27, 2023

@AkihiroSuda It seems like there may be some compatibility issues between CPU Burst and Centos7 and Centos8, which could be causing the CI to fail. Would your mind to give me some suggestion about how to pass all the CI?

@AkihiroSuda
Copy link
Member

It seems like there may be some compatibility issues between CPU Burst and Centos7 and Centos8, which could be causing the CI to fail. Would your mind to give me some suggestion about how to pass all the CI?

I guess you can just check existence of cpu.cfs_burst_us and skip the test when it does not exist?

@Zheaoli
Copy link
Contributor Author

Zheaoli commented Apr 27, 2023

It seems like there may be some compatibility issues between CPU Burst and Centos7 and Centos8, which could be causing the CI to fail. Would your mind to give me some suggestion about how to pass all the CI?

I guess you can just check existence of cpu.cfs_burst_us and skip the test when it does not exist?

Seems good, I'll make a try! Thanks a lot!

@lifubang
Copy link
Member

👍 Please use ‘git rebase main’ to catch up with the latest change, we should not have a commit named ‘merge main’. Thanks.

@Zheaoli
Copy link
Contributor Author

Zheaoli commented Aug 18, 2023

@lifubang Thanks for the review, all things done

@Zheaoli Zheaoli force-pushed the manjusaka/carry-burst branch 2 times, most recently from d1d66db to 7a152a3 Compare August 25, 2023 10:07
update.go Outdated Show resolved Hide resolved
libcontainer/cgroups/fs/cpu.go Outdated Show resolved Hide resolved
libcontainer/configs/cgroup_linux.go Outdated Show resolved Hide resolved
@lifubang
Copy link
Member

I'm so sorry to let you spend more time on this. We need to cover as many test cases as possible. ☕

@Zheaoli
Copy link
Contributor Author

Zheaoli commented Aug 26, 2023

We need to cover as many test cases as possible. ☕

I think we already have enough test cases here. May I ask what's kind of the corner case in your thought not covered by current test code?

@lifubang
Copy link
Member

May I ask what's kind of the corner case in your thought not covered by current test code?

For example, how to clear cpu-burst value in update after the container started with a cpu-burst value?

Copy link
Member

@lifubang lifubang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@lifubang
Copy link
Member

Left some further works:

  1. No unit test for cpu controller in cgroupv2;
  2. There are some different results for cgroup v1 and v2, for example:
runc update --cpu-burst 30001 --cpu-quota 30000 test

v1: fail, v2: succ
Because for v2, the kernel write seconds to cpu.max.burst, but not milliseconds.
3. There is no integration test for cpu-burst when create/run a container.

We can do these works in further PRs.

@lifubang
Copy link
Member

@AkihiroSuda PTAL. I think this could be merged.

Burstable CFS controller is introduced in Linux 5.14. This helps with
parallel workloads that might be bursty. They can get throttled even
when their average utilization is under quota. And they may be latency
sensitive at the same time so that throttling them is undesired.

This feature borrows time now against the future underrun, at the cost
of increased interference against the other system users, by introducing
cfs_burst_us into CFS bandwidth control to enact the cap on unused
bandwidth accumulation, which will then used additionally for burst.

The patch adds the support/control for CFS bandwidth burst.

runtime-spec: opencontainers/runtime-spec#1120

Co-authored-by: Akihiro Suda <[email protected]>
Co-authored-by: Nadeshiko Manju <[email protected]>
Signed-off-by: Kailun Qin <[email protected]>
@AkihiroSuda AkihiroSuda merged commit 6b8a45d into opencontainers:main Sep 6, 2023
36 checks passed
@Zheaoli Zheaoli deleted the manjusaka/carry-burst branch September 9, 2023 13:57
@cyphar cyphar mentioned this pull request Mar 14, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants