Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

neonvm: introduce CPU sysfs state scaling flow based on the vmSpec.cpuScalingMode #1111

Merged
merged 18 commits into from
Nov 27, 2024

Conversation

mikhail-sakhnov
Copy link
Contributor

@mikhail-sakhnov mikhail-sakhnov commented Oct 14, 2024

Introduce separate CPU scaling flow based on the vmSpec.cpuScalingMode

If vmSpec.cpuScalingMode is equal to qmpScaling the logic of the
scaling is preserved as before:

  • Scale, if required the amount of CPUs using qmp commands.
  • If it is required to scale cgroups, call vm-runner /cpu_change endpoint

if vmSpec.cpuScalingMode is equal to sysfsScaling all cpu
scaling requests go directly to the vm-runner /cpu_change, which in
that configuration goes to the neonvm-daemon to reconcile required
amount of online CPUs.

Value cpuSysfsStateScaling also modifies the qemu and the kernel
arguments to enable plug all CPUs but mark as online only first one.

@mikhail-sakhnov mikhail-sakhnov force-pushed the misha/cpu-scaling branch 4 times, most recently from 85608dd to dfe79f1 Compare October 14, 2024 13:47
@mikhail-sakhnov mikhail-sakhnov changed the title Misha/cpu scaling neonvm-controller, neonvm-runner, neonvm-daemon: introduce CPU sysfs state scaling flow based on the vmSpec.cpuScalingMode Oct 14, 2024
@mikhail-sakhnov mikhail-sakhnov changed the title neonvm-controller, neonvm-runner, neonvm-daemon: introduce CPU sysfs state scaling flow based on the vmSpec.cpuScalingMode neonvm: introduce CPU sysfs state scaling flow based on the vmSpec.cpuScalingMode Oct 14, 2024
@mikhail-sakhnov mikhail-sakhnov force-pushed the misha/cpu-scaling branch 3 times, most recently from 94bdcba to 002f978 Compare October 14, 2024 13:57
@mikhail-sakhnov mikhail-sakhnov force-pushed the misha/cpu-scaling branch 3 times, most recently from 508920e to e48f902 Compare October 14, 2024 18:32
@mikhail-sakhnov mikhail-sakhnov marked this pull request as ready for review October 14, 2024 18:36
@sharnoff sharnoff self-assigned this Oct 14, 2024
@mikhail-sakhnov
Copy link
Contributor Author

FYI, @sharnoff
We discussed with @Omrigan that it would be great to reflect the currently used cpuScalingMode in status or annotations for better observability plus probably we want to restart pod if the cpuScalingMode was changed - that actually would allow us to perform gradual testing manually without switching default mode.

Copy link
Member

@sharnoff sharnoff left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Partial review, haven't yet gotten to the neonvm-controller changes

neonvm-controller/cmd/main.go Outdated Show resolved Hide resolved
Makefile Outdated Show resolved Hide resolved
neonvm-controller/deployment.yaml Outdated Show resolved Hide resolved
neonvm-daemon/pkg/cpuscaling/sys_fs_state_scaling.go Outdated Show resolved Hide resolved
neonvm-daemon/pkg/cpuscaling/sys_fs_state_scaling.go Outdated Show resolved Hide resolved
neonvm-runner/cmd/main.go Outdated Show resolved Hide resolved
neonvm-runner/cmd/main.go Outdated Show resolved Hide resolved
neonvm-runner/cmd/main.go Outdated Show resolved Hide resolved
neonvm/apis/neonvm/v1/virtualmachine_types.go Show resolved Hide resolved
pkg/api/types.go Outdated Show resolved Hide resolved
Copy link
Contributor

@Omrigan Omrigan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A lot of suggestions around naming, and other.

neonvm-controller/cmd/main.go Outdated Show resolved Hide resolved
neonvm/apis/neonvm/v1/virtualmachine_types.go Outdated Show resolved Hide resolved
pkg/neonvm/controllers/vm_controller.go Outdated Show resolved Hide resolved
pkg/neonvm/controllers/vm_controller.go Outdated Show resolved Hide resolved
pkg/neonvm/controllers/vm_controller_handle_cpu_scaling.go Outdated Show resolved Hide resolved
pkg/neonvm/cpuscaling/sysfsscaling.go Outdated Show resolved Hide resolved
neonvm-daemon/pkg/cpuscaling/sys_fs_state_scaling.go Outdated Show resolved Hide resolved
neonvm-runner/cmd/main.go Outdated Show resolved Hide resolved
neonvm-daemon/cmd/main.go Show resolved Hide resolved
neonvm-runner/cmd/main.go Show resolved Hide resolved
Copy link
Member

@sharnoff sharnoff left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great! 🚀

Left just a couple final comments

pkg/neonvm/cpuscaling/cpuscaler.go Outdated Show resolved Hide resolved
pkg/neonvm/cpuscaling/sysfs.go Outdated Show resolved Hide resolved
@mikhail-sakhnov mikhail-sakhnov force-pushed the misha/cpu-scaling branch 2 times, most recently from d0af69f to fe10eef Compare November 20, 2024 16:03
@sharnoff sharnoff assigned mikhail-sakhnov and unassigned sharnoff Nov 21, 2024
mikhail-sakhnov and others added 16 commits November 26, 2024 13:12
…xplicitly default the field value if it is not set

Signed-off-by: Misha Sakhnov <[email protected]>
…nd mark vmSpec.Guest.CPUs.Min as online during the boot

Signed-off-by: Misha Sakhnov <[email protected]>
introduce separate CPU scaling flow based on the vmSpec.cpuScalingMode

If vmSpec.cpuScalingMode is equal to `qmp_scaling` the logic of the
scaling is preserved as before:

- Scale, if required the amount of CPUs using qmp commands.
- If it is required to scale cgroups, call vm-runner /cpu_change endpoint

if vmSpec.cpuScalingMode is equal to `cpuSysfsStateScaling` all cpu
scaling requests go directly to the vm-runner /cpu_change, which in
that configuration goes to the neonvm-daemon to reconcile required
amount of online CPUs.

Value `cpuSysfsStateScaling` also modifies the qemu and the kernel
arguments to enable plug all CPUs but mark as online only first one.

Signed-off-by: Misha Sakhnov <[email protected]>
pass cpuScalingMode as argument to the vm-runner
rename arguments, constants and functions here and there
drop unused code
move default cpu scaling mode to controller argument

Signed-off-by: Misha Sakhnov <[email protected]>
change CpuScalingMode to be a enum type
drop obsolete comments

Signed-off-by: Misha Sakhnov <[email protected]>
Improve readability in pkg/neonvm/controllers/vm_controller_cpu_scaling.go
by adding empty lines and comments

Signed-off-by: Misha Sakhnov <[email protected]>
Change scaling logic to work with ranges instead of slices
Split sys fs scaling struture to allow unit testing
Add unit tests for the main reconcilation logic

Signed-off-by: Misha Sakhnov <[email protected]>
Get current state through aggregated /cpu/online file
instead of using per-cpu files

Signed-off-by: Misha Sakhnov <[email protected]>
Co-authored-by: Em Sharnoff <[email protected]>
Signed-off-by: Misha Sakhnov <[email protected]>
Change cpuscaling logic to work with collection of IDs

Signed-off-by: Misha Sakhnov <[email protected]>
Copy link
Contributor

@Omrigan Omrigan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, one suggestion.

neonvm-runner/cmd/main.go Outdated Show resolved Hide resolved
@Omrigan Omrigan enabled auto-merge (squash) November 27, 2024 11:19
@Omrigan Omrigan merged commit 0bc5fe1 into main Nov 27, 2024
22 checks passed
@Omrigan Omrigan deleted the misha/cpu-scaling branch November 27, 2024 14:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants