By default, libvirt runs QEMU with a CPU model that doesn't support nested virtualization. You can change this behavior by adding the VirtletCPUModel: host-model annotation to the pod definition. You can also set the cpuModel value in the Virtlet config to override the CPU model globally for the cluster or for a particular subset of nodes.
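A minimal sketch of a pod definition carrying this annotation follows. Only the VirtletCPUModel: host-model line comes from the text above; the pod name, the image, and the kubernetes.io/target-runtime annotation are assumptions added for illustration and may differ in your deployment.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: nested-virt-vm                            # hypothetical pod name
  annotations:
    kubernetes.io/target-runtime: virtlet.cloud   # assumption: routes the pod to Virtlet
    VirtletCPUModel: host-model                   # expose the host CPU model to the guest
spec:
  containers:
  - name: nested-virt-vm
    image: virtlet.cloud/cirros                   # assumption: any VM image would do here
```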
If you are familiar with the cpu part of the libvirt domain definition, you can use the VirtletLibvirtCPUSetting annotation. Its value is translated from YAML and passed directly to libvirt. It is more flexible than VirtletCPUModel because it allows a more detailed configuration.
For example:
annotations:
  VirtletLibvirtCPUSetting: |
    mode: custom
    model:
      value: Westmere
    features:
    - name: avx
      policy: disable
See cpuSetting for a full example.
Kubelet uses cAdvisor to collect metrics about running containers, but Virtlet doesn't create a container for each VM; instead, it spawns VMs inside the Virtlet container. As a result, all resource usage is lumped together and ascribed to the Virtlet pod.
1. shares - relative value of CPU time assigned. Not recommended for production use, because the actual performance is hard to predict and depends heavily on the neighboring cgroups.
2. CFS CPU bandwidth control - period and quota - hard limits.
3. Parent_Period/Quota <= Child_1_Period/Quota + .. + Child_N_Period/Quota, where Child_N_Period/Quota <= Parent_Period/Quota.
1. shares are set per container.
2. CFS CPU bandwidth control - period and quota - are set per container.
3. Defaults: if no values are set explicitly, each container gets 2 shares.
1. shares is set per vCPU.
2. period and quota are set per vCPU. Because libvirt imposes limits on each vCPU thread, the actual CPU quota is the quota value from the domain definition times the number of vCPUs. More details on the reasons for libvirt's per-vCPU cgroup approach can be found at https://www.redhat.com/archives/libvir-list/2015-June/msg00923.html.
3. emulator_period and emulator_quota denote the limits for emulator threads (those excluding vCPUs). For domains without limits, benchmarks show that these activities may account for 40-80% of the overall physical CPU usage of the QEMU/KVM process running the guest VM.
4. vCPUs per VM - it's commonly recommended to set the vCPU count to 1 (see details in section "CPU overcommit" below).
5. Defaults: if no values are set explicitly, each domain gets 1024 shares.
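As a worked illustration of the per-vCPU quota rule above, take a domain with 2 vCPUs whose definition sets period to 100000 and quota to 50000. Libvirt expresses both values in microseconds; the numbers themselves are assumed purely for illustration.

$$
\text{actual CPU quota} = \text{quota} \times N_{\text{vCPU}} = 50000\,\mu\text{s} \times 2 = 100000\,\mu\text{s per period}
$$

Each vCPU thread may run for half of every 100000 µs period, so the two threads together can consume up to one full physical CPU.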
The Linux scheduler is reported not to perform well under CPU overcommitment. Unless there is a real need for more (for example, a multi-core VM used for builds/compilation, or an application inside the VM that was designed for parallel processing and can effectively use multiple cores), it is widely recommended to use one vCPU per VM; otherwise you can expect performance degradation.
It is not recommended to have more than 10 virtual CPUs per physical processor core. Any number of overcommitted virtual CPUs above the number of physical processor cores may cause problems with certain virtualized guests, so it is always up to cluster administrators to decide how many vCPUs to assign to each VM.
See more considerations on KVM limitations
- By default, all VMs are created with 1 vCPU. To change the number of vCPUs for a VM pod, add the VirtletVCPUCount annotation with the desired number; see examples/cirros-vm.yaml.
- Following point 2 in "Libvirt CPU Allocation", Virtlet spreads the assigned CPU resource limit equally among the VM's vCPU threads.
- According to point 3 in "Libvirt CPU Allocation", Virtlet must set limits for emulator threads (those excluding vCPUs). Virtlet doesn't currently support setting these values, but there are plans to fix this in the future.
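The vCPU settings above can be sketched as a pod definition. VirtletVCPUCount comes from the text; the pod name, image, target-runtime annotation, and the specific CPU limit are assumptions chosen for illustration, and examples/cirros-vm.yaml remains the authoritative example.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: two-vcpu-vm                               # hypothetical pod name
  annotations:
    kubernetes.io/target-runtime: virtlet.cloud   # assumption: routes the pod to Virtlet
    VirtletVCPUCount: "2"                         # annotation values are strings, hence the quotes
spec:
  containers:
  - name: two-vcpu-vm
    image: virtlet.cloud/cirros                   # assumption: any VM image would do here
    resources:
      limits:
        cpu: "1"                                  # illustrative limit, spread equally over the vCPU threads
```

With these illustrative values, spreading the limit equally would give each of the two vCPU threads about half a CPU.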
Setting the memory limit to 0 or omitting it means there is no memory limit for the container. K8s doesn't support swap on the nodes (for example, k8s creates Docker containers with --memory-swappiness=0; see kubernetes/kubernetes#7294 for more).
- memory - RAM allocated at VM boot.
- memtune=>hard_limit - cgroup memory limit on the whole domain, including the memory used by QEMU itself. However, it is claimed that such a limit must be set accurately.
- Swap is unlimited by default.
Memory overcommit can reach ~150% of the physical RAM. This relies on the assumption that most processes do not access 100% of their allocated memory all the time, so you can grant guest VMs more RAM than is actually available on the host. However, this strongly depends on the swap space available on the node and on the memory consumption of the VM workloads.
For more details check Overcommitting with KVM
- By default, each VM is assigned 1GB of RAM. To set a different value, set the memory resource limit for the container; see examples/cirros-vm.yaml.
- Virtlet generates domain XML with memoryBacking=locked setting to prevent swapping out domain's pages.
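A minimal sketch of overriding the default RAM size through the container's memory limit. The container name, image, and the 2Gi value are assumptions for illustration; only the use of the memory resource limit comes from the text above.

```yaml
spec:
  containers:
  - name: cirros-vm                 # hypothetical container name
    image: virtlet.cloud/cirros     # assumption: any VM image would do here
    resources:
      limits:
        memory: 2Gi                 # RAM assigned to the VM instead of the 1GB default
```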
- Implement CRI container stats methods for Virtlet.
- According to points 2 and 3 in "Libvirt CPU Allocation", we need to devise a rule for spreading the CFS CPU bandwidth limit among QEMU and vCPU threads, so that the k8s scheduler makes the right assumptions about the resources allocated on the node.
- Research how to configure hard memory limits for a VM pod.