In your example, the vCPU counts detected on the host and by Kubernetes match, which is expected. That count is half of what EC2 quotes, which is also what SMT-enabled OSes report.
Related: In the past, Kubernetes detection did mismatch the host's vCPU count on SMT-disabled systems. That was fixed upstream; kubernetes/kubernetes#91795 has some tangential reading.
As always, folks can use snippets to add systemd units (at your own risk). Fedora CoreOS treating a t3.medium as having 1 vCPU isn't flatly incorrect, just a different choice.
Description
On Fedora CoreOS on AWS, nodes report half the CPU cores compared to the actual vCPUs offered by the instance. Running nproc on the host returns half the expected value, and Kubernetes nodes, via /proc/cpuinfo, report half the expected CPU capacity.
Steps to Reproduce
Deploy a Fedora CoreOS cluster on AWS with any instance type with 2 or more vCPUs (e.g. t3.medium with 2). Then:
Result is, given a 2 node cluster:
The scheduler observes this capacity and will not schedule a pod requesting 1100m CPU.
Expected behavior
Each node should report 2 CPUs.
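The arithmetic behind the scheduling failure can be sketched roughly as follows. This is only an illustration: the helper names to_millicores and fits are made up for this example, and the real scheduler compares requests against allocatable CPU minus what is already requested on the node, not raw capacity.

```shell
#!/usr/bin/env bash
# Hypothetical helpers, for illustration only (not part of Kubernetes).

# Convert a Kubernetes CPU quantity ("2", "1100m") to millicores.
# Only whole cores and the "m" suffix are handled; values like "0.5" are not.
to_millicores() {
  local q="$1"
  case "$q" in
    *m) echo "${q%m}" ;;
    *)  echo $(( q * 1000 )) ;;
  esac
}

# Succeed if the request (first argument) fits within the CPU amount
# given as the second argument.
fits() {
  local request capacity
  request=$(to_millicores "$1")
  capacity=$(to_millicores "$2")
  [ "$request" -le "$capacity" ]
}

# Compare the 1100m request against the reported (1) and expected (2) CPU counts.
for cpus in 1 2; do
  if fits 1100m "$cpus"; then
    echo "1100m fits on a node reporting $cpus CPU(s)"
  else
    echo "1100m does not fit on a node reporting $cpus CPU(s)"
  fi
done
```

With 1 reported CPU (1000m) the 1100m request cannot fit on any node, while with 2 reported CPUs it would fit as far as raw capacity is concerned.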
Environment
Possible Solution
This probably applies to other cloud providers, but I've only reproduced it on AWS. It seems like it might not affect bare metal.
This issue has been discussed in the CoreOS issue tracker:
coreos/fedora-coreos-tracker#413
coreos/fedora-coreos-tracker#181
Running the following disables Fedora CoreOS's default restriction on simultaneous multithreading (SMT) and results in the expected node capacity:
The current official recommendation is to run this as a systemd unit:
https://docs.fedoraproject.org/be/fedora-coreos/kernel-args/
I've tested this, and adding the following unit config fixes the capacity:
Undoing this default has potentially important implications, but it's the only way I can find to ensure correct capacity detection.
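To check whether disabled SMT is what halves the count on a given node (for example, on another cloud provider or on bare metal), a read-only diagnostic along these lines may help. It is a sketch, not an official tool: the function names smt_state and has_nosmt are invented here, and it assumes a kernel that exposes /sys/devices/system/cpu/smt/control and a coreutils nproc. It changes nothing on the host.

```shell
#!/usr/bin/env bash
# Read-only diagnostic sketch; function names are hypothetical.

# Print the kernel's SMT control state (e.g. "on", "off", "forceoff",
# "notsupported", "notimplemented"), or "unknown" if the sysfs file is absent.
smt_state() {
  local control="${1:-/sys/devices/system/cpu/smt/control}"
  if [ -r "$control" ]; then
    cat "$control"
  else
    echo "unknown"
  fi
}

# Succeed if the given kernel command line file contains a nosmt option,
# either standalone or as part of "mitigations=auto,nosmt".
has_nosmt() {
  local cmdline="${1:-/proc/cmdline}"
  grep -Eq '(^|[ ,])nosmt([ ,=]|$)' "$cmdline" 2>/dev/null
}

echo "online CPUs (nproc):          $(nproc)"
echo "installed CPUs (nproc --all): $(nproc --all)"
echo "SMT control:                  $(smt_state)"
if has_nosmt; then
  echo "kernel command line contains a nosmt option"
else
  echo "no nosmt option found on the kernel command line"
fi
```

If the online count is lower than the installed count and the SMT control state is off or forceoff, the sibling threads are most likely offline rather than missing.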