Fedora CoreOS nodes report half of actual CPU capacity #902

Closed
bendrucker opened this issue Dec 4, 2020 · 2 comments

@bendrucker
Contributor

Description

On Fedora CoreOS on AWS, nodes report half as many CPUs as the vCPUs the instance type offers. Running nproc on the host returns half the expected value, and Kubernetes nodes, via /proc/cpuinfo, report half the expected CPU capacity.
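
A minimal sketch for inspecting this on a host, assuming a kernel recent enough to expose SMT state under /sys/devices/system/cpu/smt (the helper name `mitigations_arg` is hypothetical, introduced only for illustration):

```shell
#!/bin/sh
# Extract the mitigations= argument from a kernel command line string.
# Prints nothing and returns non-zero if the argument is absent.
mitigations_arg() {
    printf '%s\n' "$1" | grep -o 'mitigations=[^ ]*'
}

smt_dir=/sys/devices/system/cpu/smt

# Number of logical CPUs currently available to this process.
echo "online logical CPUs (nproc): $(nproc)"

# SMT state as exposed by the kernel, if this kernel provides it.
if [ -r "$smt_dir/control" ]; then
    echo "SMT control: $(cat "$smt_dir/control")"
    echo "SMT active:  $(cat "$smt_dir/active")"
else
    echo "SMT state not exposed under $smt_dir"
fi

# Show the mitigations kernel argument, if one was passed at boot.
mitigations_arg "$(cat /proc/cmdline)" || echo "no mitigations= argument on the kernel command line"
```

Running it before and after the kernel argument change described under Possible Solution should show whether the halved count comes from SMT being switched off rather than from a detection bug.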

Steps to Reproduce

Deploy a Fedora CoreOS cluster on AWS using any instance type with 2 or more vCPUs (e.g. t3.medium, which has 2). Then:

kubectl get node -o json | jq -r '.items[] | .status.capacity.cpu'

Result, given a 2-node cluster:

1
1

The scheduler observes this capacity and will not schedule a pod requesting 1100m CPU.
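
The arithmetic behind that can be sketched as follows. This is an illustration only: the function name `cpu_fits` is hypothetical, and the real scheduler compares requests against allocatable CPU, which can be lower than capacity, so this is the best case.

```shell
#!/bin/sh
# Kubernetes expresses CPU in millicores: 1 CPU = 1000m.
# Usage: cpu_fits CAPACITY_CPUS REQUEST_MILLICORES
cpu_fits() {
    capacity_m=$(( $1 * 1000 ))
    if [ "$2" -le "$capacity_m" ]; then
        echo "fits: ${2}m <= ${capacity_m}m"
    else
        echo "does not fit: ${2}m > ${capacity_m}m"
    fi
}

cpu_fits 1 1100   # capacity reported in this issue
cpu_fits 2 1100   # capacity the reporter expected
```

The allocatable value for each node is available as .status.allocatable.cpu in the same kubectl output.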

Expected behavior

2
2

Each node should report 2 CPUs.

Environment

Possible Solution

This probably applies to other cloud providers, but I've only reproduced it on AWS. It seems like it might not affect bare metal.

This issue has been discussed in the CoreOS issue tracker:

coreos/fedora-coreos-tracker#413
coreos/fedora-coreos-tracker#181

Running the following removes the mitigations kernel argument, which lifts Fedora CoreOS's default simultaneous multithreading (SMT) restriction and results in the expected node capacity:

rpm-ostree kargs --delete mitigations --reboot

The current official recommendation is to run this as a systemd unit:

https://docs.fedoraproject.org/be/fedora-coreos/kernel-args/

I've tested this and adding the following unit config fixes the capacity:

systemd:
  units:
    - name: enable-smt.service
      enabled: true
      contents: |
        # https://docs.fedoraproject.org/be/fedora-coreos/kernel-args/
        [Unit]
        Description=Enable simultaneous multithreading
        Before=kubelet.service
        # We run after `systemd-machine-id-commit.service` to ensure that
        # `ConditionFirstBoot=true` services won't rerun on the next boot.
        After=systemd-machine-id-commit.service
        ConditionKernelCommandLine=mitigations
        
        [Service]
        Type=oneshot
        RemainAfterExit=yes
        ExecStart=/bin/rpm-ostree kargs --delete mitigations --reboot

        [Install]
        RequiredBy=kubelet.service
        WantedBy=multi-user.target

Undoing this default has potentially important security implications, but it's the only way I can find to ensure correct capacity detection.

@dghubble
Member

dghubble commented Dec 4, 2020

Fedora CoreOS has SMT disabled on certain platforms. I'm fine with those defaults and their judgement. You should raise it at https://github.com/coreos/fedora-coreos-tracker if you think the default protections are no longer needed.

In your example, the vCPU counts detected on the host and by Kubernetes match, which is expected. That count is half of what EC2 quotes, and half of what an SMT-enabled OS would show.

Related: In the past, Kubernetes detection did mismatch the host's vCPU count on SMT-disabled systems. That was fixed upstream, but kubernetes/kubernetes#91795 has some tangential reading.

dghubble closed this as completed Dec 4, 2020
@dghubble
Member

dghubble commented Dec 4, 2020

As always, folks can use snippets to add systemd units (at your own risk). Fedora CoreOS treating a t3.medium as having 1 vCPU isn't flatly incorrect; it's just a different choice.
