kubectl using 1200% CPU on MacOS 14.4.1 #1668

Open
philippefutureboy opened this issue Oct 8, 2024 · 6 comments
Labels
kind/bug Categorizes issue or PR as related to a bug. kind/support Categorizes issue or PR as a support question. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one.

Comments

@philippefutureboy

What happened:

I always keep a terminal open with watch "kubectl get pods" while I work, so that I can see the status of my remote cluster at a glance.
I noticed today while working that my computer was sluggish. When I looked in Activity Monitor, kubectl was running at 1200% CPU usage (12 full CPU cores), with low memory usage. At that time, watch "kubectl get pods" had been running for 5d 14h, polling state every 2s whenever my laptop was not in sleep mode.
I killed the command watch "kubectl get pods" and the process successfully exited, releasing the CPU load.

What you expected to happen:

kubectl should not consume 12 full CPU cores; it only polls once every 2 seconds.

How to reproduce it (as minimally and precisely as possible):

No idea really! Anything I can do to help diagnose this?
The only reason I'm posting here is that high CPU usage like this can be indicative of an exploited security vulnerability, which is why I'm proactively opening this issue.

  • Is there any sort of log that is left by kubectl locally or remotely (our cluster is in GCP K8S)?
  • How do I check the integrity of the program (checksum perhaps?)?

I think my kubectl is packaged directly with gcloud. I'm not sure; how do I check?

Anything else we need to know?:

Environment:

  • Kubernetes client and server versions (use kubectl version):

Client Version: v1.30.4
Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
Server Version: v1.30.4-gke.1348000

  • Cloud provider or hardware configuration: Google Cloud Platform, K8S; locally, MacBook Pro 2019 Intel Core i9
  • OS (e.g: cat /etc/os-release): MacOS 14.4.1 Sonoma
@philippefutureboy philippefutureboy added the kind/bug Categorizes issue or PR as related to a bug. label Oct 8, 2024
@k8s-ci-robot
Contributor

This issue is currently awaiting triage.

SIG CLI takes a lead on issue triage for this repo, but any Kubernetes member can accept issues by applying the triage/accepted label.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot k8s-ci-robot added the needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. label Oct 8, 2024
@ardaguclu
Member

ardaguclu commented Oct 8, 2024

/kind support
I'd recommend passing -v=9 to see what is happening.
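
Because watch redraws the screen on every run, the verbose output from -v=9 is easy to lose. One way to keep it is to append each run's stderr to a log file, along with a timestamp, the exit status, and how long the run took. This is only a minimal sketch: run_logged and kubectl.log are hypothetical names, and it assumes kubectl writes its verbose logging to stderr.

```shell
#!/bin/sh
# run_logged LOGFILE COMMAND [ARGS...]
# Hypothetical helper (not part of kubectl): runs COMMAND, appends its stderr
# to LOGFILE, then appends one summary line with a UTC timestamp, the exit
# status, and the wall-clock duration in whole seconds.
run_logged() {
    log=$1
    shift
    start=$(date +%s)
    "$@" 2>>"$log"
    status=$?
    end=$(date +%s)
    printf '%s exit=%s seconds=%s cmd=%s\n' \
        "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$status" "$((end - start))" "$*" >>"$log"
    return "$status"
}
```

It could replace the watch command with a loop such as: while true; do run_logged kubectl.log kubectl get pods -v=9; sleep 2; done. A run whose seconds value is much larger than usual would be a good place to start reading the log. The log may contain cluster details, so it is worth reviewing before sharing it.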

@k8s-ci-robot k8s-ci-robot added the kind/support Categorizes issue or PR as a support question. label Oct 8, 2024
@philippefutureboy
Author

Thanks @ardaguclu. I've added the flag and will be monitoring CPU usage. If anything happens I'll let you know :)

@brianpursley
Member

> How do I check the integrity of the program (checksum perhaps?)?

The sha512 hash (for the gz) is published in the changelog.
For example, https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.30.md#client-binaries

Something like this should work:

  1. Download the client binaries archive.
  2. Compute the hash of the archive you downloaded to confirm it matches what the changelog says it should be.
  3. Extract the archive.
  4. Compute the hash for the extracted binary (This is the expected hash).
  5. Compute the hash for your local binary and compare to confirm that it matches what you got in step 4.

Example (you will want to use the darwin archive instead of linux-amd64):

~/Downloads $ shasum -a 512 kubernetes-client-linux-amd64.tar.gz 
7551aba20eef3e2fb2076994a1a524b2ea2ecd85d47525845af375acf236b8afd1cd6873815927904fb7d6cf7375cfa5c56cedefad06bf18aa7d6d46bd28d287  kubernetes-client-linux-amd64.tar.gz
~/Downloads $ tar xvf kubernetes-client-linux-amd64.tar.gz kubernetes
kubernetes/
kubernetes/client/
kubernetes/client/bin/
kubernetes/client/bin/kubectl
kubernetes/client/bin/kubectl-convert
~/Downloads $ shasum -a 512 kubernetes/client/bin/kubectl
1adba880a67e8ad9aedb82cde90343a6656147e7f331ab2e2293d4bc16a280591bd3b912873f099f11cde2f044d8698b963ed45fadedfe1735d99158e21e44a0  kubernetes/client/bin/kubectl

Then get your local kubectl's hash and compare it...

shasum -a 512 $(which kubectl)

@brianpursley
Member

The interesting thing about this is that kubectl is not running for 5 days, it is being invoked by watch every 2 seconds for 5 days.

In addition to using -v=9 as @ardaguclu suggested...

If it happens again, try doing the following in another terminal, while the problem is occurring, to collect information that might be helpful to diagnose the problem:

ps -F $(pgrep kubectl)
pgrep kubectl | xargs -L1 lsof -p
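
A note for macOS: as far as I know, the -F option belongs to the Linux procps version of ps and is not accepted by the BSD ps that ships with macOS. A portable variant can request specific columns with -o instead. The sketch below is an assumption-laden alternative, not a replacement endorsed in this thread; snapshot_pids is a hypothetical helper name.

```shell
#!/bin/sh
# snapshot_pids PID [PID...]
# Hypothetical helper: for each PID, prints its parent PID, CPU usage,
# elapsed running time, and full command line, followed by its open files
# when lsof is installed. PIDs that no longer exist are skipped.
snapshot_pids() {
    for pid in "$@"; do
        ps -o pid,ppid,pcpu,etime,command -p "$pid" || continue
        if command -v lsof >/dev/null 2>&1; then
            lsof -p "$pid" 2>/dev/null
        fi
    done
    return 0
}
```

Usage while the problem is occurring: snapshot_pids $(pgrep -x kubectl) > kubectl-snapshot.txt. The etime column is the interesting one here: since watch starts a fresh kubectl every 2 seconds, a kubectl process with a long elapsed time would point to a single invocation that never exited, rather than to anything accumulating over 5 days.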

@philippefutureboy
Author

Fantastic @brianpursley, thanks for the additional tips! I will be checking the checksum tomorrow :)

> The interesting thing about this is that kubectl is not running for 5 days, it is being invoked by watch every 2 seconds for 5 days.

I was thinking the same thing. If kubectl had been running for 5 days, maybe some kind of low-level error could lead to more CPU consumption, or some data could accumulate. But one execution every 2 seconds? That shouldn't be an issue.

I'll follow up shortly!
