CPUThrottlingHigh on metrics server (Prometheus alert)
This page covers a special case of the general CPUThrottlingHigh alert: the alert firing on the metrics-server deployment. When the alert occurs elsewhere, see the general case.
If you use Robusta, this alert is automatically analyzed.
The default CPU limits for metrics-server are too low, resulting in CPU starvation. When possible, this should be fixed so your cluster runs more smoothly. There are two important caveats for fixing this:
- metrics-server dynamically updates its CPU limits using the Kubernetes addon-resizer, so you cannot update the CPU limits in the normal way. See the instructions below for how to correctly update the limits.
- You cannot fix this issue by editing the Deployment on GKE. Any changes you make to the metrics-server deployment on GKE are reverted by GCP.
metrics-server does not respect normal CPU limits. To fix this issue, edit the metrics-server Deployment and increase the --cpu parameter for the metrics-server-nanny container (the '--cpu=40m' line in the manifest below). A good value for most clusters is 100m.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: metrics-server-v0.3.6
  namespace: kube-system
spec:
  template:
    spec:
      containers:
        ...
        - command:
            - /pod_nanny
            - '--config-dir=/etc/config'
            - '--cpu=40m'
            - '--extra-cpu=0.5m'
            - '--memory=35Mi'
            - '--extra-memory=4Mi'
            - '--threshold=5'
            - '--deployment=metrics-server-v0.3.6'
            - '--container=metrics-server'
            - '--poll-period=300000'
            - '--estimator=exponential'
            - '--scale-down-delay=24h'
            - '--minClusterSize=5'
            - '--use-metrics=true'
          image: 'gke.gcr.io/addon-resizer:1.8.11-gke.0'
          name: metrics-server-nanny
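As a rough sketch of the change described above, the start of the nanny container's command list would look something like this after the edit. Only the --cpu value differs from the original manifest. 100m is the value suggested on this page, and the right number for a given cluster may differ.

```yaml
# Fragment of the metrics-server-nanny container spec after the edit.
# Only the --cpu flag changes; every other flag stays as it was.
- command:
    - /pod_nanny
    - '--config-dir=/etc/config'
    - '--cpu=100m'        # raised from 40m
    - '--extra-cpu=0.5m'  # unchanged
    # remaining flags unchanged
```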
On GKE, this issue can be fixed only by upgrading your clusters to a more recent Kubernetes version. The real-world impact of this issue is often negligible. To ignore it, change the CPUThrottlingHigh alert in your Prometheus rules to exclude metrics-server.
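One way to express such an exclusion is sketched below. It assumes your CPUThrottlingHigh rule resembles the default shipped with kube-prometheus; the group name, threshold, `for` duration, and severity shown here are assumptions, so compare them against the rule actually deployed in your cluster before copying anything. The container name metrics-server is taken from the --container flag in the manifest on this page.

```yaml
# Sketch only: the expression is assumed to follow the kube-prometheus default
# for CPUThrottlingHigh and may not match your installed rule.
groups:
  - name: kubernetes-resources   # assumed group name
    rules:
      - alert: CPUThrottlingHigh
        expr: |
          sum(increase(container_cpu_cfs_throttled_periods_total{container!="", container!="metrics-server"}[5m])) by (container, pod, namespace)
            /
          sum(increase(container_cpu_cfs_periods_total{container!="metrics-server"}[5m])) by (container, pod, namespace)
            > (25 / 100)
        for: 15m                 # assumed default
        labels:
          severity: info         # assumed default
```

The added `container!="metrics-server"` matcher drops that container's series before the ratio is computed, so the alert keeps firing for every other workload.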
If you use Robusta, no configuration is necessary. This alert is automatically silenced for GKE metrics-server pods.
This page is licensed under the CC BY-SA 4.0 license. You are free to share and re-use this content but you MUST provide a link back to this page and attribute the material to the Robusta open source project.