Metrics-server failing to scrape nodes due to timeout and DNS lookup errors #14708
Comments
Cool, thanks again for the debugging effort. Will get to work on a fix soon.
Hi, thanks for the quick fix.
Hello there, I'm having much the same issue. I have these logs: PS C:\windows\system32> kubectl logs -f metrics-server-58b7f877fc-67txx -n kube-system. metrics-server doesn't seem to see the other nodes in the cluster, although they all have the same configuration and they all have TCP port 10250 configured on the security group. I'll add that we're running metrics-server 0.6.1 and EKS 1.25. I've already applied all the hacks and workarounds mentioned, such as `--metric-resolution`, `--kubelet-preferred-address-types`, and `--kubelet-insecure-tls=true`, and none of them solves the issue. Can anyone here help?
Hi @sichiba, did you figure this out? I am having the same issue on EKS 1.25.
Hi @mbhattrh23, that was actually due to a security group. We figured out that port 10250 must be open on the security group of the EKS cluster (both control plane and data plane).
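For anyone checking the same thing, a quick way to see whether port 10250 is reachable is a plain TCP connect from a machine or pod inside the cluster network. The following is only a rough sketch: the function name `kubelet_port_reachable` is made up for this example, and a successful connect shows only that the firewall or security group lets the traffic through, not that metrics-server can authenticate to the kubelet.

```python
import socket


def kubelet_port_reachable(host: str, port: int = 10250, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused connections, timeouts, and name-resolution failures
        # all surface as OSError subclasses.
        return False
```

Calling `kubelet_port_reachable("<node-address>")` with each node's address in turn should show which nodes are blocked; the address placeholder is hypothetical and needs to be replaced with real node addresses.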
Hi Team,
/kind bug
I'm trying to set up a new cluster on GCE with metrics-server enabled. I'm facing two issues, which are not related to each other.
The first is a firewall problem, which I worked around by adding `tcp:10250` to the `nodes-to-master` firewall rule by hand.
The second is that running `k top no` multiple times gives different results depending on which pod serves the request. I think I've managed to fix it by setting `--kubelet-preferred-address-types=InternalIP`, based on this comment: Metrics server issue with hostname resolution of kubelet and apiserver unable to communicate with metric-server clusterIP kubernetes-sigs/metrics-server#131. I've checked the `dns-controller` pod and it looks okay to me (no errors).
1. What `kops` version are you running? The command `kops version` will display this information.
Client version: 1.25.3 (git-v1.25.3)
2. What Kubernetes version are you running? `kubectl version` will print the version if a cluster is running, or provide the Kubernetes version specified as a `kops` flag.
3. What cloud provider are you using?
GCE
4. What commands did you run? What is the simplest way to reproduce this issue?
5. What happened after the commands executed?
Cluster validation passes and all pods are running, but `k top no` returns values for a subset of nodes only.
6. What did you expect to happen?
`k top no` returns values for all nodes consistently.
7. Please provide your cluster manifest. Execute `kops get --name my.example.com -o yaml` to display your cluster manifest. You may want to remove your cluster name and other sensitive information.
8. Please run the commands with most verbose logging by adding the `-v 10` flag. Paste the logs into this report, or in a gist and provide the gist link here.
Initial `k top no` (no masters and one node is missing).
After firewall fix `k top no` (one pod can scrape all hosts, one pod cannot scrape two nodes).
After `--kubelet-preferred-address-types` fix `k top no` works as expected and there are no errors in logs.
Logs from the metrics-server pod before `--kubelet-preferred-address-types` fix. Two kinds of `Failed to scrape node` errors: `context deadline exceeded` and `no such host`. I've pasted only the last 50 lines of log from each pod.
Pod a:
Pod b:
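To compare the two pods, it can help to count how many scrape failures of each kind a log contains. The sketch below assumes only the three strings quoted above (`Failed to scrape node`, `context deadline exceeded`, `no such host`); the labels `timeout`, `dns`, and `other` and the function name are invented for this example, and the exact log line format is not assumed.

```python
from collections import Counter
from typing import Iterable

# Substrings quoted in the report; the labels are made up for this sketch.
ERROR_KINDS = {
    "context deadline exceeded": "timeout",
    "no such host": "dns",
}


def count_scrape_failures(lines: Iterable[str]) -> Counter:
    """Count 'Failed to scrape node' log lines by error kind."""
    counts: Counter = Counter()
    for line in lines:
        if "Failed to scrape node" not in line:
            continue  # not a scrape failure, ignore
        for needle, label in ERROR_KINDS.items():
            if needle in line:
                counts[label] += 1
                break
        else:
            # A scrape failure with neither of the two known causes.
            counts["other"] += 1
    return counts
```

Running it over the saved output of `kubectl logs` for each metrics-server pod would show whether a pod is hitting mostly timeouts (pointing at the firewall) or mostly lookup failures (pointing at hostname resolution).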
9. Anything else do we need to know?
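As an illustration of why setting `--kubelet-preferred-address-types=InternalIP` can make the `no such host` errors go away: each Node object lists several addresses under `status.addresses`, each with a `type` and an `address`, and the flag gives the order in which address types are tried. The sketch below is a simplified model of that selection, not metrics-server's actual code; the function name and the sample addresses are hypothetical.

```python
from typing import Optional, Sequence


def pick_kubelet_address(addresses: Sequence[dict], preferred_types: Sequence[str]) -> Optional[str]:
    """Return the address whose type comes earliest in `preferred_types`, or None."""
    for wanted in preferred_types:
        for entry in addresses:
            if entry.get("type") == wanted:
                return entry.get("address")
    return None


# Hypothetical node addresses, shaped like `status.addresses` on a Node object.
node_addresses = [
    {"type": "InternalIP", "address": "10.0.16.5"},
    {"type": "Hostname", "address": "nodes-example-abcd"},
]

# With Hostname first, the scraper would have to resolve a name via DNS.
print(pick_kubelet_address(node_addresses, ["Hostname", "InternalIP"]))  # nodes-example-abcd
# With InternalIP only, no DNS lookup is involved.
print(pick_kubelet_address(node_addresses, ["InternalIP"]))  # 10.0.16.5
```

If the hostname is not resolvable from inside the metrics-server pod, the first choice fails with a lookup error while the second does not depend on DNS at all.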