Cannot fetch Pods Metrics on Kubernetes 1.19 #629

Closed
osaffer opened this issue Nov 6, 2020 · 21 comments

Comments

@osaffer

osaffer commented Nov 6, 2020

What happened:
Pod metrics cannot be fetched, although node metrics can.
What you expected to happen:
Both pod and node metrics can be fetched.

Environment:

  • Kubernetes distribution (GKE, EKS, Kubeadm, the hard way, etc.):
  • Container Network Setup: flannel
    kubectl version --short
    Client Version: v1.19.3
    Server Version: v1.19.3

helm list -n operations
NAME            NAMESPACE   REVISION  UPDATED                                  STATUS    CHART                 APP VERSION
metrics-server  operations  2         2020-11-05 23:26:48.971057805 +0100 CET  deployed  metrics-server-4.5.2  0.3.7

kubectl top nodes
NAME     CPU(cores)  CPU%  MEMORY(bytes)  MEMORY%
master   251m        6%    2433Mi         15%
worker1  577m        14%   9957Mi         62%
worker2  418m        10%   8143Mi         50%
worker3  445m        11%   7755Mi         48%

kubectl top pods
W1106 20:50:15.238493 73592 top_pod.go:265] Metrics not available for pod default/quickstart-es-default-0, age: 22h29m11.238478174s
error: Metrics not available for pod default/quickstart-es-default-0, age: 22h29m11.238478174s

kubectl get apiservices | grep metrics
v1beta1.metrics.k8s.io operations/metrics-server True
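
One way to separate a kubectl top problem from a Metrics API problem is to query the API group registered above directly. The following is a minimal sketch, assuming kubectl is configured for the affected cluster; the helper names metrics_api_path and fetch_metrics are hypothetical and exist only for illustration.

```shell
#!/usr/bin/env bash
# Sketch: query the Metrics API directly instead of going through `kubectl top`.
# Assumption: kubectl is configured for the affected cluster.

# Print the raw Metrics API path for a resource kind ("nodes" or "pods").
metrics_api_path() {
  local kind="$1"
  echo "/apis/metrics.k8s.io/v1beta1/${kind}"
}

# Fetch the raw metrics list for that kind (needs cluster access).
fetch_metrics() {
  kubectl get --raw "$(metrics_api_path "$1")"
}
```

Running fetch_metrics nodes and then fetch_metrics pods prints the raw responses. With the symptoms reported here, the node response would be expected to be populated while the pod response carries no usable entries.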

Command in deployment

  • command:
    - metrics-server
    - --v=2
    - --secure-port=8443
    - --kubelet-insecure-tls
    - --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname

Logs:

[...] 1 reststorage.go:160] unable to fetch pod metrics for pod falco/falco-7l2br: no metrics known for pod
E1106 19:53:35.125749 1 reststorage.go:160] unable to fetch pod metrics for pod nginx-ingress/nginx-ingress-contoller-nginx-ingress-controller-fr7dr: no metrics known for pod
E1106 19:53:35.125753 1 reststorage.go:160] unable to fetch pod metrics for pod rook-ceph/rook-ceph-mon-c-796f667d67-f52z9: no metrics known for pod
E1106 19:53:35.125758 1 reststorage.go:160] unable to fetch pod metrics for pod rook-ceph/csi-cephfsplugin-nlgkh: no metrics known for pod
E1106 19:53:35.125763 1 reststorage.go:160] unable to fetch pod metrics for pod rook-ceph/rook-ceph-operator-86756d44-p99zn: no metrics known for pod
E1106 19:53:35.125768 1 reststorage.go:160] unable to fetch pod metrics for pod rook-ceph/csi-cephfsplugin-7zbgw: no metrics known for pod
E1106 19:53:35.125772 1 reststorage.go:160] unable to fetch pod metrics for pod istio-system/istiod-5d884bcc7-vwmd7: no metrics known for pod
E1106 19:53:35.125777 1 reststorage.go:160] unable to fetch pod metrics for pod civi/civi-postgresql-ha-postgresql-0: no metrics known for pod
E1106 19:53:35.125782 1 reststorage.go:160] unable to fetch pod metrics for pod civi/civi-postgresql-ha-pgpool-5fc9b6c697-xm56b: no metrics known for pod
E1106 19:53:35.125787 1 reststorage.go:160] unable to fetch pod metrics for pod istio-system/jaeger-75948789b4-vwf4c: no metrics known for pod
E1106 19:53:35.125791 1 reststorage.go:160] unable to fetch pod metrics for pod rook-ceph/rook-ceph-mgr-a-58f68569bd-kjbxb: no metrics known for pod
E1106 19:53:35.125796 1 reststorage.go:160] unable to fetch pod metrics for pod kube-system/kube-proxy-czrq4: no metrics known for pod
E1106 19:53:44.207989 1 reststorage.go:160] unable to fetch pod metrics for pod istio-system/istiod-5d884bcc7-vwmd7: no metrics known for pod
E1106 19:53:44.212919 1 reststorage.go:160] unable to fetch pod metrics for pod istio-system/istio-ingressgateway-56b8d79bfc-cmllm: no metrics known for pod
E1106 19:53:59.216758 1 reststorage.go:160] unable to fetch pod metrics for pod istio-system/istiod-5d884bcc7-vwmd7: no metrics known for pod
E1106 19:53:59.226368 1 reststorage.go:160] unable to fetch pod metrics for pod istio-system/istio-ingressgateway-56b8d79bfc-cmllm: no metrics known for pod
E1106 19:54:15.479157 1 reststorage.go:160] unable to fetch pod metrics for pod istio-system/istiod-5d884bcc7-vwmd7: no metrics known for pod
E1106 19:54:15.483252 1 reststorage.go:160] unable to fetch pod metrics for pod istio-system/istio-ingressgateway-56b8d79bfc-cmllm: no metrics known for pod

So how can I fetch metrics from pods?

Thank you

@kewynakshlley

I am facing the same issue, did you find any solution for this?

@serathius
Contributor

serathius commented Nov 8, 2020

This looks like a bug in Kubernetes 1.19: kubernetes/kubernetes#94281.
I think I will prepare debug instructions, as this will be a commonly reported issue once 1.19 becomes more popular.

@kewynakshlley

@serathius I am not sure if it is only in Kubernetes 1.19. I've already done tests on 18.5 and 18.10 and the error persists.

@serathius
Contributor

@kewynakshlley If you're experiencing the problem on a different K8s version, please file a separate issue and provide all the information requested in the issue template.

@osaffer
Author

osaffer commented Nov 9, 2020

It's quite frustrating that metrics-server doesn't work. On the one hand, I am glad to know I am not alone in this; on the other, having Kubernetes without being able to use autoscaling seems strange to me 🤪.

I tried different deployments, with Helm and with plain manifests, without success.

@serathius
Contributor

serathius commented Nov 9, 2020

Sorry about that. Unfortunately, Metrics Server is very infrastructure-dependent.

@serathius
Contributor

PR adding debug instructions for the Kubernetes 1.19 issue: #632

@serathius
Contributor

serathius commented Nov 9, 2020

@osaffer Please use these instructions to confirm that you're affected by the problem in Kubernetes 1.19:

Please check whether your Kubelet is correctly returning pod metrics. You can do that by checking the Summary API on any node in your cluster:

NODE_NAME=<Name of node in your cluster>
kubectl get --raw /api/v1/nodes/$NODE_NAME/proxy/stats/summary

This will return JSON with two keys, node and pods.
An empty list of pods means that the problem is related to the Kubelet and not to Metrics Server.

One-liner for the number of pod metrics reported by the first node in the cluster (requires jq):

kubectl get --raw /api/v1/nodes/$(kubectl get nodes -o json  | jq -r '.items[0].metadata.name')/proxy/stats/summary | jq '.pods | length'
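
The one-liner above inspects only the first node and needs jq. As a rough sketch of the same check across every node without jq, the helper below counts occurrences of the "podRef" key, which each pod entry in the Summary API response carries. The function names count_pods_in_summary and report_all_nodes are hypothetical, and the key count is an approximation of jq '.pods | length', not a replacement for it.

```shell
#!/usr/bin/env bash
# Sketch: count pod entries in a kubelet Summary API response without jq.
# Assumption: every pod entry contains exactly one "podRef" object.

# Reads Summary API JSON on stdin and prints the number of "podRef" keys.
count_pods_in_summary() {
  { grep -o '"podRef"' || true; } | wc -l | tr -d '[:space:]'
}

# Hypothetical helper: print "<node> <pod count>" for every node.
# Needs kubectl access to the cluster; it is defined here but not invoked.
report_all_nodes() {
  local node
  for node in $(kubectl get nodes -o jsonpath='{.items[*].metadata.name}'); do
    printf '%s %s\n' "$node" \
      "$(kubectl get --raw "/api/v1/nodes/${node}/proxy/stats/summary" | count_pods_in_summary)"
  done
}
```

A node that reports 0 while it is known to run pods points at the Kubelet side, as described in the instructions above.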

@osaffer
Author

osaffer commented Nov 10, 2020

Hi,

These are the results :

kubectl get --raw /api/v1/nodes/$NODE_NAME/proxy/stats/summary
{
  "node": {
    "nodeName": "worker3",
    "systemContainers": [
      {
        "name": "kubelet",
        "startTime": "2020-11-06T19:43:22Z",
        "cpu": {
          "time": "2020-11-10T10:35:47Z",
          "usageNanoCores": 104067805,
          "usageCoreNanoSeconds": 30073725495861
        },
        "memory": {
          "time": "2020-11-10T10:35:47Z",
          "usageBytes": 113795072,
          "workingSetBytes": 52789248,
          "rssBytes": 104919040,
          "pageFaults": 38353152,
          "majorPageFaults": 2
        }
      },
      {
        "name": "runtime",
        "startTime": "2020-11-05T21:15:48Z",
        "cpu": {
          "time": "2020-11-10T10:35:49Z",
          "usageNanoCores": 49204409,
          "usageCoreNanoSeconds": 98644199471886
        },
        "memory": {
          "time": "2020-11-10T10:35:49Z",
          "usageBytes": 2341875712,
          "workingSetBytes": 742887424,
          "rssBytes": 94801920,
          "pageFaults": 6645899,
          "majorPageFaults": 143
        }
      },
      {
        "name": "pods",
        "startTime": "2020-11-05T21:15:47Z",
        "cpu": {
          "time": "2020-11-10T10:35:56Z",
          "usageNanoCores": 255078706,
          "usageCoreNanoSeconds": 981458173961831
        },
        "memory": {
          "time": "2020-11-10T10:35:56Z",
          "availableBytes": 9914273792,
          "usageBytes": 7595991040,
          "workingSetBytes": 6948651008,
          "rssBytes": 6426144768,
          "pageFaults": 0,
          "majorPageFaults": 0
        }
      }
    ],
    "startTime": "2020-09-02T09:54:18Z",
    "cpu": {
      "time": "2020-11-10T10:35:56Z",
      "usageNanoCores": 457364792,
      "usageCoreNanoSeconds": 2181217395455123
    },
    "memory": {
      "time": "2020-11-10T10:35:56Z",
      "availableBytes": 6704893952,
      "usageBytes": 13317599232,
      "workingSetBytes": 10158030848,
      "rssBytes": 6825885696,
      "pageFaults": 749813,
      "majorPageFaults": 287
    },
    "network": {
      "time": "2020-11-10T10:35:56Z",
      "name": "eth0",
      "rxBytes": 1541040392000,
      "rxErrors": 0,
      "txBytes": 1367396042746,
      "txErrors": 0,
      "interfaces": [
        {
          "name": "cni0",
          "rxBytes": 1157642933169,
          "rxErrors": 0,
          "txBytes": 1382981170589,
          "txErrors": 0
        },
        {
          "name": "flannel.1",
          "rxBytes": 1339468701437,
          "rxErrors": 0,
          "txBytes": 1178364164668,
          "txErrors": 0
        },
        {
          "name": "eth0",
          "rxBytes": 1541040392000,
          "rxErrors": 0,
          "txBytes": 1367396042746,
          "txErrors": 0
        }
      ]
    },
    "fs": {
      "time": "2020-11-10T10:35:56Z",
      "availableBytes": 5045551104,
      "capacityBytes": 25788452864,
      "usedBytes": 19409326080,
      "inodesFree": 1111988,
      "inodes": 1607520,
      "inodesUsed": 495532
    },
    "runtime": {
      "imageFs": {
        "time": "2020-11-10T10:35:56Z",
        "availableBytes": 5045551104,
        "capacityBytes": 25788452864,
        "usedBytes": 16231692613,
        "inodesFree": 1111988,
        "inodes": 1607520,
        "inodesUsed": 495532
      }
    },
    "rlimit": {
      "time": "2020-11-10T10:36:02Z",
      "maxpid": 131072,
      "curproc": 1899
    }
  },
  "pods": []
}

kubectl get --raw /api/v1/nodes/$(kubectl get nodes -o json | jq -r '.items[0].metadata.name')/proxy/stats/summary | jq '.pods | length'
0

@osaffer
Author

osaffer commented Nov 10, 2020

About version

kubelet --version
Kubernetes v1.19.3
kubectl version --short
Client Version: v1.19.3
Server Version: v1.19.3

@JornShen

The fix PR google/cadvisor#2714 has been merged.
We are waiting for the k/k cadvisor dependency to be updated to v0.38.
If this is urgent for you, update Docker to v19.X or apply the patch from google/cadvisor#2714.

@osaffer
@serathius

@osaffer
Author

osaffer commented Nov 10, 2020

This is the result of docker version

Should I do something more?

docker version
Client: Docker Engine - Community
Version: 19.03.12
API version: 1.39
Go version: go1.13.10
Git commit: 48a66213fe
Built: Mon Jun 22 15:45:36 2020
OS/Arch: linux/amd64
Experimental: false

Server: Docker Engine - Community
Engine:
Version: 18.09.7
API version: 1.39 (minimum version 1.12)
Go version: go1.10.8
Git commit: 2d0083d
Built: Thu Jun 27 17:23:02 2019
OS/Arch: linux/amd64
Experimental: false

@serathius
Contributor

@osaffer Your Docker Engine server is running 18.09.7, which is affected.
You can upgrade it to v19+ or wait for the fix in K8s.
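
To check every node rather than one host at a time, the runtime version that each node reports to the API server can be compared against the v19 threshold mentioned above. This is only a sketch: the function names is_affected_runtime and list_node_runtimes are hypothetical, the threshold comes from this thread and not from an authoritative compatibility matrix, and the reported string (for example docker://18.9.7) is assumed to use the docker:// prefix.

```shell
#!/usr/bin/env bash
# Sketch: flag nodes whose Docker Engine major version is below 19.
# Assumption: the "below 19" threshold is taken from the discussion above.

# Returns 0 (true) when a containerRuntimeVersion string such as
# "docker://18.9.7" names a Docker major version lower than 19.
is_affected_runtime() {
  local runtime="$1" version major
  case "$runtime" in
    docker://*) version="${runtime#docker://}" ;;
    *) return 1 ;;              # not Docker: outside the scope of this check
  esac
  major="${version%%.*}"
  case "$major" in
    ''|*[!0-9]*) return 1 ;;    # version could not be parsed
  esac
  [ "$major" -lt 19 ]
}

# Hypothetical helper: print each node, its runtime, and a verdict.
# Needs kubectl access to the cluster; it is defined here but not invoked.
list_node_runtimes() {
  kubectl get nodes \
    -o jsonpath='{range .items[*]}{.metadata.name}{" "}{.status.nodeInfo.containerRuntimeVersion}{"\n"}{end}' |
  while read -r node runtime; do
    if is_affected_runtime "$runtime"; then
      printf '%s %s affected (Docker < 19)\n' "$node" "$runtime"
    else
      printf '%s %s ok\n' "$node" "$runtime"
    fi
  done
}
```

On a single host, docker version --format '{{.Server.Version}}' prints just the server version, which is the value that matters here rather than the client version.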

@osaffer
Author

osaffer commented Nov 10, 2020

Hi

I have just upgraded all my nodes

docker version
Client: Docker Engine - Community
Version: 19.03.13
API version: 1.40
Go version: go1.13.15
Git commit: 4484c46d9d
Built: Wed Sep 16 17:02:36 2020
OS/Arch: linux/amd64
Experimental: false

Server: Docker Engine - Community
Engine:
Version: 19.03.13
API version: 1.40 (minimum version 1.12)
Go version: go1.13.15
Git commit: 4484c46d9d
Built: Wed Sep 16 17:01:06 2020
OS/Arch: linux/amd64
Experimental: false
containerd:
Version: 1.3.7
GitCommit: 8fba4e9a7d01810a393d5d25a3621dc101981175
runc:
Version: 1.0.0-rc10
GitCommit: dc9208a3303feef5b3839f4323d9beb36df0a9dd
docker-init:
Version: 0.18.0
GitCommit: fec3683

Unfortunately still no metrics for pods.

@serathius
Contributor

Interesting
Would be great to bring your findings to kubernetes/kubernetes#94281

@serathius
Contributor

Have you repeated the debugging steps provided above to confirm that the problem is still on the Kubelet side?

@osaffer
Author

osaffer commented Nov 11, 2020

I would like to thank you for your support.

  • Upgrade Docker
  • Redeploy metrics-server
  • Restart kubelet on all nodes (I hadn't done this before)

Now it works very well.

Thank you very much
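
The three steps listed above could be sketched as the commands below. The function only prints them and does not run anything. Several details are assumptions and not taken from the thread: a systemd-managed kubelet, a Deployment named metrics-server, and kubectl rollout restart standing in for "redeploy"; the namespace operations comes from the helm list output earlier. The function name print_recovery_steps is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: print (do not execute) commands matching the workaround steps.
# Assumptions: systemd-managed kubelet, Deployment named "metrics-server",
# and `kubectl rollout restart` used in place of a full redeploy.

print_recovery_steps() {
  local namespace="${1:-operations}" deployment="${2:-metrics-server}"
  cat <<EOF
# 1. After upgrading Docker, confirm the engine version on each node:
docker version --format '{{.Server.Version}}'
# 2. Restart metrics-server:
kubectl -n ${namespace} rollout restart deployment/${deployment}
# 3. Restart the kubelet on each node:
sudo systemctl restart kubelet
# 4. Check that pod metrics are now served:
kubectl top pods --all-namespaces
EOF
}
```

Calling print_recovery_steps with no arguments uses the namespace from this issue; a different namespace and Deployment name can be passed as the first and second argument.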

@serathius
Contributor

Thanks for working with us on the workaround. Hope this is fixed upstream soon.

Closing as issue was resolved.

serathius changed the title from "Cannot fetch Pods Metrics" to "Cannot fetch Pods Metrics on Kubernetes 1.19" on Nov 15, 2020

@munipravy

@serathius I am facing this issue on a Kubernetes v1.17 cluster. Is this issue specific to a cluster version?
If it is not version-specific, can you please tell me how I can achieve HPA with the Prometheus Adapter?

@osaffer
Author

osaffer commented Nov 24, 2020

Do you at least get the metrics for pods and nodes?

@serathius
Contributor

To my knowledge, this bug is specific to 1.19. Please ask in the original issue, kubernetes/kubernetes#94281.

For Prometheus issues, please ask at https://github.com/DirectXMan12/k8s-prometheus-adapter
