Failed to get metrics for nodes #2526

Closed
alekc opened this issue Nov 14, 2022 · 5 comments

alekc commented Nov 14, 2022

Host operating system: output of uname -a

Windows 10 / macOS

node_exporter version: output of node_exporter --version

1.3.1

node_exporter command line flags

    - name: node-exporter
      image: quay.io/prometheus/node-exporter:v1.3.1
      args:
        - '--path.procfs=/host/proc'
        - '--path.sysfs=/host/sys'
        - '--path.rootfs=/host/root'
        - '--web.listen-address=[$(HOST_IP)]:9100'
        - >-
          --collector.filesystem.mount-points-exclude=^/(dev|proc|sys|var/lib/docker/.+|var/lib/kubelet/.+)($|/)
        - >-
          --collector.filesystem.fs-types-exclude=^(autofs|binfmt_misc|bpf|cgroup2?|configfs|debugfs|devpts|devtmpfs|fusectl|hugetlbfs|iso9660|mqueue|nsfs|overlay|proc|procfs|pstore|rpc_pipefs|securityfs|selinuxfs|squashfs|sysfs|tracefs)$
      ports:

node_exporter log output


ts=2022-11-14T13:26:13.217Z caller=node_exporter.go:182 level=info msg="Starting node_exporter" version="(version=1.3.1, branch=HEAD, revision=a2321e7b940ddcff26873612bccdf7cd4c42b6b6)"
ts=2022-11-14T13:26:13.217Z caller=node_exporter.go:183 level=info msg="Build context" build_context="(go=go1.17.3, user=root@243aafa5525c, date=20211205-11:10:22)"
ts=2022-11-14T13:26:13.232Z caller=filesystem_common.go:111 level=info collector=filesystem msg="Parsed flag --collector.filesystem.mount-points-exclude" flag=^/(dev|proc|sys|var/lib/docker/.+|var/lib/kubelet/.+)($|/)
ts=2022-11-14T13:26:13.235Z caller=filesystem_common.go:113 level=info collector=filesystem msg="Parsed flag --collector.filesystem.fs-types-exclude" flag=^(autofs|binfmt_misc|bpf|cgroup2?|configfs|debugfs|devpts|devtmpfs|fusectl|hugetlbfs|iso9660|mqueue|nsfs|overlay|proc|procfs|pstore|rpc_pipefs|securityfs|selinuxfs|squashfs|sysfs|tracefs)$
ts=2022-11-14T13:26:13.235Z caller=node_exporter.go:108 level=info msg="Enabled collectors"
ts=2022-11-14T13:26:13.235Z caller=node_exporter.go:115 level=info collector=arp
ts=2022-11-14T13:26:13.235Z caller=node_exporter.go:115 level=info collector=bcache
ts=2022-11-14T13:26:13.235Z caller=node_exporter.go:115 level=info collector=bonding
ts=2022-11-14T13:26:13.235Z caller=node_exporter.go:115 level=info collector=btrfs
ts=2022-11-14T13:26:13.235Z caller=node_exporter.go:115 level=info collector=conntrack
ts=2022-11-14T13:26:13.235Z caller=node_exporter.go:115 level=info collector=cpu
ts=2022-11-14T13:26:13.235Z caller=node_exporter.go:115 level=info collector=cpufreq
ts=2022-11-14T13:26:13.235Z caller=node_exporter.go:115 level=info collector=diskstats

Are you running node_exporter in Docker?

no

What did you do that produced an error?

What did you expect to see?

Information about node/cluster CPU/memory usage.

What did you see instead?

Only pod data is available.

The cluster (1.22.5) was installed via kubeadm.

Prometheus is installed via kube-prometheus-stack (41.7.4); Lens version is 2022.11.101953-latest.

The service is labeled operated-prometheus=true:

apiVersion: v1
kind: Service
metadata:
  name: prometheus-operated
  namespace: monitoring
  uid: 38d95fbd-3bbb-4d1a-bbe8-5c055bb8bd87
  resourceVersion: '181372167'
  creationTimestamp: '2022-11-11T10:11:12Z'
  labels:
    operated-prometheus: 'true'
  ownerReferences:
    - apiVersion: monitoring.coreos.com/v1
      kind: Prometheus
      name: prometheus-prometheus
      uid: b7961778-b6eb-4080-96c9-a8b05a8ccc11
  managedFields:
    - manager: PrometheusOperator
      operation: Update
      apiVersion: v1
      time: '2022-11-11T10:11:12Z'
      fieldsType: FieldsV1
      fieldsV1:
        f:metadata:
          f:labels:
            .: {}
            f:operated-prometheus: {}
          f:ownerReferences:
            .: {}
            k:{"uid":"b7961778-b6eb-4080-96c9-a8b05a8ccc11"}: {}
        f:spec:
          f:clusterIP: {}
          f:internalTrafficPolicy: {}
          f:ports:
            .: {}
            k:{"port":9090,"protocol":"TCP"}:
              .: {}
              f:name: {}
              f:port: {}
              f:protocol: {}
              f:targetPort: {}
            k:{"port":10901,"protocol":"TCP"}:
              .: {}
              f:name: {}
              f:port: {}
              f:protocol: {}
              f:targetPort: {}
          f:selector: {}
          f:sessionAffinity: {}
          f:type: {}
  selfLink: /api/v1/namespaces/monitoring/services/prometheus-operated
status:
  loadBalancer: {}
spec:
  ports:
    - name: http-web
      protocol: TCP
      port: 9090
      targetPort: http-web
    - name: grpc
      protocol: TCP
      port: 10901
      targetPort: grpc
  selector:
    app.kubernetes.io/name: prometheus
  clusterIP: None
  clusterIPs:
    - None
  type: ClusterIP
  sessionAffinity: None
  ipFamilies:
    - IPv4
  ipFamilyPolicy: SingleStack
  internalTrafficPolicy: Cluster

As per https://github.com/lensapp/lens/blob/master/troubleshooting/custom-prometheus.md#kube-prometheus, the ServiceMonitors have been patched:

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: >
      {"apiVersion":"monitoring.coreos.com/v1","kind":"ServiceMonitor","metadata":{"annotations":{},"labels":{"app":"kube-prometheus-stack-kubelet","app.kubernetes.io/instance":"kube-prometheus-stack","app.kubernetes.io/managed-by":"Helm","app.kubernetes.io/part-of":"kube-prometheus-stack","app.kubernetes.io/version":"41.7.4","argocd.argoproj.io/instance":"kube-prometheus-stack","chart":"kube-prometheus-stack-41.7.4","heritage":"Helm","release":"kube-prometheus-stack"},"name":"prometheus-kubelet","namespace":"monitoring"},"spec":{"endpoints":[{"bearerTokenFile":"/var/run/secrets/kubernetes.io/serviceaccount/token","honorLabels":true,"port":"https-metrics","relabelings":[{"sourceLabels":["__metrics_path__"],"targetLabel":"metrics_path"}],"scheme":"https","tlsConfig":{"caFile":"/var/run/secrets/kubernetes.io/serviceaccount/ca.crt","insecureSkipVerify":true}},{"bearerTokenFile":"/var/run/secrets/kubernetes.io/serviceaccount/token","honorLabels":true,"metricRelabelings":[{"action":"drop","regex":"container_cpu_(cfs_throttled_seconds_total|load_average_10s|system_seconds_total|user_seconds_total)","sourceLabels":["__name__"]},{"action":"drop","regex":"container_fs_(io_current|io_time_seconds_total|io_time_weighted_seconds_total|reads_merged_total|sector_reads_total|sector_writes_total|writes_merged_total)","sourceLabels":["__name__"]},{"action":"drop","regex":"container_memory_(mapped_file|swap)","sourceLabels":["__name__"]},{"action":"drop","regex":"container_(file_descriptors|tasks_state|threads_max)","sourceLabels":["__name__"]},{"action":"drop","regex":"container_spec.*","sourceLabels":["__name__"]},{"action":"drop","regex":".+;","sourceLabels":["id","pod"]}],"path":"/metrics/cadvisor","port":"https-metrics","relabelings":[{"sourceLabels":["__metrics_path__"],"targetLabel":"metrics_path"}],"scheme":"https","tlsConfig":{"caFile":"/var/run/secrets/kubernetes.io/serviceaccount/ca.crt","insecureSkipVerify":true}},{"bearerTokenFile":"/var/run/secrets/kubernetes.io/serviceaccount/token","honorLabels":true,"path":"/metrics/probes","port":"https-metrics","relabelings":[{"sourceLabels":["__metrics_path__"],"targetLabel":"metrics_path"}],"scheme":"https","tlsConfig":{"caFile":"/var/run/secrets/kubernetes.io/serviceaccount/ca.crt","insecureSkipVerify":true}}],"jobLabel":"k8s-app","namespaceSelector":{"matchNames":["kube-system"]},"selector":{"matchLabels":{"app.kubernetes.io/name":"kubelet","k8s-app":"kubelet"}}}}
  creationTimestamp: '2022-11-11T10:11:11Z'
  generation: 4
  labels:
    app: kube-prometheus-stack-kubelet
    app.kubernetes.io/instance: kube-prometheus-stack
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/part-of: kube-prometheus-stack
    app.kubernetes.io/version: 41.7.4
    argocd.argoproj.io/instance: kube-prometheus-stack
    chart: kube-prometheus-stack-41.7.4
    heritage: Helm
    release: kube-prometheus-stack
  managedFields:
    - apiVersion: monitoring.coreos.com/v1
      fieldsType: FieldsV1
      fieldsV1:
        f:metadata:
          f:labels:
            f:app: {}
            f:app.kubernetes.io/instance: {}
            f:app.kubernetes.io/managed-by: {}
            f:app.kubernetes.io/part-of: {}
            f:app.kubernetes.io/version: {}
            f:argocd.argoproj.io/instance: {}
            f:chart: {}
            f:heritage: {}
            f:release: {}
        f:spec:
          f:endpoints: {}
          f:jobLabel: {}
          f:namespaceSelector:
            f:matchNames: {}
          f:selector: {}
      manager: argocd-controller
      operation: Apply
      time: '2022-11-14T14:33:05Z'
    - apiVersion: monitoring.coreos.com/v1
      fieldsType: FieldsV1
      fieldsV1:
        f:metadata:
          f:annotations:
            .: {}
            f:kubectl.kubernetes.io/last-applied-configuration: {}
          f:labels:
            .: {}
            f:app: {}
            f:app.kubernetes.io/instance: {}
            f:app.kubernetes.io/managed-by: {}
            f:app.kubernetes.io/part-of: {}
            f:app.kubernetes.io/version: {}
            f:argocd.argoproj.io/instance: {}
            f:chart: {}
            f:heritage: {}
            f:release: {}
        f:spec:
          .: {}
          f:jobLabel: {}
          f:namespaceSelector:
            .: {}
            f:matchNames: {}
          f:selector: {}
      manager: argocd-application-controller
      operation: Update
      time: '2022-11-11T10:11:11Z'
  name: prometheus-kubelet
  namespace: monitoring
  resourceVersion: '184365166'
  uid: b8993ab2-423c-40d1-8b65-f90b8c0971ee
  selfLink: >-
    /apis/monitoring.coreos.com/v1/namespaces/monitoring/servicemonitors/prometheus-kubelet
spec:
  endpoints:
    - bearerTokenFile: /var/run/secrets/kubernetes.io/serviceaccount/token
      honorLabels: true
      port: https-metrics
      relabelings:
        - action: replace
          sourceLabels:
            - __metrics_path__
          targetLabel: metrics_path
        - action: replace
          sourceLabels:
            - node
          targetLabel: instance
      scheme: https
      tlsConfig:
        caFile: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
        insecureSkipVerify: true
    - bearerTokenFile: /var/run/secrets/kubernetes.io/serviceaccount/token
      honorLabels: true
      metricRelabelings:
        - action: drop
          regex: >-
            container_cpu_(cfs_throttled_seconds_total|load_average_10s|system_seconds_total|user_seconds_total)
          sourceLabels:
            - __name__
        - action: drop
          regex: >-
            container_fs_(io_current|io_time_seconds_total|io_time_weighted_seconds_total|reads_merged_total|sector_reads_total|sector_writes_total|writes_merged_total)
          sourceLabels:
            - __name__
        - action: drop
          regex: container_memory_(mapped_file|swap)
          sourceLabels:
            - __name__
        - action: drop
          regex: container_(file_descriptors|tasks_state|threads_max)
          sourceLabels:
            - __name__
        - action: drop
          regex: container_spec.*
          sourceLabels:
            - __name__
        - action: drop
          regex: .+;
          sourceLabels:
            - id
            - pod
      path: /metrics/cadvisor
      port: https-metrics
      relabelings:
        - action: replace
          sourceLabels:
            - __metrics_path__
          targetLabel: metrics_path
      scheme: https
      tlsConfig:
        caFile: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
        insecureSkipVerify: true
    - bearerTokenFile: /var/run/secrets/kubernetes.io/serviceaccount/token
      honorLabels: true
      path: /metrics/probes
      port: https-metrics
      relabelings:
        - action: replace
          sourceLabels:
            - __metrics_path__
          targetLabel: metrics_path
      scheme: https
      tlsConfig:
        caFile: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
        insecureSkipVerify: true
  jobLabel: k8s-app
  namespaceSelector:
    matchNames:
      - kube-system
  selector:
    matchLabels:
      app.kubernetes.io/name: kubelet
      k8s-app: kubelet

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: >
      {"apiVersion":"monitoring.coreos.com/v1","kind":"ServiceMonitor","metadata":{"annotations":{},"labels":{"app.kubernetes.io/component":"metrics","app.kubernetes.io/instance":"kube-prometheus-stack","app.kubernetes.io/managed-by":"Helm","app.kubernetes.io/name":"prometheus-node-exporter","app.kubernetes.io/part-of":"prometheus-node-exporter","app.kubernetes.io/version":"1.3.1","argocd.argoproj.io/instance":"kube-prometheus-stack","helm.sh/chart":"prometheus-node-exporter-4.4.2","jobLabel":"node-exporter","release":"kube-prometheus-stack"},"name":"kube-prometheus-stack-prometheus-node-exporter","namespace":"monitoring"},"spec":{"endpoints":[{"port":"http-metrics","scheme":"http"}],"jobLabel":"jobLabel","selector":{"matchLabels":{"app.kubernetes.io/instance":"kube-prometheus-stack","app.kubernetes.io/name":"prometheus-node-exporter"}}}}
  creationTimestamp: '2022-11-11T10:11:10Z'
  generation: 4
  labels:
    app.kubernetes.io/component: metrics
    app.kubernetes.io/instance: kube-prometheus-stack
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: prometheus-node-exporter
    app.kubernetes.io/part-of: prometheus-node-exporter
    app.kubernetes.io/version: 1.3.1
    argocd.argoproj.io/instance: kube-prometheus-stack
    helm.sh/chart: prometheus-node-exporter-4.4.2
    jobLabel: node-exporter
    release: kube-prometheus-stack
  managedFields:
    - apiVersion: monitoring.coreos.com/v1
      fieldsType: FieldsV1
      fieldsV1:
        f:metadata:
          f:labels:
            f:app.kubernetes.io/component: {}
            f:app.kubernetes.io/instance: {}
            f:app.kubernetes.io/managed-by: {}
            f:app.kubernetes.io/name: {}
            f:app.kubernetes.io/part-of: {}
            f:app.kubernetes.io/version: {}
            f:argocd.argoproj.io/instance: {}
            f:helm.sh/chart: {}
            f:jobLabel: {}
            f:release: {}
        f:spec:
          f:endpoints: {}
          f:jobLabel: {}
          f:selector: {}
      manager: argocd-controller
      operation: Apply
      time: '2022-11-14T13:52:53Z'
    - apiVersion: monitoring.coreos.com/v1
      fieldsType: FieldsV1
      fieldsV1:
        f:metadata:
          f:annotations:
            .: {}
            f:kubectl.kubernetes.io/last-applied-configuration: {}
          f:labels:
            .: {}
            f:app.kubernetes.io/component: {}
            f:app.kubernetes.io/instance: {}
            f:app.kubernetes.io/managed-by: {}
            f:app.kubernetes.io/name: {}
            f:app.kubernetes.io/part-of: {}
            f:app.kubernetes.io/version: {}
            f:argocd.argoproj.io/instance: {}
            f:helm.sh/chart: {}
            f:jobLabel: {}
            f:release: {}
        f:spec:
          .: {}
          f:jobLabel: {}
          f:selector: {}
      manager: argocd-application-controller
      operation: Update
      time: '2022-11-11T10:11:10Z'
  name: kube-prometheus-stack-prometheus-node-exporter
  namespace: monitoring
  resourceVersion: '184338622'
  uid: f12654e3-4e7c-48ab-8c4a-449ff4535dff
  selfLink: >-
    /apis/monitoring.coreos.com/v1/namespaces/monitoring/servicemonitors/kube-prometheus-stack-prometheus-node-exporter
spec:
  endpoints:
    - port: http-metrics
      relabelings:
        - action: replace
          regex: (.*)
          replacement: $1
          sourceLabels:
            - __meta_kubernetes_pod_node_name
          targetLabel: kubernetes_node
      scheme: http
  jobLabel: jobLabel
  selector:
    matchLabels:
      app.kubernetes.io/instance: kube-prometheus-stack
      app.kubernetes.io/name: prometheus-node-exporter

and the respective jobs show these metrics:

apiserver_audit_event_total{endpoint="https-metrics", instance="alekc-master-01", job="kubelet", metrics_path="/metrics", namespace="kube-system", node="alekc-master-01", service="prometheus-kubelet"}
...
count:up1{container="node-exporter", endpoint="http-metrics", job="node-exporter", kubernetes_node="alekc-master-01", namespace="monitoring", service="kube-prometheus-stack-prometheus-node-exporter"}
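
As a quick sanity check (just a sketch; the kubernetes_node label comes from the relabeling in the node-exporter ServiceMonitor above), a query like the following should return exactly one series per node when node-exporter is scraped by a single job:

count by (kubernetes_node) (node_memory_MemTotal_bytes{job="node-exporter"})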

When starting Lens, the error in the logs is:

info: [CONTEXT-HANDLER]: using operator as prometheus provider for clusterId=f0deab80dc904a62c31490310a83567a
warn: [METRICS-ROUTE]: failed to get metrics for clusterId=f0deab80dc904a62c31490310a83567a: Metrics not available {"stack":"Error: Metrics not available\n    at C:\\Users\\Alekc\\AppData\\Local\\Programs\\Lens\\resources\\app.asar\\static\\build\\main.js:1:416849\n    at process.processTicksAndRejections (node:internal/process/task_queues:96:5)\n    at async Promise.all (index 15)\n    at async C:\\Users\\Alekc\\AppData\\Local\\Programs\\Lens\\resources\\app.asar\\static\\build\\main.js:1:417597\n    at async Object.route (C:\\Users\\Alekc\\AppData\\Local\\Programs\\Lens\\resources\\app.asar\\static\\build\\main.js:1:407338)\n    at async a.route (C:\\Users\\Alekc\\AppData\\Local\\Programs\\Lens\\resources\\app.asar\\static\\build\\main.js:1:409755)"}


SuperQ commented Nov 14, 2022

For questions/help/support please use our community channels. There are more people available to potentially respond to your request and the whole community can benefit from the answers provided.

SuperQ closed this as completed Nov 14, 2022

alekc commented Nov 14, 2022

@SuperQ hmm, how come this is categorized as help/support? This is a bug, since Lens doesn't behave as it should according to the documentation.

Node/cluster data doesn't work: [screenshot]

Pod info does: [screenshot]

Other issues (categorized as bugs) exist for the same problem and are still open (e.g. lensapp/lens#6510); my issue is slightly different because I don't get a 404, hence a new report.

Edit: wrong screenshots


alekc commented Nov 14, 2022

After finding the source code at https://github.com/lensapp/lens/blob/master/src/main/prometheus/lens.ts, I've probably discovered the issue.

Node-exporter data was duplicated on my cluster (the same service was being scraped twice by different scrape configurations), so sum(node_memory_MemTotal_bytes - (node_memory_MemFree_bytes + node_memory_Buffers_bytes + node_memory_Cached_bytes)) by (kubernetes_name) was producing two entries instead of one.

Ideally, the code should catch this edge case and, instead of a generic "Cannot load prometheus metrics", return something like "Unexpected data format: got multiple results instead of one".

Example of duplicated node-exporter metrics:

node_memory_MemTotal_bytes{container="node-exporter", endpoint="http-metrics", instance="100.64.100.12:9100", job="node-exporter", kubernetes_node="gen8-vm", namespace="monitoring", pod="kube-prometheus-stack-prometheus-node-exporter-vvf99", service="kube-prometheus-stack-prometheus-node-exporter"}  14315692032
node_memory_MemTotal_bytes{instance="100.64.100.12:9100", job="kubernetes-service-endpoints", jobLabel="node-exporter", kubernetes_name="kube-prometheus-stack-prometheus-node-exporter", kubernetes_namespace="monitoring", release="kube-prometheus-stack"}
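
A quick way to surface this kind of duplication (just a sketch using the metric above; any node-exporter metric works) is to count the series per scrape target:

count by (instance) (node_memory_MemTotal_bytes) > 1

Any instance that shows up with a value of 2 or more is being scraped by more than one job, which is what makes the sum(...) by (kubernetes_name) query above return two results.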


SuperQ commented Nov 14, 2022

This is not a bug in the node_exporter.


alekc commented Nov 14, 2022 via email
