Missing pod metrics on minikube #464

Closed
jgillich opened this issue Feb 8, 2023 · 4 comments

jgillich commented Feb 8, 2023

I switched from k3d to minikube and I'm not getting any pod/container metrics anymore:

Screenshot from 2023-02-08 12-52-36

Using the latest version (0.14.7) of victoria-metrics-k8s-stack. Values are all default (apart from some grafana ingress config).

vmagent logs a continuous stream of connection refused and certificate errors:

2023-02-08T12:31:09.191Z	info	VictoriaMetrics/lib/promscrape/discovery/kubernetes/api_watcher.go:641	reloaded 0 objects from "https://10.96.0.1:443/api/v1/namespaces/default/pods" in 0.052s; updated=0, removed=0, added=0, resourceVersion="208599"
2023-02-08T12:31:09.191Z	info	VictoriaMetrics/lib/promscrape/discovery/kubernetes/api_watcher.go:509	started service watcher for "https://10.96.0.1:443/api/v1/namespaces/default/services"
2023-02-08T12:31:09.194Z	info	VictoriaMetrics/lib/promscrape/discovery/kubernetes/api_watcher.go:641	reloaded 1 objects from "https://10.96.0.1:443/api/v1/namespaces/default/services" in 0.003s; updated=0, removed=0, added=1, resourceVersion="208599"
2023-02-08T12:31:09.195Z	info	VictoriaMetrics/lib/promscrape/discovery/kubernetes/api_watcher.go:509	started endpoints watcher for "https://10.96.0.1:443/api/v1/namespaces/default/endpoints"
2023-02-08T12:31:09.197Z	info	VictoriaMetrics/lib/promscrape/discovery/kubernetes/api_watcher.go:641	reloaded 1 objects from "https://10.96.0.1:443/api/v1/namespaces/default/endpoints" in 0.003s; updated=0, removed=0, added=1, resourceVersion="208599"
2023-02-08T12:31:09.198Z	info	VictoriaMetrics/lib/promscrape/discovery/kubernetes/api_watcher.go:509	started pod watcher for "https://10.96.0.1:443/api/v1/namespaces/kube-system/pods"
2023-02-08T12:31:09.219Z	info	VictoriaMetrics/lib/promscrape/discovery/kubernetes/api_watcher.go:641	reloaded 13 objects from "https://10.96.0.1:443/api/v1/namespaces/kube-system/pods" in 0.021s; updated=0, removed=0, added=13, resourceVersion="208600"
2023-02-08T12:31:09.219Z	info	VictoriaMetrics/lib/promscrape/discovery/kubernetes/api_watcher.go:509	started service watcher for "https://10.96.0.1:443/api/v1/namespaces/kube-system/services"
2023-02-08T12:31:09.226Z	info	VictoriaMetrics/lib/promscrape/discovery/kubernetes/api_watcher.go:641	reloaded 10 objects from "https://10.96.0.1:443/api/v1/namespaces/kube-system/services" in 0.007s; updated=0, removed=0, added=10, resourceVersion="208600"
2023-02-08T12:31:09.226Z	info	VictoriaMetrics/lib/promscrape/discovery/kubernetes/api_watcher.go:509	started endpoints watcher for "https://10.96.0.1:443/api/v1/namespaces/kube-system/endpoints"
2023-02-08T12:31:09.294Z	info	VictoriaMetrics/lib/promscrape/discovery/kubernetes/api_watcher.go:641	reloaded 11 objects from "https://10.96.0.1:443/api/v1/namespaces/kube-system/endpoints" in 0.068s; updated=0, removed=0, added=11, resourceVersion="208600"
2023-02-08T12:31:09.294Z	info	VictoriaMetrics/lib/promscrape/discovery/kubernetes/api_watcher.go:509	started pod watcher for "https://10.96.0.1:443/api/v1/namespaces/cloudplane-system/pods"
2023-02-08T12:31:09.320Z	info	VictoriaMetrics/lib/promscrape/discovery/kubernetes/api_watcher.go:641	reloaded 31 objects from "https://10.96.0.1:443/api/v1/namespaces/cloudplane-system/pods" in 0.026s; updated=0, removed=0, added=31, resourceVersion="208600"
2023-02-08T12:31:09.320Z	info	VictoriaMetrics/lib/promscrape/discovery/kubernetes/api_watcher.go:509	started service watcher for "https://10.96.0.1:443/api/v1/namespaces/cloudplane-system/services"
2023-02-08T12:31:09.406Z	info	VictoriaMetrics/lib/promscrape/discovery/kubernetes/api_watcher.go:641	reloaded 24 objects from "https://10.96.0.1:443/api/v1/namespaces/cloudplane-system/services" in 0.085s; updated=0, removed=0, added=24, resourceVersion="208600"
2023-02-08T12:31:09.406Z	info	VictoriaMetrics/lib/promscrape/discovery/kubernetes/api_watcher.go:509	started endpoints watcher for "https://10.96.0.1:443/api/v1/namespaces/cloudplane-system/endpoints"
2023-02-08T12:31:09.421Z	info	VictoriaMetrics/lib/promscrape/discovery/kubernetes/api_watcher.go:641	reloaded 24 objects from "https://10.96.0.1:443/api/v1/namespaces/cloudplane-system/endpoints" in 0.015s; updated=0, removed=0, added=24, resourceVersion="208600"
2023-02-08T12:31:09.422Z	info	VictoriaMetrics/lib/promscrape/discovery/kubernetes/api_watcher.go:509	started node watcher for "https://10.96.0.1:443/api/v1/nodes"
2023-02-08T12:31:09.427Z	info	VictoriaMetrics/lib/promscrape/discovery/kubernetes/api_watcher.go:641	reloaded 1 objects from "https://10.96.0.1:443/api/v1/nodes" in 0.005s; updated=0, removed=0, added=1, resourceVersion="208600"
2023-02-08T12:31:09.492Z	info	VictoriaMetrics/lib/promscrape/config.go:126	started service discovery routines in 0.485 seconds
2023-02-08T12:31:09.607Z	info	VictoriaMetrics/lib/promscrape/scraper.go:421	kubernetes_sd_configs: added targets: 19, removed targets: 0; total targets: 19
2023-02-08T12:31:13.011Z	warn	VictoriaMetrics/lib/promscrape/scrapework.go:376	cannot scrape target "https://192.168.49.2:10251/metrics" ({endpoint="http-metrics",instance="192.168.49.2:10251",job="kube-scheduler",namespace="kube-system",pod="kube-scheduler-minikube",service="vm-victoria-metrics-k8s-stack-kube-scheduler"}) 1 out of 1 times during -promscrape.suppressScrapeErrorsDelay=0s; the last error: cannot read data: cannot scrape "https://192.168.49.2:10251/metrics": Get "https://192.168.49.2:10251/metrics": dial tcp4 192.168.49.2:10251: connect: connection refused; try -enableTCP6 command-line flag if you scrape ipv6 addresses
2023-02-08T12:31:13.603Z	info	VictoriaMetrics/lib/promscrape/scraper.go:150	SIGHUP received; reloading Prometheus configs from "/etc/vmagent/config_out/vmagent.env.yaml"
2023-02-08T12:31:13.622Z	info	VictoriaMetrics/app/vmagent/remotewrite/remotewrite.go:172	SIGHUP received; reloading relabel configs pointed by -remoteWrite.relabelConfig and -remoteWrite.urlRelabelConfig
2023-02-08T12:31:13.622Z	info	VictoriaMetrics/app/vmagent/remotewrite/remotewrite.go:184	Successfully reloaded relabel configs
2023-02-08T12:31:13.800Z	info	VictoriaMetrics/lib/promscrape/scraper.go:159	nothing changed in "/etc/vmagent/config_out/vmagent.env.yaml"
2023-02-08T12:31:26.373Z	warn	VictoriaMetrics/lib/promscrape/scrapework.go:376	cannot scrape target "https://192.168.49.2:10257/metrics" ({endpoint="http-metrics",instance="192.168.49.2:10257",job="kube-controller-manager",namespace="kube-system",pod="kube-controller-manager-minikube",service="vm-victoria-metrics-k8s-stack-kube-controller-manager"}) 1 out of 1 times during -promscrape.suppressScrapeErrorsDelay=0s; the last error: cannot read data: cannot scrape "https://192.168.49.2:10257/metrics": Get "https://192.168.49.2:10257/metrics": dial tcp4 192.168.49.2:10257: connect: connection refused; try -enableTCP6 command-line flag if you scrape ipv6 addresses
2023-02-08T12:31:29.703Z	warn	VictoriaMetrics/lib/promscrape/scrapework.go:376	cannot scrape target "https://192.168.49.2:2379/metrics" ({endpoint="http-metrics",instance="192.168.49.2:2379",job="kube-etcd",namespace="kube-system",pod="etcd-minikube",service="vm-victoria-metrics-k8s-stack-kube-etcd"}) 1 out of 1 times during -promscrape.suppressScrapeErrorsDelay=0s; the last error: cannot read data: cannot scrape "https://192.168.49.2:2379/metrics": Get "https://192.168.49.2:2379/metrics": x509: certificate signed by unknown authority
2023-02-08T12:31:38.012Z	warn	VictoriaMetrics/lib/promscrape/scrapework.go:376	cannot scrape target "https://192.168.49.2:10251/metrics" ({endpoint="http-metrics",instance="192.168.49.2:10251",job="kube-scheduler",namespace="kube-system",pod="kube-scheduler-minikube",service="vm-victoria-metrics-k8s-stack-kube-scheduler"}) 1 out of 1 times during -promscrape.suppressScrapeErrorsDelay=0s; the last error: cannot read data: cannot scrape "https://192.168.49.2:10251/metrics": Get "https://192.168.49.2:10251/metrics": dial tcp4 192.168.49.2:10251: connect: connection refused; try -enableTCP6 command-line flag if you scrape ipv6 addresses
2023-02-08T12:31:39.493Z	info	VictoriaMetrics/lib/promscrape/scraper.go:421	kubernetes_sd_configs: added targets: 1, removed targets: 0; total targets: 20
2023-02-08T12:31:51.373Z	warn	VictoriaMetrics/lib/promscrape/scrapework.go:376	cannot scrape target "https://192.168.49.2:10257/metrics" ({endpoint="http-metrics",instance="192.168.49.2:10257",job="kube-controller-manager",namespace="kube-system",pod="kube-controller-manager-minikube",service="vm-victoria-metrics-k8s-stack-kube-controller-manager"}) 1 out of 1 times during -promscrape.suppressScrapeErrorsDelay=0s; the last error: cannot read data: cannot scrape "https://192.168.49.2:10257/metrics": Get "https://192.168.49.2:10257/metrics": dial tcp4 192.168.49.2:10257: connect: connection refused; try -enableTCP6 command-line flag if you scrape ipv6 addresses
2023-02-08T12:31:54.608Z	warn	VictoriaMetrics/lib/promscrape/scrapework.go:376	cannot scrape target "https://192.168.49.2:2379/metrics" ({endpoint="http-metrics",instance="192.168.49.2:2379",job="kube-etcd",namespace="kube-system",pod="etcd-minikube",service="vm-victoria-metrics-k8s-stack-kube-etcd"}) 1 out of 1 times during -promscrape.suppressScrapeErrorsDelay=0s; the last error: cannot read data: cannot scrape "https://192.168.49.2:2379/metrics": Get "https://192.168.49.2:2379/metrics": x509: certificate signed by unknown authority
2023-02-08T12:32:03.012Z	warn	VictoriaMetrics/lib/promscrape/scrapework.go:376	cannot scrape target "https://192.168.49.2:10251/metrics" ({endpoint="http-metrics",instance="192.168.49.2:10251",job="kube-scheduler",namespace="kube-system",pod="kube-scheduler-minikube",service="vm-victoria-metrics-k8s-stack-kube-scheduler"}) 1 out of 1 times during -promscrape.suppressScrapeErrorsDelay=0s; the last error: cannot read data: cannot scrape "https://192.168.49.2:10251/metrics": Get "https://192.168.49.2:10251/metrics": dial tcp4 192.168.49.2:10251: connect: connection refused; try -enableTCP6 command-line flag if you scrape ipv6 addresses
2023-02-08T12:32:16.373Z	warn	VictoriaMetrics/lib/promscrape/scrapework.go:376	cannot scrape target "https://192.168.49.2:10257/metrics" ({endpoint="http-metrics",instance="192.168.49.2:10257",job="kube-controller-manager",namespace="kube-system",pod="kube-controller-manager-minikube",service="vm-victoria-metrics-k8s-stack-kube-controller-manager"}) 1 out of 1 times during -promscrape.suppressScrapeErrorsDelay=0s; the last error: cannot read data: cannot scrape "https://192.168.49.2:10257/metrics": Get "https://192.168.49.2:10257/metrics": dial tcp4 192.168.49.2:10257: connect: connection refused; try -enableTCP6 command-line flag if you scrape ipv6 addresses
2023-02-08T12:32:19.607Z	warn	VictoriaMetrics/lib/promscrape/scrapework.go:376	cannot scrape target "https://192.168.49.2:2379/metrics" ({endpoint="http-metrics",instance="192.168.49.2:2379",job="kube-etcd",namespace="kube-system",pod="etcd-minikube",service="vm-victoria-metrics-k8s-stack-kube-etcd"}) 1 out of 1 times during -promscrape.suppressScrapeErrorsDelay=0s; the last error: cannot read data: cannot scrape "https://192.168.49.2:2379/metrics": Get "https://192.168.49.2:2379/metrics": x509: certificate signed by unknown authority
2023-02-08T12:32:28.013Z	warn	VictoriaMetrics/lib/promscrape/scrapework.go:376	cannot scrape target "https://192.168.49.2:10251/metrics" ({endpoint="http-metrics",instance="192.168.49.2:10251",job="kube-scheduler",namespace="kube-system",pod="kube-scheduler-minikube",service="vm-victoria-metrics-k8s-stack-kube-scheduler"}) 1 out of 1 times during -promscrape.suppressScrapeErrorsDelay=0s; the last error: cannot read data: cannot scrape "https://192.168.49.2:10251/metrics": Get "https://192.168.49.2:10251/metrics": dial tcp4 192.168.49.2:10251: connect: connection refused; try -enableTCP6 command-line flag if you scrape ipv6 addresses

192.168.49.2 is the node IP, so I believe this is where the issue is.
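
To make the pattern in the log above easier to see, the warnings can be grouped by job and failure reason. This is only a minimal sketch: `classify` and `summarize` are hypothetical helper names, and the sample lines are abbreviated (with `...`) from the vmagent output above.

```python
import re
from collections import Counter

TARGET_RE = re.compile(r'cannot scrape target "([^"]+)"')
JOB_RE = re.compile(r'job="([^"]+)"')


def classify(line):
    """Return (job, target, reason) for a scrape warning, or None for other lines."""
    target = TARGET_RE.search(line)
    if target is None:
        return None
    job = JOB_RE.search(line)
    if "connection refused" in line:
        reason = "connection refused"
    elif "x509:" in line:
        reason = "x509 certificate error"
    else:
        reason = "other"
    return (job.group(1) if job else "unknown", target.group(1), reason)


def summarize(lines):
    """Count scrape warnings per (job, target, reason)."""
    counts = Counter()
    for line in lines:
        result = classify(line)
        if result is not None:
            counts[result] += 1
    return counts


# Abbreviated excerpts from the log above; "..." marks elided text.
SAMPLE = [
    '... warn ... cannot scrape target "https://192.168.49.2:10251/metrics" '
    '({endpoint="http-metrics",instance="192.168.49.2:10251",job="kube-scheduler",...}) '
    '... dial tcp4 192.168.49.2:10251: connect: connection refused; ...',
    '... warn ... cannot scrape target "https://192.168.49.2:2379/metrics" '
    '({endpoint="http-metrics",instance="192.168.49.2:2379",job="kube-etcd",...}) '
    '... x509: certificate signed by unknown authority',
    '... info ... started service discovery routines in 0.485 seconds',
]

if __name__ == "__main__":
    for (job, target, reason), n in sorted(summarize(SAMPLE).items()):
        print(f"{job}\t{target}\t{reason}\t{n}")
```

Piping the full vmagent log through `summarize` would show at a glance that every failing target sits on the node IP.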

But kube-controller-manager and scheduler are running:

kube-controller-manager
apiVersion: v1
kind: Pod
metadata:
  annotations:
    kubernetes.io/config.hash: 5175bba984ed52052d891b5a45b584b6
    kubernetes.io/config.mirror: 5175bba984ed52052d891b5a45b584b6
    kubernetes.io/config.seen: "2023-02-08T13:14:29.098673936Z"
    kubernetes.io/config.source: file
  creationTimestamp: "2023-02-08T13:14:38Z"
  labels:
    component: kube-controller-manager
    tier: control-plane
  name: kube-controller-manager-minikube
  namespace: kube-system
  ownerReferences:
  - apiVersion: v1
    controller: true
    kind: Node
    name: minikube
    uid: dfc14ddb-7de1-422c-b322-fd1c6d6a20b2
  resourceVersion: "304"
  uid: b72b3ae4-1827-41be-8441-cfcf43499e48
spec:
  containers:
  - command:
    - kube-controller-manager
    - --allocate-node-cidrs=true
    - --authentication-kubeconfig=/etc/kubernetes/controller-manager.conf
    - --authorization-kubeconfig=/etc/kubernetes/controller-manager.conf
    - --bind-address=127.0.0.1
    - --client-ca-file=/var/lib/minikube/certs/ca.crt
    - --cluster-cidr=10.244.0.0/16
    - --cluster-name=mk
    - --cluster-signing-cert-file=/var/lib/minikube/certs/ca.crt
    - --cluster-signing-key-file=/var/lib/minikube/certs/ca.key
    - --controllers=*,bootstrapsigner,tokencleaner
    - --kubeconfig=/etc/kubernetes/controller-manager.conf
    - --leader-elect=false
    - --requestheader-client-ca-file=/var/lib/minikube/certs/front-proxy-ca.crt
    - --root-ca-file=/var/lib/minikube/certs/ca.crt
    - --service-account-private-key-file=/var/lib/minikube/certs/sa.key
    - --service-cluster-ip-range=10.96.0.0/12
    - --use-service-account-credentials=true
    image: registry.k8s.io/kube-controller-manager:v1.26.1
    imagePullPolicy: IfNotPresent
    livenessProbe:
      failureThreshold: 8
      httpGet:
        host: 127.0.0.1
        path: /healthz
        port: 10257
        scheme: HTTPS
      initialDelaySeconds: 10
      periodSeconds: 10
      successThreshold: 1
      timeoutSeconds: 15
    name: kube-controller-manager
    resources:
      requests:
        cpu: 200m
    startupProbe:
      failureThreshold: 24
      httpGet:
        host: 127.0.0.1
        path: /healthz
        port: 10257
        scheme: HTTPS
      initialDelaySeconds: 10
      periodSeconds: 10
      successThreshold: 1
      timeoutSeconds: 15
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
    volumeMounts:
    - mountPath: /etc/ssl/certs
      name: ca-certs
      readOnly: true
    - mountPath: /etc/ca-certificates
      name: etc-ca-certificates
      readOnly: true
    - mountPath: /usr/libexec/kubernetes/kubelet-plugins/volume/exec
      name: flexvolume-dir
    - mountPath: /var/lib/minikube/certs
      name: k8s-certs
      readOnly: true
    - mountPath: /etc/kubernetes/controller-manager.conf
      name: kubeconfig
      readOnly: true
    - mountPath: /usr/local/share/ca-certificates
      name: usr-local-share-ca-certificates
      readOnly: true
    - mountPath: /usr/share/ca-certificates
      name: usr-share-ca-certificates
      readOnly: true
  dnsPolicy: ClusterFirst
  enableServiceLinks: true
  hostNetwork: true
  nodeName: minikube
  preemptionPolicy: PreemptLowerPriority
  priority: 2000001000
  priorityClassName: system-node-critical
  restartPolicy: Always
  schedulerName: default-scheduler
  securityContext:
    seccompProfile:
      type: RuntimeDefault
  terminationGracePeriodSeconds: 30
  tolerations:
  - effect: NoExecute
    operator: Exists
  volumes:
  - hostPath:
      path: /etc/ssl/certs
      type: DirectoryOrCreate
    name: ca-certs
  - hostPath:
      path: /etc/ca-certificates
      type: DirectoryOrCreate
    name: etc-ca-certificates
  - hostPath:
      path: /usr/libexec/kubernetes/kubelet-plugins/volume/exec
      type: DirectoryOrCreate
    name: flexvolume-dir
  - hostPath:
      path: /var/lib/minikube/certs
      type: DirectoryOrCreate
    name: k8s-certs
  - hostPath:
      path: /etc/kubernetes/controller-manager.conf
      type: FileOrCreate
    name: kubeconfig
  - hostPath:
      path: /usr/local/share/ca-certificates
      type: DirectoryOrCreate
    name: usr-local-share-ca-certificates
  - hostPath:
      path: /usr/share/ca-certificates
      type: DirectoryOrCreate
    name: usr-share-ca-certificates
status:
  conditions:
  - lastProbeTime: null
    lastTransitionTime: "2023-02-08T13:14:41Z"
    status: "True"
    type: Initialized
  - lastProbeTime: null
    lastTransitionTime: "2023-02-08T13:14:46Z"
    status: "True"
    type: Ready
  - lastProbeTime: null
    lastTransitionTime: "2023-02-08T13:14:46Z"
    status: "True"
    type: ContainersReady
  - lastProbeTime: null
    lastTransitionTime: "2023-02-08T13:14:41Z"
    status: "True"
    type: PodScheduled
  containerStatuses:
  - containerID: docker://28fc56170b1afaf5e0e02782f93d0bfc5156131027d8c66b346b9ed25c352260
    image: registry.k8s.io/kube-controller-manager:v1.26.1
    imageID: docker://sha256:e9c08e11b07f68c1805c49e4ce66e5a9e6b2d59f6f65041c113b635095a7ad8d
    lastState: {}
    name: kube-controller-manager
    ready: true
    restartCount: 0
    started: true
    state:
      running:
        startedAt: "2023-02-08T13:14:33Z"
  hostIP: 192.168.49.2
  phase: Running
  podIP: 192.168.49.2
  podIPs:
  - ip: 192.168.49.2
  qosClass: Burstable
  startTime: "2023-02-08T13:14:41Z"

kube-scheduler
apiVersion: v1
kind: Pod
metadata:
  annotations:
    kubernetes.io/config.hash: 197cd0de602d7cb722d0bd2daf878121
    kubernetes.io/config.mirror: 197cd0de602d7cb722d0bd2daf878121
    kubernetes.io/config.seen: "2023-02-08T13:14:40.216056776Z"
    kubernetes.io/config.source: file
  creationTimestamp: "2023-02-08T13:14:40Z"
  labels:
    component: kube-scheduler
    tier: control-plane
  name: kube-scheduler-minikube
  namespace: kube-system
  ownerReferences:
  - apiVersion: v1
    controller: true
    kind: Node
    name: minikube
    uid: dfc14ddb-7de1-422c-b322-fd1c6d6a20b2
  resourceVersion: "330"
  uid: 96d983ae-2595-43cd-97d3-b76355fbb412
spec:
  containers:
  - command:
    - kube-scheduler
    - --authentication-kubeconfig=/etc/kubernetes/scheduler.conf
    - --authorization-kubeconfig=/etc/kubernetes/scheduler.conf
    - --bind-address=127.0.0.1
    - --kubeconfig=/etc/kubernetes/scheduler.conf
    - --leader-elect=false
    image: registry.k8s.io/kube-scheduler:v1.26.1
    imagePullPolicy: IfNotPresent
    livenessProbe:
      failureThreshold: 8
      httpGet:
        host: 127.0.0.1
        path: /healthz
        port: 10259
        scheme: HTTPS
      initialDelaySeconds: 10
      periodSeconds: 10
      successThreshold: 1
      timeoutSeconds: 15
    name: kube-scheduler
    resources:
      requests:
        cpu: 100m
    startupProbe:
      failureThreshold: 24
      httpGet:
        host: 127.0.0.1
        path: /healthz
        port: 10259
        scheme: HTTPS
      initialDelaySeconds: 10
      periodSeconds: 10
      successThreshold: 1
      timeoutSeconds: 15
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
    volumeMounts:
    - mountPath: /etc/kubernetes/scheduler.conf
      name: kubeconfig
      readOnly: true
  dnsPolicy: ClusterFirst
  enableServiceLinks: true
  hostNetwork: true
  nodeName: minikube
  preemptionPolicy: PreemptLowerPriority
  priority: 2000001000
  priorityClassName: system-node-critical
  restartPolicy: Always
  schedulerName: default-scheduler
  securityContext:
    seccompProfile:
      type: RuntimeDefault
  terminationGracePeriodSeconds: 30
  tolerations:
  - effect: NoExecute
    operator: Exists
  volumes:
  - hostPath:
      path: /etc/kubernetes/scheduler.conf
      type: FileOrCreate
    name: kubeconfig
status:
  conditions:
  - lastProbeTime: null
    lastTransitionTime: "2023-02-08T13:14:40Z"
    status: "True"
    type: Initialized
  - lastProbeTime: null
    lastTransitionTime: "2023-02-08T13:14:51Z"
    status: "True"
    type: Ready
  - lastProbeTime: null
    lastTransitionTime: "2023-02-08T13:14:51Z"
    status: "True"
    type: ContainersReady
  - lastProbeTime: null
    lastTransitionTime: "2023-02-08T13:14:40Z"
    status: "True"
    type: PodScheduled
  containerStatuses:
  - containerID: docker://8ebcf1f68bd9b326171a58a7e3cf1dc142729ff9505b5fe1223dce15549c80f8
    image: registry.k8s.io/kube-scheduler:v1.26.1
    imageID: docker://sha256:655493523f6076092624c06fd5facf9541a9b3d54e6f3bf5a6e078ee7b1ba44f
    lastState: {}
    name: kube-scheduler
    ready: true
    restartCount: 0
    started: true
    state:
      running:
        startedAt: "2023-02-08T13:14:33Z"
  hostIP: 192.168.49.2
  phase: Running
  podIP: 192.168.49.2
  podIPs:
  - ip: 192.168.49.2
  qosClass: Burstable
  startTime: "2023-02-08T13:14:40Z"

It is possible to ping the node from within the cluster:

# ping 192.168.49.2
PING 192.168.49.2 (192.168.49.2) 56(84) bytes of data.
64 bytes from 192.168.49.2: icmp_seq=1 ttl=64 time=0.102 ms

But the port can't be reached:

# curl 192.168.49.2:10257
curl: (7) Failed to connect to 192.168.49.2 port 10257: Connection refused
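
The curl check above can also be scripted, which helps when testing several ports at once. The following is a minimal sketch, assuming Python 3 is available wherever the check runs; `port_open` is a hypothetical helper name, not part of any tool mentioned here.

```python
import socket


def port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds, False otherwise."""
    try:
        # create_connection raises OSError (e.g. ConnectionRefusedError or a
        # timeout) when nothing accepts the connection.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Calling `port_open("192.168.49.2", 10257)` from inside a pod attempts the same TCP connect that curl does. It only tests reachability, not TLS or authentication.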

Any idea what the issue might be here? I can get metrics with prometheus installed via Lens, but perhaps it uses a different way to scrape the node...

dmitryk-dk (Contributor) commented Feb 8, 2023

Hi @jgillich! Can you change - --bind-address=127.0.0.1 to - --bind-address=0.0.0.0?
Also, can you change the ports for the scheduler and controller manager to:

  ## If using kubeScheduler.endpoints, only the port and targetPort are used
  ##
  service:
    enabled: true
    port: 10251
    targetPort: 10251
    # selector:
    #   component: kube-scheduler
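
To see which control-plane pods are affected by the bind address, the flag can be read from the pod spec. This is a minimal sketch, assuming the pod manifest has already been loaded into a dict (for example from the JSON form of the manifests posted above); `bind_address` and `listens_beyond_loopback` are hypothetical helper names.

```python
import ipaddress


def bind_address(pod):
    """Return the --bind-address value from the pod's containers, or None if unset."""
    for container in pod.get("spec", {}).get("containers", []):
        argv = (container.get("command") or []) + (container.get("args") or [])
        for arg in argv:
            if arg.startswith("--bind-address="):
                return arg.split("=", 1)[1]
    return None


def listens_beyond_loopback(pod):
    """False if the component binds to a loopback address, True otherwise.

    Returns None when the flag is absent, since the component's default applies.
    """
    addr = bind_address(pod)
    if addr is None:
        return None
    return not ipaddress.ip_address(addr).is_loopback


# Trimmed from the kube-scheduler manifest posted above.
SCHEDULER_POD = {
    "spec": {
        "hostNetwork": True,
        "containers": [
            {
                "name": "kube-scheduler",
                "command": [
                    "kube-scheduler",
                    "--bind-address=127.0.0.1",
                    "--leader-elect=false",
                ],
            }
        ],
    }
}

if __name__ == "__main__":
    print(bind_address(SCHEDULER_POD), listens_beyond_loopback(SCHEDULER_POD))
```

With a loopback bind address, connections to the node IP are refused even though the pod uses `hostNetwork: true`, which matches the curl result reported earlier.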

jgillich (Author) commented Feb 8, 2023

Aha! That must be the reason for these errors. But it looks like there is no way to change the bind-address (kubernetes/kubeadm#2388). And, correct me if I'm wrong, but they don't seem to be required for container metrics anyway. So I've disabled kubeScheduler, kubeControllerManager and kubeEtcd to get rid of the errors.

Furthermore, I've realized that minikube doesn't install metrics-server by default, so I've added it. That's where container metrics are scraped from, correct?

vmagent log is now free of errors:

2023-02-08T16:10:54.667Z	info	VictoriaMetrics/lib/logger/flag.go:12	build version: vmagent-20230201-221241-tags-v1.87.0-0-gfe736c538
2023-02-08T16:10:54.667Z	info	VictoriaMetrics/lib/logger/flag.go:13	command-line flags
2023-02-08T16:10:54.667Z	info	VictoriaMetrics/lib/logger/flag.go:20	  -httpListenAddr=":8429"
2023-02-08T16:10:54.667Z	info	VictoriaMetrics/lib/logger/flag.go:20	  -promscrape.config="/etc/vmagent/config_out/vmagent.env.yaml"
2023-02-08T16:10:54.667Z	info	VictoriaMetrics/lib/logger/flag.go:20	  -promscrape.streamParse="true"
2023-02-08T16:10:54.667Z	info	VictoriaMetrics/lib/logger/flag.go:20	  -remoteWrite.maxDiskUsagePerURL="1073741824"
2023-02-08T16:10:54.667Z	info	VictoriaMetrics/lib/logger/flag.go:20	  -remoteWrite.tmpDataPath="/tmp/vmagent-remotewrite-data"
2023-02-08T16:10:54.667Z	info	VictoriaMetrics/lib/logger/flag.go:20	  -remoteWrite.url="secret"
2023-02-08T16:10:54.667Z	info	VictoriaMetrics/app/vmagent/main.go:115	starting vmagent at ":8429"...
2023-02-08T16:10:54.668Z	info	VictoriaMetrics/lib/memory/memory.go:42	limiting caches to 314572800 bytes, leaving 209715200 bytes to the OS according to -memory.allowedPercent=60
2023-02-08T16:10:54.695Z	info	VictoriaMetrics/lib/persistentqueue/fastqueue.go:59	opened fast persistent queue at "/tmp/vmagent-remotewrite-data/persistent-queue/1_AB5EF4175B7B674D" with maxInmemoryBlocks=200, it contains 0 pending bytes
2023-02-08T16:10:54.695Z	info	VictoriaMetrics/app/vmagent/remotewrite/client.go:169	initialized client for -remoteWrite.url="1:secret-url"
2023-02-08T16:10:54.696Z	info	VictoriaMetrics/app/vmagent/main.go:141	started vmagent in 0.028 seconds
2023-02-08T16:10:54.696Z	info	VictoriaMetrics/lib/promscrape/scraper.go:109	reading Prometheus configs from "/etc/vmagent/config_out/vmagent.env.yaml"
2023-02-08T16:10:54.706Z	info	VictoriaMetrics/lib/httpserver/httpserver.go:99	starting http server at http://127.0.0.1:8429/
2023-02-08T16:10:54.706Z	info	VictoriaMetrics/lib/httpserver/httpserver.go:100	pprof handlers are exposed at http://127.0.0.1:8429/debug/pprof/
2023-02-08T16:10:54.794Z	info	VictoriaMetrics/lib/promscrape/config.go:120	starting service discovery routines...
2023-02-08T16:10:54.795Z	info	VictoriaMetrics/lib/promscrape/discovery/kubernetes/api_watcher.go:509	started pod watcher for "https://10.96.0.1:443/api/v1/pods"
2023-02-08T16:10:54.895Z	info	VictoriaMetrics/lib/promscrape/discovery/kubernetes/api_watcher.go:641	reloaded 59 objects from "https://10.96.0.1:443/api/v1/pods" in 0.100s; updated=0, removed=0, added=59, resourceVersion="74712"
2023-02-08T16:10:54.895Z	info	VictoriaMetrics/lib/promscrape/discovery/kubernetes/api_watcher.go:509	started service watcher for "https://10.96.0.1:443/api/v1/services"
2023-02-08T16:10:54.905Z	info	VictoriaMetrics/lib/promscrape/discovery/kubernetes/api_watcher.go:641	reloaded 49 objects from "https://10.96.0.1:443/api/v1/services" in 0.010s; updated=0, removed=0, added=49, resourceVersion="74713"
2023-02-08T16:10:54.905Z	info	VictoriaMetrics/lib/promscrape/discovery/kubernetes/api_watcher.go:509	started endpoints watcher for "https://10.96.0.1:443/api/v1/endpoints"
2023-02-08T16:10:54.915Z	info	VictoriaMetrics/lib/promscrape/discovery/kubernetes/api_watcher.go:641	reloaded 50 objects from "https://10.96.0.1:443/api/v1/endpoints" in 0.010s; updated=0, removed=0, added=50, resourceVersion="74713"
2023-02-08T16:10:54.915Z	info	VictoriaMetrics/lib/promscrape/discovery/kubernetes/api_watcher.go:509	started pod watcher for "https://10.96.0.1:443/api/v1/namespaces/default/pods"
2023-02-08T16:10:54.994Z	info	VictoriaMetrics/lib/promscrape/discovery/kubernetes/api_watcher.go:641	reloaded 0 objects from "https://10.96.0.1:443/api/v1/namespaces/default/pods" in 0.079s; updated=0, removed=0, added=0, resourceVersion="74713"
2023-02-08T16:10:54.994Z	info	VictoriaMetrics/lib/promscrape/discovery/kubernetes/api_watcher.go:509	started service watcher for "https://10.96.0.1:443/api/v1/namespaces/default/services"
2023-02-08T16:10:54.998Z	info	VictoriaMetrics/lib/promscrape/discovery/kubernetes/api_watcher.go:641	reloaded 1 objects from "https://10.96.0.1:443/api/v1/namespaces/default/services" in 0.003s; updated=0, removed=0, added=1, resourceVersion="74713"
2023-02-08T16:10:54.998Z	info	VictoriaMetrics/lib/promscrape/discovery/kubernetes/api_watcher.go:509	started endpoints watcher for "https://10.96.0.1:443/api/v1/namespaces/default/endpoints"
2023-02-08T16:10:55.001Z	info	VictoriaMetrics/lib/promscrape/discovery/kubernetes/api_watcher.go:641	reloaded 1 objects from "https://10.96.0.1:443/api/v1/namespaces/default/endpoints" in 0.003s; updated=0, removed=0, added=1, resourceVersion="74713"
2023-02-08T16:10:55.001Z	info	VictoriaMetrics/lib/promscrape/discovery/kubernetes/api_watcher.go:509	started pod watcher for "https://10.96.0.1:443/api/v1/namespaces/kube-system/pods"
2023-02-08T16:10:55.015Z	info	VictoriaMetrics/lib/promscrape/discovery/kubernetes/api_watcher.go:641	reloaded 14 objects from "https://10.96.0.1:443/api/v1/namespaces/kube-system/pods" in 0.014s; updated=0, removed=0, added=14, resourceVersion="74714"
2023-02-08T16:10:55.015Z	info	VictoriaMetrics/lib/promscrape/discovery/kubernetes/api_watcher.go:509	started service watcher for "https://10.96.0.1:443/api/v1/namespaces/kube-system/services"
2023-02-08T16:10:55.021Z	info	VictoriaMetrics/lib/promscrape/discovery/kubernetes/api_watcher.go:641	reloaded 8 objects from "https://10.96.0.1:443/api/v1/namespaces/kube-system/services" in 0.005s; updated=0, removed=0, added=8, resourceVersion="74714"
2023-02-08T16:10:55.021Z	info	VictoriaMetrics/lib/promscrape/discovery/kubernetes/api_watcher.go:509	started endpoints watcher for "https://10.96.0.1:443/api/v1/namespaces/kube-system/endpoints"
2023-02-08T16:10:55.026Z	info	VictoriaMetrics/lib/promscrape/discovery/kubernetes/api_watcher.go:641	reloaded 9 objects from "https://10.96.0.1:443/api/v1/namespaces/kube-system/endpoints" in 0.005s; updated=0, removed=0, added=9, resourceVersion="74714"
2023-02-08T16:10:55.026Z	info	VictoriaMetrics/lib/promscrape/discovery/kubernetes/api_watcher.go:509	started pod watcher for "https://10.96.0.1:443/api/v1/namespaces/cloudplane-system/pods"
2023-02-08T16:10:55.110Z	info	VictoriaMetrics/lib/promscrape/discovery/kubernetes/api_watcher.go:641	reloaded 32 objects from "https://10.96.0.1:443/api/v1/namespaces/cloudplane-system/pods" in 0.084s; updated=0, removed=0, added=32, resourceVersion="74714"
2023-02-08T16:10:55.110Z	info	VictoriaMetrics/lib/promscrape/discovery/kubernetes/api_watcher.go:509	started service watcher for "https://10.96.0.1:443/api/v1/namespaces/cloudplane-system/services"
2023-02-08T16:10:55.117Z	info	VictoriaMetrics/lib/promscrape/discovery/kubernetes/api_watcher.go:641	reloaded 25 objects from "https://10.96.0.1:443/api/v1/namespaces/cloudplane-system/services" in 0.007s; updated=0, removed=0, added=25, resourceVersion="74714"
2023-02-08T16:10:55.117Z	info	VictoriaMetrics/lib/promscrape/discovery/kubernetes/api_watcher.go:509	started endpoints watcher for "https://10.96.0.1:443/api/v1/namespaces/cloudplane-system/endpoints"
2023-02-08T16:10:55.125Z	info	VictoriaMetrics/lib/promscrape/discovery/kubernetes/api_watcher.go:641	reloaded 25 objects from "https://10.96.0.1:443/api/v1/namespaces/cloudplane-system/endpoints" in 0.008s; updated=0, removed=0, added=25, resourceVersion="74714"
2023-02-08T16:10:55.126Z	info	VictoriaMetrics/lib/promscrape/discovery/kubernetes/api_watcher.go:509	started node watcher for "https://10.96.0.1:443/api/v1/nodes"
2023-02-08T16:10:55.131Z	info	VictoriaMetrics/lib/promscrape/discovery/kubernetes/api_watcher.go:641	reloaded 1 objects from "https://10.96.0.1:443/api/v1/nodes" in 0.005s; updated=0, removed=0, added=1, resourceVersion="74714"
2023-02-08T16:10:55.192Z	info	VictoriaMetrics/lib/promscrape/config.go:126	started service discovery routines in 0.398 seconds
2023-02-08T16:10:55.400Z	info	VictoriaMetrics/lib/promscrape/scraper.go:421	kubernetes_sd_configs: added targets: 16, removed targets: 0; total targets: 16
2023-02-08T16:10:59.404Z	info	VictoriaMetrics/app/vmagent/remotewrite/remotewrite.go:172	SIGHUP received; reloading relabel configs pointed by -remoteWrite.relabelConfig and -remoteWrite.urlRelabelConfig
2023-02-08T16:10:59.404Z	info	VictoriaMetrics/app/vmagent/remotewrite/remotewrite.go:184	Successfully reloaded relabel configs
2023-02-08T16:10:59.404Z	info	VictoriaMetrics/lib/promscrape/scraper.go:150	SIGHUP received; reloading Prometheus configs from "/etc/vmagent/config_out/vmagent.env.yaml"
2023-02-08T16:10:59.508Z	info	VictoriaMetrics/lib/promscrape/scraper.go:159	nothing changed in "/etc/vmagent/config_out/vmagent.env.yaml"
2023-02-08T16:11:25.198Z	info	VictoriaMetrics/lib/promscrape/scraper.go:421	kubernetes_sd_configs: added targets: 1, removed targets: 0; total targets: 17

But I still get no container metrics at all :(. There are also no alerts that indicate a problem. Is there any way I can check that kubelet scraping is working?

zekker6 (Contributor) commented Feb 9, 2023

@jgillich Those metrics are scraped from cadvisor.

This issue mentions that the minikube cadvisor endpoint is different from the one the chart uses by default. Could you check whether using the relabeling rules from that issue helps?

jgillich (Author) commented Feb 9, 2023

Thanks, that got me on the right path! The issue is missing container labels, described here: rancher/rancher#38934 (comment)

Solutions include relabeling, using a different container runtime (minikube start --container-runtime=containerd), or downgrading to K8s 1.23.

jgillich closed this as completed Feb 9, 2023