serviceMonitor/default/kube-prometheus-stack-kube-etcd detect Incorrect metric port and connection reset by peer #1005

katepangLiu · 2021-05-27T08:39:47Z

Describe the bug

http://192.168.71.136:2379/metrics  DOWN  endpoint="http-metrics" instance="192.168.71.136:2379" job="kube-etcd" namespace="kube-system" pod="etcd-mickey.katepang" service="kube-prometheus-stack-kube-etcd"  1m 23s ago  0.986ms  Get "http://192.168.71.136:2379/metrics": read tcp 100.64.97.116:40596->192.168.71.136:2379: read: connection reset by peer

In etcd.yaml, listen-metrics-urls=http://0.0.0.0:2381,
but in serviceMonitor/default/kube-prometheus-stack-kube-etcd the port is http-metrics 2379/TCP.
If I curl port 2381, I can get the metrics.

[root@mickey ~]# cat /etc/kubernetes/manifests/etcd.yaml 
apiVersion: v1
kind: Pod
metadata:
  annotations:
    kubeadm.kubernetes.io/etcd.advertise-client-urls: https://192.168.71.136:2379
  creationTimestamp: null
  labels:
    component: etcd
    tier: control-plane
  name: etcd
  namespace: kube-system
spec:
  containers:
  - command:
    - etcd
    - --advertise-client-urls=https://192.168.71.136:2379
    - --cert-file=/etc/kubernetes/pki/etcd/server.crt
    - --client-cert-auth=true
    - --data-dir=/var/lib/etcd
    - --initial-advertise-peer-urls=https://192.168.71.136:2380
    - --initial-cluster=mickey.katepang=https://192.168.71.136:2380
    - --key-file=/etc/kubernetes/pki/etcd/server.key
    - --listen-client-urls=https://127.0.0.1:2379,https://192.168.71.136:2379
    - --listen-metrics-urls=http://0.0.0.0:2381

[root@mickey ~]# kubectl describe service/kube-prometheus-stack-kube-etcd -n kube-system
Name:              kube-prometheus-stack-kube-etcd
Namespace:         kube-system
Labels:            app=kube-prometheus-stack-kube-etcd
                   app.kubernetes.io/instance=kube-prometheus-stack
                   app.kubernetes.io/managed-by=Helm
                   app.kubernetes.io/part-of=kube-prometheus-stack
                   app.kubernetes.io/version=16.0.1
                   chart=kube-prometheus-stack-16.0.1
                   heritage=Helm
                   jobLabel=kube-etcd
                   release=kube-prometheus-stack
Annotations:       meta.helm.sh/release-name: kube-prometheus-stack
                   meta.helm.sh/release-namespace: default
Selector:          component=etcd
Type:              ClusterIP
IP:                None
Port:              http-metrics  2379/TCP
TargetPort:        2379/TCP
Endpoints:         192.168.71.136:2379
Session Affinity:  None
Events:            <none>
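
The mismatch between the two outputs above can be checked mechanically. The following is a minimal sketch, not part of the chart or of kubeadm: the function names are made up for illustration, and it assumes the manifest carries a single --listen-metrics-urls URL with an explicit port and that the Service's TargetPort is numeric.

```shell
#!/bin/sh
# Hypothetical helpers (names are illustrative, not from any tool).

# Print the port of the --listen-metrics-urls flag found in an etcd manifest.
# Assumes one URL with an explicit port; with several URLs the last port wins.
metrics_port_from_manifest() {
  # $1: path to the etcd static pod manifest
  sed -n 's/.*--listen-metrics-urls=.*:\([0-9][0-9]*\).*/\1/p' "$1" | head -n 1
}

# Read `kubectl describe service` output on stdin and print the numeric TargetPort.
target_port_from_describe() {
  sed -n 's/^TargetPort:[[:space:]]*\([0-9][0-9]*\).*/\1/p' | head -n 1
}

# Compare the two ports and print a one-line verdict.
check_etcd_ports() {
  # $1: port etcd serves metrics on, $2: port the Service targets
  if [ "$1" = "$2" ]; then
    echo "ok: service targets metrics port $1"
  else
    echo "mismatch: etcd serves metrics on $1, service targets $2"
  fi
}

# Example on a control-plane node (needs kubectl access):
#   m=$(metrics_port_from_manifest /etc/kubernetes/manifests/etcd.yaml)
#   t=$(kubectl describe service/kube-prometheus-stack-kube-etcd -n kube-system | target_port_from_describe)
#   check_etcd_ports "$m" "$t"
```

With the values reported in this issue, the helpers would be fed 2381 and 2379, which is exactly the mismatch described.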

Version of Helm and Kubernetes:

Helm Version:

$ helm version
version.BuildInfo{Version:"v3.5.4", GitCommit:"1b5edb69df3d3a08df77c9902dc17af864ff05d1", GitTreeState:"clean", GoVersion:"go1.15.11"}

Kubernetes Version:

$ kubectl version
Client Version: version.Info{Major:"1", Minor:"19", GitVersion:"v1.19.9", GitCommit:"9dd794e454ac32d97cde41ae10be801ae98f75df", GitTreeState:"clean", BuildDate:"2021-03-18T01:09:28Z", GoVersion:"go1.15.8", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"19", GitVersion:"v1.19.9", GitCommit:"9dd794e454ac32d97cde41ae10be801ae98f75df", GitTreeState:"clean", BuildDate:"2021-03-18T01:00:06Z", GoVersion:"go1.15.8", Compiler:"gc", Platform:"linux/amd64"}

Which chart: kube-prometheus-stack

Which version of the chart:

What happened:
Prometheus scrapes port 2379 as the etcd metrics target.

What you expected to happen:
Prometheus should scrape port 2381 as the etcd metrics target.

The helm command that you execute and failing/misfunctioning:

For example:

helm install kube-prometheus-stack prometheus-community/kube-prometheus-stack

Helm values set after installation/upgrade:

helm get values my-release
USER-SUPPLIED VALUES:
null

Anything else we need to know:

bohehe · 2021-06-14T07:53:07Z

I have the same issue. Is there a fix for this?

kubectl version

Client Version: version.Info{Major:"1", Minor:"16", GitVersion:"v1.16.15", GitCommit:"2adc8d7091e89b6e3ca8d048140618ec89b39369", GitTreeState:"clean", BuildDate:"2020-09-02T11:40:00Z", GoVersion:"go1.13.15", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"16", GitVersion:"v1.16.15", GitCommit:"2adc8d7091e89b6e3ca8d048140618ec89b39369", GitTreeState:"clean", BuildDate:"2020-09-02T11:31:21Z", GoVersion:"go1.13.15", Compiler:"gc", Platform:"linux/amd64"}

kube-prometheus-stack Helm chart version: 16.5.0

bohehe · 2021-06-15T03:27:01Z

Yesterday I found a workaround here: #204 (comment)

darrenwatt · 2021-06-16T09:21:37Z

Isn't the more elegant solution just to update the helm values to use the http endpoint instead?

kubeEtcd:
  service:
    enabled: true
    port: 2381
    targetPort: 2381
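
One way to apply these values is through a values file passed to helm upgrade. This is a sketch under assumptions: the function name and the file name etcd-values.yaml are arbitrary, and the release and chart names are taken from the original report, so adjust them to your installation.

```shell
#!/bin/sh
# Hypothetical helper: write a Helm values override that points the chart's
# etcd Service at the plain-HTTP metrics port 2381.
write_etcd_values() {
  # $1: output path for the values file (any name works)
  cat > "$1" <<'EOF'
kubeEtcd:
  service:
    enabled: true
    port: 2381
    targetPort: 2381
EOF
}

# Example (needs helm and cluster access; names come from the report above):
#   write_etcd_values etcd-values.yaml
#   helm upgrade kube-prometheus-stack prometheus-community/kube-prometheus-stack -f etcd-values.yaml
```

Keeping the override in a file rather than in --set flags makes it easier to carry across later upgrades.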

stale · 2021-07-16T09:36:44Z

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Any further update will cause the issue/pull request to no longer be considered stale. Thank you for your contributions.

stale · 2021-07-30T18:06:53Z

This issue is being automatically closed due to inactivity.

kingdonb · 2022-01-17T20:41:50Z

I ran into this issue. It seems like the default etcd configuration from kubeadm does not want to be monitored by Prometheus.

I opened it up with what seemed like the obvious config change in /etc/kubernetes/manifests/etcd.yaml, adding a listen address on my public interface for the metrics address, instead of only listening on 127.0.0.1, and found that kube-prometheus-stack was looking for the metrics on 2379, when they are served on 2381 only.
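
A manifest change of this kind can be previewed before touching the real file. The sketch below is an illustration only: the function name is made up, <node-ip> is a placeholder, and it assumes the new value contains no | or & characters. It prints the edited manifest to stdout and leaves the input untouched; note that the kubelet restarts the etcd static pod once the file under /etc/kubernetes/manifests actually changes.

```shell
#!/bin/sh
# Hypothetical helper: print an etcd manifest with the value of
# --listen-metrics-urls replaced. The input file is not modified.
with_metrics_urls() {
  # $1: manifest path
  # $2: new flag value, e.g. http://127.0.0.1:2381,http://<node-ip>:2381
  sed "s|\(--listen-metrics-urls=\).*|\1$2|" "$1"
}

# Example: preview the change, then review it before replacing the manifest.
#   with_metrics_urls /etc/kubernetes/manifests/etcd.yaml "http://127.0.0.1:2381,http://<node-ip>:2381" > /tmp/etcd.yaml.new
#   diff /etc/kubernetes/manifests/etcd.yaml /tmp/etcd.yaml.new
```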

Configuring kube-prometheus-stack with Alertmanager out of the box is not straightforward, partly for this reason, as I've chronicled in: #812 (comment)

Setting the kubeEtcd port and targetPort to 2381 by default might have made it more straightforward, but it's my considered opinion that more fixes are needed and that this change by itself isn't going to solve anything.

Still, thanks @darrenwatt for the hint I needed that this port has a configuration option in the chart values! 👍

#1005 (comment)

fanux · 2022-04-28T13:21:08Z

labring/sealos#960

katepangLiu added the bug Something isn't working label May 27, 2021

stale bot added the lifecycle/stale label Jul 16, 2021

stale bot closed this as completed Jul 30, 2021
