Unable to obtain kube-controller-manager and kube-scheduler monitoring information #913

Closed
ixiaoyi93 opened this issue Dec 29, 2018 · 19 comments

@ixiaoyi93

What did you do?
I used Helm to deploy prometheus-operator to monitor a Kubernetes cluster installed from binaries. However, I currently cannot obtain any kube-controller-manager or kube-scheduler metrics.

What did you expect to see?
I want to know how prometheus-operator monitors kubernetes Masters.

What did you see instead? Under which circumstances?
Looking at the Endpoints objects created by prometheus-operator, the ENDPOINTS field for both components is <none>.

$ kubectl get ep -n kube-system |grep "prometheus-operator*"
prometheus-operator-coredns                   172.20.4.2:9153,172.20.5.4:9153                                         36m
prometheus-operator-kube-controller-manager   <none>                                                                  36m
prometheus-operator-kube-etcd                 172.17.80.26:2379,172.17.80.27:2379,172.17.80.28:2379                   36m
prometheus-operator-kube-scheduler            <none>                                                                  36m
prometheus-operator-kubelet                   172.17.80.26:10255,172.17.80.27:10255,172.17.80.28:10255 + 15 more...   28d

The component ports are listening on all interfaces (0.0.0.0):

$ ss -lnpt |egrep "kube-controller|kube-scheduler"
LISTEN     0      2048        :::10251                   :::*                   users:(("kube-scheduler",pid=41235,fd=3))
LISTEN     0      2048        :::10252                   :::*                   users:(("kube-controller",pid=41326,fd=3))
LISTEN     0      2048        :::10257                   :::*                   users:(("kube-controller",pid=41326,fd=5))

Environment

  • Prometheus Operator version:
$ grep -A 3 "image:" /data/prometheus-operator-01/values.yaml
    image:
      repository: quay.io/prometheus/alertmanager
      tag: v0.15.3

--
  image:
    repository: quay.io/coreos/prometheus-operator
    tag: v0.26.0
    pullPolicy: IfNotPresent
--
    image:
      repository: quay.io/prometheus/prometheus
      tag: v2.5.0
  • Kubernetes version information:
Client Version: version.Info{Major:"1", Minor:"12", GitVersion:"v1.12.4", GitCommit:"f49fa022dbe63faafd0da106ef7e05a29721d3f1", GitTreeState:"clean", BuildDate:"2018-12-14T07:10:00Z", GoVersion:"go1.10.4", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"12", GitVersion:"v1.12.4", GitCommit:"f49fa022dbe63faafd0da106ef7e05a29721d3f1", GitTreeState:"clean", BuildDate:"2018-12-14T06:59:37Z", GoVersion:"go1.10.4", Compiler:"gc", Platform:"linux/amd64"}
  • Kubernetes cluster kind:
    Binary manual installation

  • Manifests:

  • Prometheus Operator Logs:

@okamototk

I had the same problem in the following environment:

  • Kubernetes 1.13.1 (from kubespray)

I haven't resolved the problem yet, but at least the Service's label selector seems to be part of the problem.

$ kubectl describe svc  kube-prometheus-exporter-kube-scheduler -nkube-system
Name:              kube-prometheus-exporter-kube-scheduler
Namespace:         kube-system
Labels:            app=exporter-kube-scheduler
                   chart=exporter-kube-scheduler-0.1.9
                   component=kube-scheduler
                   heritage=Tiller
                   release=kube-prometheus
Annotations:       <none>
Selector:          k8s-app=kube-scheduler  <-----------------------
Type:              ClusterIP
IP:                None
Port:              http-metrics  10251/TCP
TargetPort:        10251/TCP
Endpoints:         <none>   <-----------------------
Session Affinity:  None
Events:            <none>

The exporter Service requires the k8s-app=kube-scheduler label, but the scheduler Pods don't carry that label:

$ kubectl get pods -nkube-system  --show-labels  |grep schedu
kube-scheduler-master1.infra                           1/1     Running   0          18h   component=kube-scheduler,tier=control-plane
                                                                                          ~~~~~~~~~~~~~~~~~~~~~~
kube-scheduler-master2.infra                           1/1     Running   0          18h   component=kube-scheduler,tier=control-plane
kube-scheduler-master3.infra                           1/1     Running   0          18h   component=kube-scheduler,tier=control-plane

After changing the selector from k8s-app to component to match the Pod labels above, the Service worked fine.
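For reference, the selector can be switched in place with a JSON merge patch (a sketch; the Service name is the one from the kube-prometheus chart above, and setting the old k8s-app key to null removes it):

$ kubectl -n kube-system patch svc kube-prometheus-exporter-kube-scheduler --type merge \
    -p '{"spec":{"selector":{"k8s-app":null,"component":"kube-scheduler"}}}'

Note that a Helm upgrade will revert a manual patch; setting serviceSelectorLabelKey in the chart values (as in the custom-values.yml below) is the persistent fix.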

$ kubectl describe svc -nkube-system  kube-prometheus-exporter-kube-scheduler
Name:              kube-prometheus-exporter-kube-scheduler
Namespace:         kube-system
Labels:            app=exporter-kube-scheduler
                   chart=exporter-kube-scheduler-0.1.9
                   component=kube-scheduler
                   heritage=Tiller
                   release=kube-prometheus
Annotations:       <none>
Selector:          component=kube-scheduler <-----------------------
Type:              ClusterIP
IP:                None
Port:              http-metrics  10251/TCP
TargetPort:        10251/TCP
Endpoints:         172.31.24.19:10251,172.31.27.64:10251,172.31.30.247:10251 <-----------------------
Session Affinity:  None
Events:            <none>

But metrics were still not collected by Prometheus. The ServiceMonitor is the following:

$ kubectl describe  servicemonitor/kube-prometheus-exporter-kube-scheduler -nkube-system
Name:         kube-prometheus-exporter-kube-scheduler
Namespace:    kube-system
Labels:       app=exporter-kube-scheduler
              chart=exporter-kube-scheduler-0.1.9
              component=kube-scheduler
              heritage=Tiller
              prometheus=kube-prometheus
              release=kube-prometheus
Annotations:  <none>
API Version:  monitoring.coreos.com/v1
Kind:         ServiceMonitor
Metadata:
  Creation Timestamp:  2019-01-01T00:26:32Z
  Generation:          1
  Resource Version:    180783
  Self Link:           /apis/monitoring.coreos.com/v1/namespaces/kube-system/servicemonitors/kube-prometheus-exporter-kube-scheduler
  UID:                 e3f4e7b0-0d5b-11e9-9228-0204419fcc46
Spec:
  Endpoints:
    Bearer Token File:  /var/run/secrets/kubernetes.io/serviceaccount/token
    Interval:           15s
    Port:               http-metrics
  Job Label:            component
  Namespace Selector:
    Match Names:
      kube-system
  Selector:
    Match Labels:
      App:        exporter-kube-scheduler
      Component:  kube-scheduler
Events:           <none>

Installation information

I installed with Helm:

$ helm repo add coreos https://s3-eu-west-1.amazonaws.com/coreos-charts/stable/
$ helm repo update
$ helm install coreos/prometheus-operator --name prometheus-operator --namespace kube-system
$ helm install coreos/kube-prometheus --name kube-prometheus --namespace kube-system -f custom-values.yml

custom-values.yml

exporter-kube-controller-manager:
  serviceSelectorLabelKey: component

exporter-kube-scheduler:
  serviceSelectorLabelKey: component

grafana:
  image:
    repository: grafana/grafana
    tag: 5.4.2
    # Install a Grafana version that supports Elasticsearch alerting
    # (the default is 5.0.0)

  auth:
    anonymous:
      enabled: "false"

  ingress:
    enabled: true
    annotations:
      ingress.kubernetes.io/ssl-redirect: "true"
      nginx.ingress.kubernetes.io/ssl-redirect: "true"
    hosts:
      - grafana.example.com
    tls:
      - secretName: ""
        hosts:
          - grafana.example.com
  storageSpec:
    class: standard
    accessMode: "ReadWriteMany"
    resources:
      requests:
        storage: 2Gi
    selector: {}

prometheus:
  image:
    repository: quay.io/prometheus/prometheus
    tag: v2.6.0
  storageSpec:
    volumeClaimTemplate:
      spec:
        storageClassName: standard
        accessModes: ["ReadWriteMany"]
        resources:
          requests:
            storage: 50Gi
      selector: {}

alertmanager:
  image:
    repository: quay.io/prometheus/alertmanager
    tag: v0.15.3
  storageSpec:
    volumeClaimTemplate:
      spec:
        storageClassName: standard
        accessModes: ["ReadWriteMany"]
        resources:
          requests:
            storage: 50Gi
      selector: {}

@okamototk

I found the cause. I saw the following error on the Prometheus Targets page:

kube-system/kube-prometheus-exporter-kube-scheduler/0 (0/3 up) 
Endpoint | State | Labels | Last Scrape | Scrape Duration | Error
http://172.31.24.19:10251/metrics | DOWN | endpoint="http-metrics" instance="172.31.24.19:10251" job="kube-scheduler" namespace="kube-system" pod="kube-scheduler-master3.infra" service="kube-prometheus-exporter-kube-scheduler" | 14.848s ago | 736.7us | 
Get http://172.31.24.19:10251/metrics: dial tcp 172.31.24.19:10251: connect: connection refused

Prometheus tried to get metrics from http://172.31.24.19:10251/metrics,
but the scheduler port is only exposed on 127.0.0.1:

$ sudo ss -tlnp
State     Recv-Q     Send-Q          Local Address:Port          Peer Address:Port
...
LISTEN    0          128                 127.0.0.1:10251              0.0.0.0:*        users:(("kube-scheduler",pid=5951,fd=3))
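On kubeadm-based installs (which recent kubespray versions use under the hood), that bind address comes from the scheduler's static Pod manifest on the master node; a quick way to check (a sketch, assuming the default manifest path, with illustrative output):

$ sudo grep -- -address /etc/kubernetes/manifests/kube-scheduler.yaml
    - --address=127.0.0.1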

@okamototk

okamototk commented Jan 1, 2019

I finally found the resolution: use 0.0.0.0 as the bind address for the scheduler and controller-manager.
The following is a kubeadm config example:

apiVersion: kubeadm.k8s.io/v1beta1
kind: ClusterConfiguration
...
controllerManager:
  extraArgs:
    address: "0.0.0.0"
scheduler:
  extraArgs:
    address: "0.0.0.0"

Then the scheduler and controller-manager ports are available on all interfaces:

$ ss -tlnp
State               Local Address:Port          Peer Address:Port
LISTEN                         *:10251                      *:*
LISTEN                         *:10252                      *:*
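On an already-running kubeadm cluster, the same flags can be applied by regenerating the control-plane static Pod manifests from the updated config (a sketch; it assumes the config above is saved as kubeadm-config.yaml on each master, and the kubelet restarts the Pods automatically once the manifests change):

$ sudo kubeadm init phase control-plane controller-manager --config kubeadm-config.yaml
$ sudo kubeadm init phase control-plane scheduler --config kubeadm-config.yaml

Editing the flag directly in /etc/kubernetes/manifests/kube-controller-manager.yaml and kube-scheduler.yaml achieves the same result.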

@brancz
Collaborator

brancz commented Jan 7, 2019

@xiaomuyi does that solve your issue?

@ixiaoyi93
Author

@xiaomuyi does that solve your issue?

It has been resolved by using the latest version.

@neoakris

https://coreos.com/operators/prometheus/docs/latest/user-guides/cluster-monitoring.html
This may also be of help; it says you need to create Services for these components so they can be discovered.
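For a cluster where the control-plane components do not run as Pods (like the binary install in the original report), that guide's approach boils down to a headless Service without a selector plus a manually maintained Endpoints object pointing at the master IPs. A minimal sketch for kube-scheduler, assuming the insecure metrics port 10251 and the master IPs from the output above; the resource name kube-scheduler-metrics is just an example, and the label must match whatever the chart's ServiceMonitor selects on (check with kubectl get servicemonitor -n <namespace> <name> -o yaml):

apiVersion: v1
kind: Service
metadata:
  name: kube-scheduler-metrics
  namespace: kube-system
  labels:
    k8s-app: kube-scheduler      # placeholder; must match the ServiceMonitor's selector
spec:
  clusterIP: None                # headless and selector-less: endpoints are managed manually
  ports:
    - name: http-metrics
      port: 10251
      targetPort: 10251
---
apiVersion: v1
kind: Endpoints
metadata:
  name: kube-scheduler-metrics   # must match the Service name
  namespace: kube-system
subsets:
  - addresses:
      - ip: 172.17.80.26
      - ip: 172.17.80.27
      - ip: 172.17.80.28
    ports:
      - name: http-metrics
        port: 10251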

@retr0h

retr0h commented Aug 7, 2020

I have the same issue, although my kubespray scheduler is only listening on the HTTPS port. I'll need to dig into the Helm chart at this point.

jodewey@csg-nscg-0001:~$ ps -ef | grep scheduler
root        2875    2853  0 17:12 ?        00:01:51 kube-scheduler xxxxx
jodewey   461907  458260  0 23:31 pts/0    00:00:00 grep scheduler
jodewey@csg-nscg-0001:~$ sudo lsof -p 2875 | grep TCP
kube-sche 2875 root    5u     IPv6              33254      0t0     TCP *:10259 (LISTEN)

Edit ...
Made the following changes to values.yaml to correct the issue:

kubeControllerManager:
  service:
    targetPort: 10257
  serviceMonitor:
    https: true
    insecureSkipVerify: false

kubeScheduler:
  service:
    targetPort: 10259
  serviceMonitor:
    https: true                                                                                                                                              
    insecureSkipVerify: false

@ksa-real

The issue is still (or again) present. Problems:

  • Both kube-scheduler and kube-controller-manager have only 127.0.0.1 in their TLS serving certificates, so scraping from a non-master host is only possible with insecureSkipVerify: true.
  • By default, both components bind to 127.0.0.1. Binding with extraArgs: {bind-address: "0.0.0.0"} works, but exposes the components too broadly: cluster-wide and probably even outside the cluster, depending on the bound interfaces.

Is there a recommended approach to this issue? I see two approaches:

  • Design the component endpoints with the Prometheus scraper in mind: i.e. serve a certificate that includes the master nodes' IPs (similar to etcd) and bind to the main network interface instead of 127.0.0.1. This likely requires support from kubeadm.
  • Delegate scraping to a collector running as a DaemonSet, e.g. node-exporter, which could proxy the request to the local component and return the result to Prometheus.

@syednadeembe

@okamototk @ksa-real
Can you please explain how to use extraArgs: {bind-address: "0.0.0.0"} for controllerManager: and scheduler:?

I updated the kubeadm-config ConfigMap as below but am still facing the same issue.
#############################################################
kubectl describe cm kubeadm-config -n kube-system
#############################################################
Name:         kubeadm-config
Namespace:    kube-system
Labels:       <none>
Annotations:  <none>

Data
====
ClusterConfiguration:
----
apiServer:
  extraArgs:
    authorization-mode: Node,RBAC
    cloud-provider: aws
    feature-gates: TTLAfterFinished=true
  timeoutForControlPlane: 4m0s
apiVersion: kubeadm.k8s.io/v1beta2
certificatesDir: /etc/kubernetes/pki
clusterName: kubernetes
controllerManager:
  extraArgs:
    address: "0.0.0.0"
dns:
  type: CoreDNS
etcd:
  local:
    dataDir: /var/lib/etcd
imageRepository: k8s.gcr.io
kind: ClusterConfiguration
kubernetesVersion: v1.19.7
networking:
  dnsDomain: cluster.local
  podSubnet: 192.168.0.0/16
  serviceSubnet: 10.96.0.0/12
scheduler:
  extraArgs:
    address: "0.0.0.0"

ClusterStatus:
----
apiEndpoints:
  vmnxkubdvm01.eur.ad.sag:
    advertiseAddress: 10.60.27.248
    bindPort: 6443
apiVersion: kubeadm.k8s.io/v1beta2
kind: ClusterStatus

Events:  <none>

@paulfantom paulfantom transferred this issue from prometheus-operator/prometheus-operator Feb 3, 2021
@weibeld

weibeld commented Feb 5, 2021

@syednadeembe you're using the --address flag for kube-controller-manager and kube-scheduler, which is deprecated. The new flag is --bind-address. So, the kubeadm config file should look like this:

apiVersion: kubeadm.k8s.io/v1beta2
kind: ClusterConfiguration
...
controllerManager:
  extraArgs:
    bind-address: 0.0.0.0
scheduler:
  extraArgs:
    bind-address: 0.0.0.0

In any case, to verify that the setting has been applied, you can do kubectl get pod kube-controller-manager-<...> -o yaml and check the command with which the container has been started in spec.containers[0].command. There, the kube-controller-manager command should have a flag saying --bind-address=0.0.0.0.
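For example (a sketch; substitute the actual Pod name):

$ kubectl -n kube-system get pod kube-controller-manager-<node-name> -o yaml | grep bind-address
    - --bind-address=0.0.0.0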

@syednadeembe

The Pod has the required flag, and I can verify that in the Pod spec:

spec:
  containers:
    - command:
        - kube-controller-manager
        - --allocate-node-cidrs=true
        - --authentication-kubeconfig=/etc/kubernetes/controller-manager.conf
        - --authorization-kubeconfig=/etc/kubernetes/controller-manager.conf
        - --bind-address=0.0.0.0
        - --client-ca-file=/etc/kubernetes/pki/ca.crt
        - --cluster-cidr=192.168.0.0/16

However, the problem still remains.

#######################################################################
Also, I'd like to know: does editing the kubeadm-config ConfigMap update the kubeadm configuration?
For example, will kubectl edit cm kubeadm-config -n kube-system update my cluster at runtime?

@syednadeembe

What is the recommended way to update the kubeadm configuration for a running cluster?

@syednadeembe

syednadeembe commented Feb 5, 2021

Resolved
########################################
After a couple of endless nights, Aladdin's genie finally came to help me :) (see @retr0h's comment above)
########################################
etcd, kube-controller-manager, and kube-scheduler are servers that by default serve requests over HTTPS when the cluster is installed with kubeadm.

The default values.yaml that comes with the prometheus-operator Helm release only scrapes over HTTP unless specified otherwise.
The following was the course of action for me:

# Fetched the chart to get the default values that had been used
helm fetch prometheus-community/prometheus-operator

# Updated values.yaml:
kubeControllerManager:
  enabled: true

  service:
    port: 10257
    targetPort: 10257

  serviceMonitor:
    https: true
    # Skip TLS certificate validation when scraping
    insecureSkipVerify: true

# Ran the helm upgrade
helm upgrade prometheus-operator prometheus-operator/ -n <name_space>

@weibeld

weibeld commented Feb 10, 2021

Since setting --bind-address to 0.0.0.0 might expose the port to the Internet (as reported in kubernetes/kubeadm#2244 (comment)), an alternative solution is using a local proxy on each master node that exposes the metrics in a secure way.

This could be implemented with an HAProxy container run on each master node with a DaemonSet with the following configuration:

defaults
  mode http
  timeout connect 5000ms
  timeout client 5000ms
  timeout server 5000ms
  default-server maxconn 10

frontend kube-controller-manager
  bind ${NODE_IP}:10257
  http-request deny if !{ path /metrics }
  default_backend kube-controller-manager
backend kube-controller-manager
  server kube-controller-manager 127.0.0.1:10257 ssl verify none

frontend kube-scheduler
  bind ${NODE_IP}:10259
  http-request deny if !{ path /metrics }
  default_backend kube-scheduler
backend kube-scheduler
  server kube-scheduler 127.0.0.1:10259 ssl verify none

This listens on port 10257 on the node's IP address for requests to the /metrics path and forwards them to port 10257 of kube-controller-manager, which is only available locally on the loopback interface.

The $NODE_IP environment variable can be injected into the Pod with a fieldRef in the Pod spec:

env:
- name: NODE_IP
  valueFrom:
    fieldRef:
      apiVersion: v1
      fieldPath: status.hostIP

The proxy Pod must run in the hostNetwork, like the kube-controller-manager Pod, so that the former can access the latter's loopback interface.
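A minimal sketch of the relevant parts of the DaemonSet's Pod template (a complete DaemonSet example follows in a later comment):

spec:
  template:
    spec:
      hostNetwork: true                      # share the node's network namespace
      nodeSelector:
        node-role.kubernetes.io/master: ""   # run only on control-plane nodes
      tolerations:
        - key: node-role.kubernetes.io/master
          effect: NoSchedule
          operator: Exists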

@sachinmsft

@weibeld don't you need to pass a bearer token (with a ClusterRoleBinding) when scraping the metrics endpoint?

@KlavsKlavsen
Contributor

This could be implemented with an HAProxy container run on each master node with a DaemonSet with the following configuration:

@weibeld Sounds like a fantastic solution, albeit a bit more complex of course :)
Have you implemented it with the kube-prometheus jsonnet build system, or with the Helm chart?
I would be very happy if you could share some more details about this.

As I understand it, we need a DaemonSet running HAProxy with the config you mentioned above, plus a modified Service definition for Prometheus, so that it tries to reach the nodes' kube-controller-manager and kube-scheduler Pods via the node IP instead.

@weibeld

weibeld commented Mar 11, 2022

@KlavsKlavsen I used Kustomize, since all the other services in the cluster were deployed with Kustomize. But it really doesn't matter how you deploy the DaemonSet; it's best to deploy it like any other application in your cluster, e.g. with Helm if you usually use Helm.

Once you have the DaemonSet, you can scrape the exposed ports as a normal Prometheus target, i.e. either with a Service and a ServiceMonitor, or with a PodMonitor.

Here's the complete configuration I used:

apiVersion: v1
kind: ConfigMap
metadata:
  name: metrics-proxy-master
data:
  haproxy.cfg: |
    defaults
      mode http
      timeout connect 5000ms
      timeout client 5000ms
      timeout server 5000ms
      default-server maxconn 10

    frontend kube-controller-manager
      bind ${NODE_IP}:10257
      http-request deny if !{ path /metrics }
      default_backend kube-controller-manager
    backend kube-controller-manager
      server kube-controller-manager 127.0.0.1:10257 ssl verify none

    frontend kube-scheduler
      bind ${NODE_IP}:10259
      http-request deny if !{ path /metrics }
      default_backend kube-scheduler
    backend kube-scheduler
      server kube-scheduler 127.0.0.1:10259 ssl verify none

    frontend etcd
      bind ${NODE_IP}:2381
      http-request deny if !{ path /metrics }
      default_backend etcd
    backend etcd
      server etcd 127.0.0.1:2381
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: metrics-proxy-master
spec:
  selector:
    matchLabels:
      app: metrics-proxy-master
  template:
    metadata:
      labels:
        app: metrics-proxy-master
    spec:
      hostNetwork: true
      serviceAccountName: metrics-proxy
      nodeSelector:
        node-role.kubernetes.io/master: ""
      tolerations:
        - effect: NoSchedule
          key: node-role.kubernetes.io/master
          operator: Exists
      containers:
        - name: haproxy
          image: haproxy:2.2.8
          env:
            - name: NODE_IP
              valueFrom:
                fieldRef:
                  apiVersion: v1
                  fieldPath: status.hostIP
          ports:
            - name: kube-ctrl-mgr  # Port names are limited to 15 characters
              containerPort: 10257
            - name: kube-scheduler
              containerPort: 10259
            - name: etcd
              containerPort: 2381
          volumeMounts:
            - mountPath: /usr/local/etc/haproxy
              name: haproxy-config
      volumes:
        - configMap:
            name: metrics-proxy-master
          name: haproxy-config

Then you can scrape the metrics of, for example, kube-controller-manager with a PodMonitor:

apiVersion: monitoring.coreos.com/v1
kind: PodMonitor
metadata:
  name: kube-controller-manager
spec:
  selector:
    matchLabels:
      app: metrics-proxy-master
  namespaceSelector:
    matchNames:
      - monitoring
  podMetricsEndpoints:
    - port: kube-ctrl-mgr
      bearerTokenSecret:
        name: metrics-endpoint-reader
        key: token
      relabelings:
        - targetLabel: node
          sourceLabels: [__meta_kubernetes_pod_node_name]
        - targetLabel: job
          replacement: kube-controller-manager

Note that the metrics endpoints of kube-controller-manager and kube-scheduler require RBAC authorisation through a ServiceAccount token with at least read permissions for the /metrics non-resource URL. This is configured through the bearerTokenSecret field in the above PodMonitor definition. See the below comment for how to create this Secret.

@weibeld

weibeld commented Mar 11, 2022

@sachinmsft

@weibeld don't you need to pass a bearer token (with a ClusterRoleBinding) when scraping the metrics endpoint?

Yes, this is required for kube-controller-manager and kube-scheduler. You can create a Secret with a ServiceAccount token with the appropriate permissions like this:

apiVersion: v1
kind: ServiceAccount
metadata:
  name: metrics-endpoint-reader
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: metrics-endpoint-reader
rules:
  - nonResourceURLs: [/metrics]
    verbs: [get]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: metrics-endpoint-reader
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: metrics-endpoint-reader
subjects:
  - kind: ServiceAccount
    name: metrics-endpoint-reader
    namespace: monitoring
---
apiVersion: v1
kind: Secret
metadata:
  name: metrics-endpoint-reader
  annotations:
    kubernetes.io/service-account.name: metrics-endpoint-reader
type: kubernetes.io/service-account-token

Then you can reference the created Secret by its name (e.g. metrics-endpoint-reader) in the bearerTokenSecret field of either a ServiceMonitor or a PodMonitor (see the above comment).

@krishgu

krishgu commented Mar 20, 2023

@weibeld what are the roles/RBAC permissions for the metrics-proxy ServiceAccount?

Thanks for putting this awesome solution together.
