Merge pull request #1380 from torredil/metrics
Add enableMetrics configuration
k8s-ci-robot authored Sep 23, 2022
2 parents 6d70467 + 666f2bc commit 03cde05
Showing 4 changed files with 137 additions and 0 deletions.
8 changes: 8 additions & 0 deletions charts/aws-ebs-csi-driver/templates/controller.yaml
@@ -83,6 +83,9 @@ spec:
{{- with .Values.controller.k8sTagClusterId }}
- --k8s-tag-cluster-id={{ . }}
{{- end }}
{{- if and (.Values.controller.enableMetrics) (not .Values.controller.httpEndpoint) }}
- --http-endpoint=0.0.0.0:3301
{{- end}}
{{- with .Values.controller.httpEndpoint }}
- --http-endpoint={{ . }}
{{- end }}
@@ -137,6 +140,11 @@ spec:
- name: healthz
containerPort: 9808
protocol: TCP
{{- if .Values.controller.enableMetrics }}
- name: metrics
containerPort: 3301
protocol: TCP
{{- end}}
livenessProbe:
httpGet:
path: /healthz
40 changes: 40 additions & 0 deletions charts/aws-ebs-csi-driver/templates/metrics.yaml
@@ -0,0 +1,40 @@
{{- if .Values.controller.enableMetrics -}}
---
apiVersion: v1
kind: Service
metadata:
  name: ebs-csi-controller
  namespace: kube-system
  labels:
    app: ebs-csi-controller
spec:
  selector:
    app: ebs-csi-controller
  ports:
    - name: metrics
      port: 3301
      targetPort: 3301
  type: ClusterIP
---
{{- if (.Capabilities.APIVersions.Has "monitoring.coreos.com/v1") -}}
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: ebs-csi-controller
  namespace: kube-system
  labels:
    app: ebs-csi-controller
    release: prometheus
spec:
  selector:
    matchLabels:
      app: ebs-csi-controller
  namespaceSelector:
    matchNames:
      - kube-system
  endpoints:
    - targetPort: 3301
      path: /metrics
      interval: 15s
{{- end }}
{{- end }}
8 changes: 8 additions & 0 deletions charts/aws-ebs-csi-driver/values.yaml
@@ -126,6 +126,14 @@ controller:
# key2: value2
extraVolumeTags: {}
httpEndpoint:
# (deprecated) The TCP network address where the prometheus metrics endpoint
# will run (example: `:8080` which corresponds to port 8080 on local host).
# The default is empty string, which means metrics endpoint is disabled.
# ---
enableMetrics: false
# If set to true, AWS API call metrics will be exported to the following
# TCP endpoint: "0.0.0.0:3301"
# ---
# ID of the Kubernetes cluster used for tagging provisioned EBS volumes (optional).
k8sTagClusterId:
logLevel: 2
81 changes: 81 additions & 0 deletions docs/metrics.md
@@ -0,0 +1,81 @@
# Driver Metrics

## Prerequisites

1. Install [Prometheus Operator](https://github.com/prometheus-operator/prometheus-operator) in your cluster:
```sh
$ helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
$ helm repo update
$ helm install prometheus prometheus-community/kube-prometheus-stack
```
2. Enable metrics by setting `enableMetrics: true` under the `controller` key in [values.yaml](https://github.com/kubernetes-sigs/aws-ebs-csi-driver/blob/master/charts/aws-ebs-csi-driver/values.yaml).

3. Deploy EBS CSI Driver:
```sh
$ helm upgrade --install aws-ebs-csi-driver --namespace kube-system ./charts/aws-ebs-csi-driver --values ./charts/aws-ebs-csi-driver/values.yaml
```

## Overview

Enabling metrics deploys a [Service](https://kubernetes.io/docs/concepts/services-networking/service/) object that exposes the EBS CSI Driver controller's metrics port through a `ClusterIP`. If the Prometheus Operator is installed (that is, the `monitoring.coreos.com/v1` API is available in the cluster), a [ServiceMonitor](https://github.com/prometheus-operator/prometheus-operator/blob/main/Documentation/user-guides/getting-started.md#:~:text=Alertmanager-,ServiceMonitor,-See%20the%20Alerting) object is also deployed, which updates the Prometheus scrape configuration so that metrics are scraped from the defined endpoint. For more information, see the manifest [metrics.yaml](/charts/aws-ebs-csi-driver/templates/metrics.yaml).

## AWS API Metrics

The EBS CSI Driver will emit [AWS API](https://docs.aws.amazon.com/AWSEC2/latest/APIReference/OperationList-query.html) metrics to the following TCP endpoint: `0.0.0.0:3301/metrics` if `enableMetrics: true` has been configured in the Helm chart.
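
How `enableMetrics` interacts with the deprecated `httpEndpoint` value is easiest to see by mirroring the two template conditions from `controller.yaml`. The following is a minimal sketch in Python, not part of the chart; the function name `http_endpoint_args` is hypothetical and only models which `--http-endpoint` argument the template would render.

```python
from typing import List, Optional

# Default endpoint hard-coded in the chart template when enableMetrics is true.
DEFAULT_METRICS_ENDPOINT = "0.0.0.0:3301"


def http_endpoint_args(enable_metrics: bool, http_endpoint: Optional[str]) -> List[str]:
    """Model the --http-endpoint arguments rendered by controller.yaml.

    Hypothetical helper for illustration; it mirrors the two template blocks:
      {{- if and (.Values.controller.enableMetrics) (not .Values.controller.httpEndpoint) }}
      {{- with .Values.controller.httpEndpoint }}
    """
    args: List[str] = []
    # First block: default endpoint only when no explicit httpEndpoint is set.
    if enable_metrics and not http_endpoint:
        args.append(f"--http-endpoint={DEFAULT_METRICS_ENDPOINT}")
    # Second block: an explicit httpEndpoint is always passed through.
    if http_endpoint:
        args.append(f"--http-endpoint={http_endpoint}")
    return args


print(http_endpoint_args(True, None))
print(http_endpoint_args(True, ":8080"))
print(http_endpoint_args(False, None))
```

Under this reading, an explicit `httpEndpoint` takes precedence over the default. The `metrics` container port and the Service still reference port 3301 whenever `enableMetrics` is true, so a custom `httpEndpoint` on another port would presumably not be reachable through them.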

The metrics will appear in the following format:
```sh
# HELP cloudprovider_aws_api_request_duration_seconds [ALPHA] Latency of AWS API calls
# TYPE cloudprovider_aws_api_request_duration_seconds histogram
cloudprovider_aws_api_request_duration_seconds_bucket{request="AttachVolume",le="0.005"} 0
cloudprovider_aws_api_request_duration_seconds_bucket{request="AttachVolume",le="0.01"} 0
cloudprovider_aws_api_request_duration_seconds_bucket{request="AttachVolume",le="0.025"} 0
cloudprovider_aws_api_request_duration_seconds_bucket{request="AttachVolume",le="0.05"} 0
cloudprovider_aws_api_request_duration_seconds_bucket{request="AttachVolume",le="0.1"} 0
cloudprovider_aws_api_request_duration_seconds_bucket{request="AttachVolume",le="0.25"} 0
cloudprovider_aws_api_request_duration_seconds_bucket{request="AttachVolume",le="0.5"} 0
cloudprovider_aws_api_request_duration_seconds_bucket{request="AttachVolume",le="1"} 1
cloudprovider_aws_api_request_duration_seconds_bucket{request="AttachVolume",le="2.5"} 1
cloudprovider_aws_api_request_duration_seconds_bucket{request="AttachVolume",le="5"} 1
cloudprovider_aws_api_request_duration_seconds_bucket{request="AttachVolume",le="10"} 1
cloudprovider_aws_api_request_duration_seconds_bucket{request="AttachVolume",le="+Inf"} 1
cloudprovider_aws_api_request_duration_seconds_sum{request="AttachVolume"} 0.547694574
cloudprovider_aws_api_request_duration_seconds_count{request="AttachVolume"} 1
...
```
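
To make the histogram output above easier to interpret, here is a minimal Python sketch that parses a subset of those lines and derives the mean latency and the smallest bucket covering all observations. The parser and the helper names (`parse_samples`, `mean_latency`, `smallest_covering_bucket`) are assumptions for illustration; it handles only the simple `name{labels} value` lines shown here, not the full Prometheus exposition format.

```python
import re

METRIC = "cloudprovider_aws_api_request_duration_seconds"

# Subset of the sample output above (same values, fewer buckets).
SAMPLE = """\
cloudprovider_aws_api_request_duration_seconds_bucket{request="AttachVolume",le="0.5"} 0
cloudprovider_aws_api_request_duration_seconds_bucket{request="AttachVolume",le="1"} 1
cloudprovider_aws_api_request_duration_seconds_bucket{request="AttachVolume",le="+Inf"} 1
cloudprovider_aws_api_request_duration_seconds_sum{request="AttachVolume"} 0.547694574
cloudprovider_aws_api_request_duration_seconds_count{request="AttachVolume"} 1
"""

LINE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)\{([^}]*)\}\s+(\S+)$')
LABEL = re.compile(r'(\w+)="([^"]*)"')


def parse_samples(text):
    """Return (name, labels, value) tuples, skipping comments and blank lines."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        match = LINE.match(line)
        if match is None:
            continue  # line shape not handled by this sketch
        name, raw_labels, value = match.groups()
        samples.append((name, dict(LABEL.findall(raw_labels)), float(value)))
    return samples


def mean_latency(samples, request):
    """Mean latency in seconds: _sum divided by _count for one request type."""
    total = count = None
    for name, labels, value in samples:
        if labels.get("request") != request:
            continue
        if name == METRIC + "_sum":
            total = value
        elif name == METRIC + "_count":
            count = value
    if total is None or not count:
        return None
    return total / count


def smallest_covering_bucket(samples, request):
    """Smallest `le` bound whose cumulative count includes every observation."""
    buckets = sorted(
        (float(labels["le"]), value)
        for name, labels, value in samples
        if name == METRIC + "_bucket" and labels.get("request") == request
    )
    if not buckets or not buckets[-1][1]:
        return None
    total = buckets[-1][1]  # the +Inf bucket holds the total count
    for upper, cumulative in buckets:
        if cumulative >= total:
            return upper
    return None


samples = parse_samples(SAMPLE)
print(mean_latency(samples, "AttachVolume"))
print(smallest_covering_bucket(samples, "AttachVolume"))
```

For the sample above, the single `AttachVolume` call took about 0.55 seconds, which is why every bucket below `le="1"` is 0 and every bucket from `le="1"` upward is 1.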

To manually scrape AWS metrics:
```sh
$ export ebs_csi_controller=$(kubectl get lease -n kube-system ebs-csi-aws-com -o=jsonpath="{.spec.holderIdentity}")
$ kubectl port-forward $ebs_csi_controller 3301:3301 -n kube-system
$ curl 127.0.0.1:3301/metrics
```

## Volume Stats Metrics

The EBS CSI Driver emits Kubelet mounted volume metrics for volumes created with the driver.

The following metrics are currently supported:

| Metric name | Metric type | Description | Labels |
|-------------|-------------|-------------|-------------|
|kubelet_volume_stats_capacity_bytes|Gauge|The capacity in bytes of the volume|namespace=\<persistentvolumeclaim-namespace\> <br/> persistentvolumeclaim=\<persistentvolumeclaim-name\>|
|kubelet_volume_stats_available_bytes|Gauge|The number of available bytes in the volume|namespace=\<persistentvolumeclaim-namespace\> <br/> persistentvolumeclaim=\<persistentvolumeclaim-name\>|
|kubelet_volume_stats_used_bytes|Gauge|The number of used bytes in the volume|namespace=\<persistentvolumeclaim-namespace\> <br/> persistentvolumeclaim=\<persistentvolumeclaim-name\>|
|kubelet_volume_stats_inodes|Gauge|The maximum number of inodes in the volume|namespace=\<persistentvolumeclaim-namespace\> <br/> persistentvolumeclaim=\<persistentvolumeclaim-name\>|
|kubelet_volume_stats_inodes_free|Gauge|The number of free inodes in the volume|namespace=\<persistentvolumeclaim-namespace\> <br/> persistentvolumeclaim=\<persistentvolumeclaim-name\>|
|kubelet_volume_stats_inodes_used|Gauge|The number of used inodes in the volume|namespace=\<persistentvolumeclaim-namespace\> <br/> persistentvolumeclaim=\<persistentvolumeclaim-name\>|

For more information about the supported metrics, see `VolumeUsage` within the CSI spec documentation for the [NodeGetVolumeStats](https://github.com/container-storage-interface/spec/blob/master/spec.md#nodegetvolumestats) RPC call.

For more information about metrics in Kubernetes, see the [Metrics For Kubernetes System Components](https://kubernetes.io/docs/concepts/cluster-administration/system-metrics/#metrics-in-kubernetes) documentation.
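
As an illustration of how these gauges combine, the sketch below computes byte and inode usage as fractions of capacity for a single volume. The function name `usage_ratios` and the input values are hypothetical; the input is assumed to be a dictionary mapping the metric names from the table to their current values for one PersistentVolumeClaim.

```python
def usage_ratios(stats):
    """Return byte and inode usage as fractions of capacity.

    `stats` maps metric names (as listed in the table above) to gauge values
    for one volume. A ratio is None when its inputs are missing or the
    capacity is zero.
    """

    def ratio(used_key, total_key):
        used = stats.get(used_key)
        total = stats.get(total_key)
        if used is None or not total:
            return None
        return used / total

    return {
        "bytes": ratio("kubelet_volume_stats_used_bytes",
                       "kubelet_volume_stats_capacity_bytes"),
        "inodes": ratio("kubelet_volume_stats_inodes_used",
                        "kubelet_volume_stats_inodes"),
    }


# Hypothetical values for one volume, not taken from a real cluster.
example = {
    "kubelet_volume_stats_capacity_bytes": 8 * 1024**3,
    "kubelet_volume_stats_used_bytes": 2 * 1024**3,
    "kubelet_volume_stats_inodes": 1000,
    "kubelet_volume_stats_inodes_used": 250,
}
print(usage_ratios(example))
```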

## CSI Operations Metrics

The `csi_operations_seconds` metric reports a latency histogram of kubelet-initiated CSI gRPC calls by gRPC status code.

To manually scrape Kubelet metrics:
```sh
$ kubectl proxy
$ kubectl get --raw /api/v1/nodes/<insert_node_name>/proxy/metrics
```
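
The kubelet endpoint returns every kubelet metric, so it can help to filter the response down to one metric family. The following Python sketch assumes `kubectl proxy` is running on its default address (`127.0.0.1:8001`) and requests the same API path as the command above; the helper names are hypothetical, and the prefix match is deliberately simple.

```python
import urllib.request

# Default listen address of `kubectl proxy` (assumed to be running locally).
PROXY = "http://127.0.0.1:8001"


def kubelet_metrics_url(node_name, proxy=PROXY):
    """Build the API server proxy URL for a node's kubelet metrics."""
    return f"{proxy}/api/v1/nodes/{node_name}/proxy/metrics"


def select_metric(text, metric_name):
    """Keep the HELP/TYPE comments and sample lines of one metric family.

    Sample lines are matched by name prefix, so the _bucket, _sum and _count
    series of a histogram are included (as would any other metric that
    happens to share the prefix).
    """
    kept = []
    for line in text.splitlines():
        if line.startswith("#"):
            parts = line.split()
            if len(parts) >= 3 and parts[1] in ("HELP", "TYPE") and parts[2] == metric_name:
                kept.append(line)
        elif line.startswith(metric_name):
            kept.append(line)
    return kept


def fetch_csi_operation_metrics(node_name):
    """Fetch kubelet metrics through the proxy and keep csi_operations_seconds."""
    with urllib.request.urlopen(kubelet_metrics_url(node_name)) as response:
        body = response.read().decode("utf-8")
    return select_metric(body, "csi_operations_seconds")
```

With the proxy running, calling `fetch_csi_operation_metrics("<insert_node_name>")` returns the matching lines as a list of strings.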
