add metrics endpoint to monitor API requests #831

pitwegner · 2023-06-20T12:46:58Z

/kind feature

Describe the solution you'd like
I would like a metrics endpoint for data like hcloud_api_requests_total, so that the API requests to Hetzner can be monitored.

Anything else you would like to add:
Similar endpoints are implemented for hcloud-cloud-controller-manager and csi-driver.


janiskemper · 2023-06-21T09:36:29Z

Sounds like a good feature. Would you be able to do a PR? We can support you there, but we currently lack the resources to focus on this.

pitwegner · 2023-06-21T10:17:20Z

My time is also currently limited. I'll try and see what I can do.

apricote · 2023-06-21T10:28:16Z

For the hcloud-go client you need to call WithInstrumentation(registry *prometheus.Registry). I am not sure if this properly handles multiple clients writing to the same registry; I have never tested this. AFAIK CAPH creates a new client per reconciliation loop.

apricote · 2023-07-14T12:18:35Z

We just discovered that it's not possible right now to have multiple instrumented clients on the same registry: hetznercloud/hcloud-go#288

Working on a fix though.

apricote · 2023-09-19T18:59:40Z

Hey all :)

The metrics endpoint already exists in beta.22. It just binds to localhost and its port is not listed in the Pod, but if you port-forward you can access the metrics from controller-runtime:

$ kubectl port-forward -n caph-system deployments/caph-controller-manager 8080:8080
$ curl localhost:8080/metrics | tail
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 70717    0 70717    0     0   750k      0 --:--:-- --:--:-- --:--:--  758k
workqueue_work_duration_seconds_bucket{name="hetznercluster",le="9.999999999999999e-06"} 0
workqueue_work_duration_seconds_bucket{name="hetznercluster",le="9.999999999999999e-05"} 0
workqueue_work_duration_seconds_bucket{name="hetznercluster",le="0.001"} 3
workqueue_work_duration_seconds_bucket{name="hetznercluster",le="0.01"} 21
workqueue_work_duration_seconds_bucket{name="hetznercluster",le="0.1"} 21
workqueue_work_duration_seconds_bucket{name="hetznercluster",le="1"} 78
workqueue_work_duration_seconds_bucket{name="hetznercluster",le="10"} 111
workqueue_work_duration_seconds_bucket{name="hetznercluster",le="+Inf"} 111
workqueue_work_duration_seconds_sum{name="hetznercluster"} 97.379931617
workqueue_work_duration_seconds_count{name="hetznercluster"} 111

Adding the metrics for hcloud-go was also pretty easy. The only issue is that controller-runtime hides its default registry (which is exported at localhost:8080) behind a custom interface. This requires a type assertion back to *prometheus.Registry so we can pass it to WithInstrumentation().
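The type assertion described above can be sketched roughly as follows. To keep the example self-contained, Registerer, Registry, defaultRegistry and withInstrumentation are hypothetical stand-ins for the controller-runtime interface, *prometheus.Registry, the controller-runtime default registry and the hcloud-go option; this is not the code from the actual PR.

```go
package main

import (
	"errors"
	"fmt"
)

// Registerer is a hypothetical stand-in for the narrow interface that hides
// the concrete registry type.
type Registerer interface {
	Register(name string) error
}

// Registry is a hypothetical stand-in for *prometheus.Registry.
type Registry struct {
	names []string
}

func (r *Registry) Register(name string) error {
	r.names = append(r.names, name)
	return nil
}

// defaultRegistry plays the role of the default registry: declared as the
// interface, but backed by the concrete type.
var defaultRegistry Registerer = &Registry{}

// withInstrumentation stands in for a client option that only accepts the
// concrete registry type.
func withInstrumentation(r *Registry) string {
	return fmt.Sprintf("instrumented with %T", r)
}

// concreteRegistry asserts the interface value back to *Registry. The
// comma-ok form returns an error instead of panicking if the registry is
// ever backed by a different implementation.
func concreteRegistry(r Registerer) (*Registry, error) {
	reg, ok := r.(*Registry)
	if !ok {
		return nil, errors.New("metrics registry is not a *Registry")
	}
	return reg, nil
}

func main() {
	reg, err := concreteRegistry(defaultRegistry)
	if err != nil {
		fmt.Println("skipping instrumentation:", err)
		return
	}
	fmt.Println(withInstrumentation(reg))
}
```

Using the comma-ok form rather than a bare assertion is a design choice: the controller keeps running without client metrics if the underlying registry type changes in a later release.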

I will create a PR for the hcloud-go changes.

Not sure if we want a "world-readable" metrics endpoint by default. There is an open CAPI discussion (kubernetes-sigs/cluster-api#7957) as well as a new controller-runtime feature (kubernetes-sigs/controller-runtime#2407) related to this.

batistein · 2023-09-26T23:27:34Z

closing because this is already merged

pitwegner changed the title ~~add metrics endpoint~~ add metrics endpoint to monitor API requests Jun 20, 2023

apricote mentioned this issue Sep 19, 2023

✨ Add metrics from hcloud-go #924

Merged

3 tasks

batistein closed this as completed Sep 26, 2023
