feat(kuma-cp) metrics #993

jakubdyszkiewicz · 2020-08-24T14:43:05Z

Summary

This PR introduces Control Plane metrics:

Latencies and response codes etc. for API Server/Admin Server/Bootstrap Server/DNS Server/SDS/XDS/KDS
XDS: summary of XDS generation (time, count)
XDS: active connections
SDS: summary of SDS generation (time, count)
SDS: cert generations
KDS: summary of KDS generation (time, count)
KDS: client-side stats
Store: latencies of underlying storage
Store cache: number of hits and misses for cache
Static Info about the CP
Leader election
Go (GC, threads etc) and process info

It does not include dashboards.

Implementation

My first approach was to use promauto with the global default Prometheus registry and MustRegister, which panics on failure, but it was a disaster in tests. Therefore, I explicitly pass the registry in the Metric object.

I try to register metrics as "high" up as possible (in Setup, component.go, etc.) to avoid spreading Prometheus code across the codebase.

For latencies, I try to use a Summary, not a Histogram. You can read about the differences here: https://prometheus.io/docs/practices/histograms/. As long as there is no aggregation, Summaries are just easier to use IMHO.
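
The aggregation caveat can be shown with a small worked example. This is a sketch with made-up latencies for two hypothetical instances, using only the standard library rather than the Prometheus client: quantiles computed per instance cannot be combined into a correct overall quantile, while cumulative bucket counts can simply be added.

```go
package main

import (
	"fmt"
	"sort"
)

// median returns the middle value of a sorted copy of xs
// (the upper middle element for even lengths).
func median(xs []float64) float64 {
	s := append([]float64(nil), xs...)
	sort.Float64s(s)
	return s[len(s)/2]
}

// bucketCounts returns cumulative counts: counts[i] is the number of
// observations less than or equal to bounds[i].
func bucketCounts(xs, bounds []float64) []int {
	counts := make([]int, len(bounds))
	for _, x := range xs {
		for i, b := range bounds {
			if x <= b {
				counts[i]++
			}
		}
	}
	return counts
}

// mergeCounts adds two bucket slices of equal length element-wise.
func mergeCounts(a, b []int) []int {
	out := make([]int, len(a))
	for i := range a {
		out[i] = a[i] + b[i]
	}
	return out
}

func main() {
	// Illustrative latencies in milliseconds (assumed values).
	a := []float64{10, 10, 10, 10, 10, 10, 10, 10, 10} // instance A: 9 fast requests
	b := []float64{500}                                 // instance B: 1 slow request
	all := append(append([]float64(nil), a...), b...)

	// Summary-style: averaging per-instance medians is misleading.
	fmt.Println((median(a) + median(b)) / 2) // 255
	fmt.Println(median(all))                 // 10

	// Histogram-style: merged buckets equal buckets over all data.
	bounds := []float64{50, 1000}
	fmt.Println(mergeCounts(bucketCounts(a, bounds), bucketCounts(b, bounds))) // [9 10]
	fmt.Println(bucketCounts(all, bounds))                                     // [9 10]
}
```

For a single control plane instance that is not aggregated with others, this limitation does not apply.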

Documentation

TODO: after I introduce dashboards.

Signed-off-by: Jakub Dyszkiewicz <[email protected]>

nickolaev · 2020-08-25T20:22:31Z

pkg/dns/server.go

 }

 func (h *SimpleDNSServer) parseQuery(m *dns.Msg) {
 	for _, q := range m.Question {
 		switch q.Qtype {
 		case dns.TypeA:
-			serverLog.Info("Query for " + q.Name)
+			serverLog.V(1).Info("query for " + q.Name)


V(1)? Do we need it?

I think it's overkill to log every DNS request at the info level. It's like with the API Server: we don't log every single request. With many services that use DNS, logs would be spammed with "Query for..."

Signed-off-by: Jakub Dyszkiewicz <[email protected]>

jakubdyszkiewicz added 7 commits August 24, 2020 13:07

feat(kuma-cp) expose metrics of Kuma CP

d099cff

Signed-off-by: Jakub Dyszkiewicz <[email protected]>

switch from global registry to local

feb6073

Signed-off-by: Jakub Dyszkiewicz <[email protected]>

metrics transform to passing metrics

e26e740

Signed-off-by: Jakub Dyszkiewicz <[email protected]>

Merge remote-tracking branch 'origin/master' into feat/cp-metrics

5d0586b

Signed-off-by: Jakub Dyszkiewicz <[email protected]>

fix typo

a1c19d0

Signed-off-by: Jakub Dyszkiewicz <[email protected]>

add metric tests

6c4a284

Signed-off-by: Jakub Dyszkiewicz <[email protected]>

add go and process metrics

72b6317

Signed-off-by: Jakub Dyszkiewicz <[email protected]>

jakubdyszkiewicz requested a review from a team as a code owner August 24, 2020 14:43

change summary to histogram for store

54191f0

Signed-off-by: Jakub Dyszkiewicz <[email protected]>

nickolaev approved these changes Aug 25, 2020

jakubdyszkiewicz added 2 commits August 27, 2020 09:50

extra stats

4390520

Signed-off-by: Jakub Dyszkiewicz <[email protected]>

Merge remote-tracking branch 'origin/master' into feat/cp-metrics

4a90191

Signed-off-by: Jakub Dyszkiewicz <[email protected]>

lobkovilya approved these changes Sep 8, 2020

jakubdyszkiewicz added 2 commits September 9, 2020 10:42

change e2e timeout

94597f1

Signed-off-by: Jakub Dyszkiewicz <[email protected]>

fix mux client

1157fdf

Signed-off-by: Jakub Dyszkiewicz <[email protected]>

jakubdyszkiewicz merged commit 4eb1773 into master Sep 9, 2020

jakubdyszkiewicz deleted the feat/cp-metrics branch September 9, 2020 10:30
