Consider rewriting latency metrics #72

FZambia · 2016-02-20T07:13:42Z

In Centrifugo 1.4.0 we deprecated timers. Here is an issue to consider making them right and put back in future releases.

This includes: client request time, API request time.

As @banks said in #68:

If we really want them we should use something like https://www.godoc.org/github.com/vulcand/vulcan/metrics#RollingHistogram which uses a robust measurement technique (HDRHistrogram) and supports rolling window of these to have an accurate distribution of the last N minutes etc. If we did that, I'd recommend having something like 15 buckets of 1 min, with each histogram having range 100µs to 10 mins (600mill µs) with 3 significant figures. That gives you very high definition for last 15 mins in a bit over 1MB per timer (on my mac). That seems very reasonable if we have ~20 timers on a modern machine. We could make it configurable if we want to support people on super low memory VPS or something. Then expose in the results from stats at least:
mean
50%ile
99%ile
99.99%ile
max
Looks like it's not threadsafe though so would need to put a mutex around it or send results on a lightly buffered channel and have it update in single goroutine

FZambia · 2016-07-28T21:13:41Z

done in #97 , will be released with next Centrifugo version

FZambia · 2016-09-12T07:39:50Z

I am not very happy with how metrics currently implemented. This is current NodeInfo struct:

type NodeInfo struct {
    mu         sync.RWMutex
    UID        string `json:"uid"`
    Name       string `json:"name"`
    Goroutines int    `json:"num_goroutine"`
    Clients    int    `json:"num_clients"`
    Unique     int    `json:"num_unique_clients"`
    Channels   int    `json:"num_channels"`
    Started    int64  `json:"started_at"`
    Gomaxprocs int    `json:"gomaxprocs"`
    NumCPU     int    `json:"num_cpu"`
    metrics
    updated int64
}

It's located in metrics.go but actually only part of it are metrics... Something like this would be much better:

type NodeInfo struct {
    mu       sync.RWMutex
    UID      string `json:"uid"`
    Name     string `json:"name"`
    Metrics  map[string]int64 `json:"metrics"`
    updated  int64
}

This will allow to abstract various counters and operate by semantic names in code rather than concrete struct fields (i.e. something like Histogram registry but for counters). Also I think it's a good idea to include latencies into that metrics map as flat keys without extra latencies level as it now. Also I don't think we will need floats for metrics. This will allow to keep metrics specific code into separate package.

This is a breaking change unfortunately:(

FZambia · 2016-09-20T07:03:59Z

Current latencies implementation is like quantum measurement:) Asking for metrics affects latency metrics significantly. So per specific API command histograms could really make sense. But this is only possible to insert after net/http and request decoding work already done. Will try to investigate this more.

FZambia · 2016-12-17T11:49:56Z

done in v1.6.0

FZambia mentioned this issue Jul 12, 2016

Move connection and API request log level to DEBUG #96

Closed

FZambia closed this as completed Dec 17, 2016

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Consider rewriting latency metrics #72

Consider rewriting latency metrics #72

FZambia commented Feb 20, 2016

FZambia commented Jul 28, 2016

FZambia commented Sep 12, 2016 •

edited

Loading

FZambia commented Sep 20, 2016

FZambia commented Dec 17, 2016

Consider rewriting latency metrics #72

Consider rewriting latency metrics #72

Comments

FZambia commented Feb 20, 2016

FZambia commented Jul 28, 2016

FZambia commented Sep 12, 2016 • edited Loading

FZambia commented Sep 20, 2016

FZambia commented Dec 17, 2016

FZambia commented Sep 12, 2016 •

edited

Loading