-
Notifications
You must be signed in to change notification settings - Fork 2.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
add prometheus metrics #2466
add prometheus metrics #2466
Conversation
@tifayuki For simplicity can we split this into 3 separate PRs? I think it will be easier for reviewers and you too :) |
@manishtomar |
@tifayuki Are you still working on this? |
@stevvooe |
registry/storage/driver/base/base.go
Outdated
return base.setDriverName(base.StorageDriver.Move(ctx, sourcePath, destPath)) | ||
start := time.Now() | ||
err := base.setDriverName(base.StorageDriver.Move(ctx, sourcePath, destPath)) | ||
prometheus.BlobAction.WithValues("move").UpdateSince(start) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This isn't really a blob action. It is a storage driver action.
metrics/prometheus/prometheus.go
Outdated
|
||
var ( | ||
// BlobAction is the metrics of blob related operations | ||
BlobAction = storageNamespace.NewLabeledTimer("blob_action", "The number of seconds it takes to get/put blob", "action") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These are used for storage actions, not blob actions.
metrics/prometheus/prometheus.go
Outdated
|
||
const ( | ||
// Namespace is the namespace of prometheus metrics | ||
Namespace = "docker_distribution" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd prefer if this were registry
or something like that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wait, this isn't actually a namespace: define a single namespace here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or, call it NamespacePrefix
to be clear.
@@ -129,6 +129,11 @@ type Configuration struct { | |||
Debug struct { | |||
// Addr specifies the bind address for the debug server. | |||
Addr string `yaml:"addr,omitempty"` | |||
// Prometheus configures the Prometheus telemetry endpoint. | |||
Prometheus struct { | |||
Enabled bool `yaml:"enabled,omitempty"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Document what this fields do here and in the documentation.
metrics/prometheus/prometheus.go
Outdated
@@ -0,0 +1,28 @@ | |||
package prometheus |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No need to have it be called "prometheus". Just have a single package named metrics
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, rather than defining all the metrics here, just define the namespace here and define the metrics where they are used.
@@ -213,6 +213,9 @@ http: | |||
email: [email protected] | |||
debug: | |||
addr: localhost:5001 | |||
prometheus: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There needs to be a section for this.
@@ -213,6 +213,9 @@ http: | |||
email: [email protected] | |||
debug: | |||
addr: localhost:5001 | |||
prometheus: | |||
enabled: true | |||
path: /metrics |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a default?
registry/storage/driver/base/base.go
Outdated
fi, e := base.StorageDriver.Stat(ctx, path) | ||
prometheus.BlobAction.WithValues("state").UpdateSince(start) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
stat
, not state
.
registry/storage/driver/base/base.go
Outdated
storagedriver "github.com/docker/distribution/registry/storage/driver" | ||
"time" | ||
) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Make sure to decorate all the operations. Do we have a way to see writer duration? Writer bytes?
registry/storage/driver/base/base.go
Outdated
str, e := base.StorageDriver.List(ctx, path) | ||
prometheus.BlobAction.WithValues("list").UpdateSince(start) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can these somehow include the name of the driver that is in use? That would be helpful for comparing driver implementations in production.
metrics/prometheus/prometheus.go
Outdated
BlobAction = storageNamespace.NewLabeledTimer("blob_action", "The number of seconds it takes to get/put blob", "action") | ||
|
||
// CacheRequestCount is the number of total cache request received/hits/misses | ||
CacheRequestCount = storageNamespace.NewLabeledCounter("cache_request", "The number of cache request received", "type") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cache
or something like that is fine. CacheRequest
makes me think of counting requests that were cached.
As part of this PR, please provide an output of the metrics so we can ensure the format looks correct. |
Codecov Report
@@ Coverage Diff @@
## master #2466 +/- ##
==========================================
- Coverage 60.81% 51.69% -9.13%
==========================================
Files 129 126 -3
Lines 11830 11500 -330
==========================================
- Hits 7195 5945 -1250
- Misses 3733 4810 +1077
+ Partials 902 745 -157
Continue to review full report at Codecov.
|
registry/storage/driver/base/base.go
Outdated
) | ||
|
||
var ( | ||
// storageAction is the metrics of blob related operations | ||
storageAction = prometheus.StorageNamespace.NewLabeledTimer("storage_action", "The number of seconds that the storage action takes", "driver", "action") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This stutters: it becomes registry_storage_storage_action
. Should just be action
here.
sample output with push/pull:
|
registry/registry.go
Outdated
@@ -317,6 +328,11 @@ func alive(path string, handler http.Handler) http.Handler { | |||
|
|||
handler.ServeHTTP(w, r) | |||
}) | |||
|
|||
httpMetrics := prometheus.RootHTTPNamespace.NewDefaultHttpMetrics("root") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this counter is necessary: it will count for all requests. This can just be rolled up from the other counters.
registry/storage/driver/base/base.go
Outdated
return base.setDriverName(base.StorageDriver.PutContent(ctx, path, content)) | ||
start := time.Now() | ||
err := base.setDriverName(base.StorageDriver.PutContent(ctx, path, content)) | ||
storageAction.WithValues(base.Name(), "putcontent").UpdateSince(start) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should use camelcase: PutContent
.
40d9313
to
14dc17a
Compare
LGTM |
fffacfc
to
5752fc4
Compare
89503d6
to
beccba4
Compare
at the first iteration, only the following metrics are collected: - HTTP metrics of each API endpoint - cache counter for request/hit/miss - histogram of storage actions, including: GetContent, PutContent, Stat, List, Move, and Delete Signed-off-by: tifayuki <[email protected]>
beccba4
to
e3c37a4
Compare
Signed-off-by: tifayuki [email protected]
depends on:
docker/go-metrics#15
docker/go-metrics#16
cc @nandhini915 @manishtomar