-
Notifications
You must be signed in to change notification settings - Fork 2.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add downsampling metrics #1244
Add downsampling metrics #1244
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it makes total sense (: Thanks.
Let's move to MustRegister
and then LGTM. 👍
Thanks for reviewing, have updated to use |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Awesome, one nit, but not a blocker (:
} | ||
} | ||
|
||
type DownsampleMetrics struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it can be private, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One minor nit. Let me know if it makes sense. I think doing like I suggested makes the code base cleaner - there's less chance that we will have clashes between different "subsystems" who might try to register the same metrics
@@ -65,6 +92,8 @@ func runDownsample( | |||
} | |||
}() | |||
|
|||
metrics := newDownsampleMetrics(reg) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we use extprom.WrapRegistererWithPrefix("thanos_compact_downsample", reg)
here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it makes sense but with thanos_compact
prefix, but not a blocker.
func newDownsampleMetrics(reg *prometheus.Registry) *DownsampleMetrics { | ||
m := new(DownsampleMetrics) | ||
|
||
m.downsamples = prometheus.NewCounterVec(prometheus.CounterOpts{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Then we would only have total
here
Merging in this form, we can improve that later once we actually need to reuse this component. |
Thanks @mattrco ! |
When
compact
goes into its downsampling step, no app metrics are reported, which makes our dashboard look a bit bare for several hours (sometimes days if we have a backlog). Adding a couple of metrics means we can alert if no compactions or downsamples have happened in x period.This is a little less clean than I'd like due to the two callers of
downsampleBucket
having slightly different contexts, however I think it's a reasonable compromise. Let me know what you think!The metrics are labelled in the same way as compact metrics for ease of comparison.
Changes
Add
thanos_compact_downsample_total
andthanos_compact_downsample_failures_total
downsampling metrics.Verification
Tests are passing, will run with real workloads.