-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
metrics: refactor histogram bucket generation and testing #107388
metrics: refactor histogram bucket generation and testing #107388
Conversation
bf1cb97
to
81440e9
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This makes sense to me! I have a non-blocking question about how to integrate this with #104302. If we want to implement a rule like "optionally enable native histograms for all histograms following an exponential distribution" or "optionally enable native histograms for all histograms that represent a size", would we want to make distribution or histogram type public? We might also want to pass a StaticBucketConfig to NewHistogram rather than a []float64 of buckets. Or should those details not be exposed elsewhere?
81440e9
to
f920b0c
Compare
I like those rules! We'd be defining said rules within the metrics package, right? In general, I don't think we want to export types outside their package until we have a reason to do so. So I've just made these types "private" for now - is that what you were suggesting?
I like the idea of keeping bucket generation in its own file ( I imagine the configs (distribution, min, max, count) for these histograms will continue to be per-category of metric (e.g., IO Latency metrics). And the fit of that category will be tested by (forthcoming) integration tests, which ideally will be able to suggest a better fit, or a new category. Does that sound sensible? |
What I'm wondering is: how should we tell Or we could do something more explicit, like adding Yet another option: we could expose native histograms for many or even all histograms, then decide whether we want to use conventional or native buckets on the prometheus side. What's your preference? I also don't want to hold up this PR, so we can also discuss in #104302. |
That makes sense. The only issue I can see is that we might want to pass non-generated, static buckets (for tests mainly) to |
I don't really have a preference on where the
This wouldn't be a terrible idea since native histograms are so space-efficient... We'd just want the bucket sizes to be big enough to make up for any precision lost for the metrics that don't follow exponential distributions. Any other cons to using this approach? |
82680c0
to
203ec16
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TFTR @abarganier!
I think I got to all your comments. Reworked the StaticBucketConfigs
map. PTAL.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @abarganier, @msirek, @rhu713, and @samiskin)
pkg/ccl/sqlproxyccl/metrics.go
line 240 at r3 (raw file):
don't we have these static buckets already assigned to public vars in the
metric
package?
On master, we have exported vars for the static buckets (of type []float64
), to pass to NewHistogram
here. But we are now passing bucket configs to NewHistogram
.
If the concern is with the performance of map lookups, we can use ints instead of strings as the keys for O(1) lookups? I went ahead and made that change, but happy to rethink that.
pkg/util/metric/histogram_buckets.go
line 26 at r1 (raw file):
Previously, abarganier (Alex Barganier) wrote…
nit: comments for this and other newly types created in this file.
Done.
pkg/util/metric/histogram_buckets.go
line 32 at r1 (raw file):
Previously, abarganier (Alex Barganier) wrote…
nit: if we're gonna leave these commented out, maybe add a bit more explanation about what we plan to do with these in the future.
I removed them, and instead added a to-do.
pkg/util/metric/histogram_buckets.go
line 44 at r1 (raw file):
Previously, abarganier (Alex Barganier) wrote…
nit: a bit more explanation about what that means/its affects would be helpful.
I actually think we might as well just take this precision testing stuff out and save it for the actual roachtest PR (especially if that work takes longer than expected to get merged).
pkg/util/metric/histogram_buckets.go
line 50 at r1 (raw file):
Previously, abarganier (Alex Barganier) wrote…
nit: for these and the other bucket configs, some comments converting these min/max values into units more.easily parsed (E.g.
ms
,s
,kb
, etc.) would be helpful for inspection/readability later on.
Done.
pkg/util/metric/histogram_buckets.go
line 116 at r1 (raw file):
Previously, abarganier (Alex Barganier) wrote…
nit: I think generally the pattern we take is assigning the results of this call to a package-level variable at initialization. Do we need it to be called each time
getBuckets
is called? It could lead to some inconsistent state if the env var value is changed while the bucket vars are still being initialized.
I actually think we might as well just take this precision testing stuff out and save it for the actual roachtest PR (especially if that work takes longer than expected to get merged).
But I'll follow the recommended pattern in that forthcoming PR.
pkg/util/metric/histogram_buckets.go
line 114 at r3 (raw file):
Previously, abarganier (Alex Barganier) wrote…
nit: can this function be defined on the
staticBucketConfig
type itself? Then we could callconfig.GetBuckets()
?
Done.
95bc8d9
to
0d24080
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for addressing the comments! Just one last thing re: the map usage, lmk what you think.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @ericharmeling, @jmcarp, @msirek, @rhu713, and @samiskin)
pkg/ccl/sqlproxyccl/metrics.go
line 240 at r3 (raw file):
Previously, ericharmeling (Eric Harmeling) wrote…
don't we have these static buckets already assigned to public vars in the
metric
package?On master, we have exported vars for the static buckets (of type
[]float64
), to pass toNewHistogram
here. But we are now passing bucket configs toNewHistogram
.If the concern is with the performance of map lookups, we can use ints instead of strings as the keys for O(1) lookups? I went ahead and made that change, but happy to rethink that.
It wasn't so much about about the performance of the map lookups, but more the convenience/intuitiveness the API. Maybe I'm missing something, but what purpose does the map serve if its elements are essentially vars defined at initialization anyway?
Could we not just, for example, have BucketConfig: metric.IOLatencyBuckets
instead of BucketConfig: metric.StaticBucketConfigs[metric.IOLatencyBuckets]
here and elsewhere, removing the need for all these diffs?
I guess I'm not seeing the histType
type as necessary. Could each type not just be an exported package-level variable instead?
pkg/util/metric/histogram_buckets.go
line 44 at r1 (raw file):
Previously, ericharmeling (Eric Harmeling) wrote…
I actually think we might as well just take this precision testing stuff out and save it for the actual roachtest PR (especially if that work takes longer than expected to get merged).
SGTM!
pkg/util/metric/histogram_buckets.go
line 50 at r1 (raw file):
Previously, ericharmeling (Eric Harmeling) wrote…
Done.
Thanks!
pkg/util/metric/histogram_buckets.go
line 116 at r1 (raw file):
Previously, ericharmeling (Eric Harmeling) wrote…
I actually think we might as well just take this precision testing stuff out and save it for the actual roachtest PR (especially if that work takes longer than expected to get merged).
But I'll follow the recommended pattern in that forthcoming PR.
Sounds good, thanks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @ericharmeling, @jmcarp, @msirek, @rhu713, and @samiskin)
pkg/ccl/sqlproxyccl/metrics.go
line 240 at r3 (raw file):
Previously, abarganier (Alex Barganier) wrote…
It wasn't so much about about the performance of the map lookups, but more the convenience/intuitiveness the API. Maybe I'm missing something, but what purpose does the map serve if its elements are essentially vars defined at initialization anyway?
Could we not just, for example, have
BucketConfig: metric.IOLatencyBuckets
instead ofBucketConfig: metric.StaticBucketConfigs[metric.IOLatencyBuckets]
here and elsewhere, removing the need for all these diffs?I guess I'm not seeing the
histType
type as necessary. Could each type not just be an exported package-level variable instead?
Also, we could leave the Buckets
member of metric.HistogramOptions
as the same name, instead of changing it to BucketConfig
, and completely remove the need for any of these diffs altogether. The type of Buckets
in the options should be enough to convey to the user that they are configs and not []float64
0d24080
to
166901c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TFTR! I've updated the PR following your latest feedback.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @abarganier, @jmcarp, @msirek, @rhu713, and @samiskin)
pkg/ccl/sqlproxyccl/metrics.go
line 240 at r3 (raw file):
It wasn't so much about about the performance of the map lookups, but more the convenience/intuitiveness the API. Maybe I'm missing something, but what purpose does the map serve if its elements are essentially vars defined at initialization anyway?
Ah okay. I think my main motivation for using a map for StaticBucketConfigs
here is to simplify the addition of new bucket types. When we introduce a new bucket type, we'd just add it to the map, and then it will automatically be part of the test loop in pkg/util/metric/histogram_buckets_test.go
. Declaring an explicit var for each would require us to declare an exported var for each, and then append each one individually to a slice that replaces the StaticBucketConfigs
map for the test loop. If you forget to append the exported var to the slice, then its generation could go untested.
On reflection though, this motivation is a bit weak, and I'm pretty ambivalent about it. I've updated the PR as suggested. See the latest commit.
I guess I'm not seeing the
histType
type as necessary. Could each type not just be an exported package-level variable instead?
Removed the type and exported each as a package-level var as suggested, in the latest commit.
Also, we could leave the
Buckets
member ofmetric.HistogramOptions
as the same name, instead of changing it toBucketConfig
, and completely remove the need for any of these diffs altogether. The type ofBuckets
in the options should be enough to convey to the user that they are configs and not[]float64
This PR doesn't remove the original Buckets []float64
from metric.HistogramOptions
, since we have some tests that use a custom (non-config-generated) bucket (see #107388 (comment)).
166901c
to
f9358b8
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @ericharmeling, @jmcarp, @msirek, @rhu713, and @samiskin)
pkg/ccl/sqlproxyccl/metrics.go
line 240 at r3 (raw file):
This PR doesn't remove the original
Buckets []float64
frommetric.HistogramOptions
, since we have some tests that use a custom (non-config-generated) bucket (see #107388 (comment)).
Ack, nevermind on that one then.
This commit refactors histogram bucketing for legibility and composibility. It also introduces a data-driven test for histogram bucket generation. This refactor should make it easier to add additional metric categories, distributions, and bucket types. Part of cockroachdb#97144. Release note: None
f9358b8
to
c748c25
Compare
bors r+ |
Build succeeded: |
blathers backport 23.1 |
Encountered an error creating backports. Some common things that can go wrong:
You might need to create your backport manually using the backport tool. error creating merge commit from c748c25 to blathers/backport-release-23.1-107388: POST https://api.github.com/repos/cockroachdb/cockroach/merges: 409 Merge conflict [] you may need to manually resolve merge conflicts with the backport tool. Backport to branch 23.1 failed. See errors above. 🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf. |
This commit refactors histogram bucketing for legibility and composibility. It also introduces a data-driven test for histogram bucket generation.
This refactor should make it easier to add additional metric categories, distributions, and bucket types.
Part of #97144.
Release note: None