Metrics are all gauges, which is incorrect #105

Closed
ctoestreich opened this issue Mar 12, 2020 · 5 comments

Comments

ctoestreich commented Mar 12, 2020

It appears that the creation of metrics incorrectly treats them all as gauges. While this works for viewing metrics, it is handled oddly by systems like DataDog, where a gauge is a value that can go up and down while a count is a monotonically increasing number. When viewing a metric like kafka.consumer.records-consumed-total, since it is a gauge it is not treated as a count or a rate, and the count over time is not shown, just the last reported gauge value.

[Screenshot, 2020-03-12 9:34 AM]

https://micrometer.io/docs/concepts#_gauges
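For illustration only (this is not the binder's actual code), a minimal Micrometer sketch of the difference; the SimpleMeterRegistry, the AtomicLong standing in for the Kafka client's metric value, and the second meter name are assumptions made just to show both registrations side by side:

```java
import io.micrometer.core.instrument.FunctionCounter;
import io.micrometer.core.instrument.Gauge;
import io.micrometer.core.instrument.MeterRegistry;
import io.micrometer.core.instrument.simple.SimpleMeterRegistry;

import java.util.concurrent.atomic.AtomicLong;

public class GaugeVsCounterSketch {

    public static void main(String[] args) {
        MeterRegistry registry = new SimpleMeterRegistry();

        // Hypothetical holder standing in for the Kafka client's metric value.
        AtomicLong recordsConsumedTotal = new AtomicLong();

        // What happens today (roughly): the value is registered as a gauge, so a
        // backend like DataDog only ever sees the last reported value.
        Gauge.builder("kafka.consumer.records-consumed-total", recordsConsumedTotal, AtomicLong::doubleValue)
                .register(registry);

        // What a monotonically increasing total should be: a FunctionCounter, which
        // lets the backend derive per-window sums and rates from the deltas.
        FunctionCounter.builder("kafka.consumer.records-consumed-total.as-counter", recordsConsumedTotal, AtomicLong::doubleValue)
                .register(registry);
    }
}
```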

Looking here, there are varying types of meters registered for the different metrics: https://github.com/micrometer-metrics/micrometer/blob/master/micrometer-core/src/main/java/io/micrometer/core/instrument/binder/kafka/KafkaConsumerMetrics.java

I am still doing research into the code that we wrote a while back to see how much control we have over this and how to appropriately determine the type of meter to register.

For Spring Cloud Stream, the topic lag is definitely a gauge, and that makes sense here: https://github.com/spring-cloud/spring-cloud-stream-binder-kafka/blob/master/spring-cloud-stream-binder-kafka/src/main/java/org/springframework/cloud/stream/binder/kafka/KafkaBinderMetrics.java#L106

[Screenshot, 2020-03-12 10:22 AM]

When Kafka registers "metrics" it then expects you to create a meter for each metric, and it doesn't seem to indicate what "type" of meter it intends you to create. I don't see an auto-magical type inference available here, except for doing some sort of regex or an explicit name<->type registry in Micronaut, similar to how the types are made explicit here: https://github.com/micrometer-metrics/micrometer/blob/master/micrometer-core/src/main/java/io/micrometer/core/instrument/binder/kafka/KafkaConsumerMetrics.java#L93

[Screenshot, 2020-03-12 10:28 AM]
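To illustrate the kind of explicit name<->type mapping mentioned above, here is a hypothetical sketch (the class, the suffix list, and the registration method are assumptions, not the binder's real code): names ending in -total are treated as monotonically increasing and registered as FunctionCounters, everything else stays a gauge.

```java
import io.micrometer.core.instrument.FunctionCounter;
import io.micrometer.core.instrument.Gauge;
import io.micrometer.core.instrument.MeterRegistry;

import java.util.Set;
import java.util.function.Supplier;

// Hypothetical sketch of an explicit name -> meter-type decision, similar in
// spirit to micrometer-core's KafkaConsumerMetrics; not the binder's real code.
class KafkaMeterTypeSketch {

    // Assumption: names ending in "-total" are monotonically increasing counts.
    private static final Set<String> COUNTER_SUFFIXES = Set.of("-total");

    void register(MeterRegistry registry, String metricName, Supplier<Double> value) {
        boolean isCounter = COUNTER_SUFFIXES.stream().anyMatch(metricName::endsWith);
        if (isCounter) {
            // FunctionCounters let the backend compute rates and per-window sums.
            FunctionCounter.builder(metricName, value, v -> v.get()).register(registry);
        } else {
            // Everything else (e.g. consumer lag) remains a gauge.
            Gauge.builder(metricName, value, v -> v.get()).register(registry);
        }
    }
}
```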

ctoestreich (Author)

@graemerocher I created an approach similar to the core Micrometer meter creation. I am still not sure even those are 100% correct, but it is closer. Before merging #106 I will do some local testing to see whether the data is produced into time-series collectors more appropriately.

graemerocher (Contributor)

Will take a look

ctoestreich (Author) commented Mar 14, 2020

This change appears to be working as intended: the count is now visualized as a last reported value (diff) as opposed to a gauge (constant).

Current Gauge

[Screenshot, 2020-03-14 12:35 PM]

With Change To Count Type

[Screenshot, 2020-03-14 12:35 PM]

ctoestreich (Author) commented Mar 14, 2020

To add further clarity: when summing the gauge metric over a window, the metric is incorrectly added for every window slice. If 2500 messages were reported once, the value stays 2500, which throws off the average, and you can't really calculate sum/window because you add up 2500 n times, giving a nonsensical metric.
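Spelling out that arithmetic with assumed numbers (2500 records reported once, a 10-interval rollup window), just as a sketch:

```java
public class WindowSumSketch {

    public static void main(String[] args) {
        // Assumed numbers: 2500 records consumed once, 10 reporting intervals per window.
        double reported = 2500;
        int intervals = 10;

        // Gauge: the last reported value is repeated every interval, so summing the
        // window double-counts the same reading.
        double gaugeSum = reported * intervals;           // 25000 -- nonsensical as a count

        // Count: the delta is submitted per interval (2500 once, then 0), so the
        // window sum is the real total and a rate can be derived from it.
        double countSum = reported + 0 * (intervals - 1); // 2500

        System.out.println("gauge sum over window = " + gaugeSum);
        System.out.println("count sum over window = " + countSum);
    }
}
```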

The graph below, when treated as a count type, shows that 1 came in and then 0 for every other reporting window, so you can sum the window to get an accurate real count, along with a rate if needed.

graemerocher (Contributor)

Thanks for the contribution
