Use upstream Prometheus client and Spring integration #1609
The Spring counter metrics don't become, and shouldn't become, Prometheus counter metrics, since in Spring Boot counters can decrease. Current practice is to just make metrics like this gauges.
We should also see more JVM metrics exposed, in a much more Prometheus-friendly way, letting us improve the dashboard further.
<scope>test</scope>
<groupId>io.prometheus</groupId>
<artifactId>simpleclient</artifactId>
<version>0.0.23</version>
seems ok to have an unstable dep here because the server-side dependency is optional (which makes it a part of the all-jar, but not a strict dep for customizers)
Sorry, I don't speak fluent Maven. By unstable dep, do you mean we shouldn't pin the version of io.prometheus:simpleclient? Or that I shouldn't remove org.springframework:spring-test?
By unstable I meant the patch-only version number :)
nit: maybe use a property to coordinate these? e.g. in the properties section:
<simpleclient.version>0.0.23</simpleclient.version>
then in places like here:
${simpleclient.version}
Ah, I see.
Yep, will set up a property for the version number.
It might be worth sending some sample traffic against this to ensure that the collector metrics show up (they are probably the most important metrics), e.g. by running one of the example projects like https://github.com/openzipkin/pyramid_zipkin-example
The upstream client exposes all the Spring Boot Actuator metrics, in almost exactly the same way as the previous Zipkin implementation. Still, due diligence is due, I'll check.
Metrics after sending a few spans using Methinks we're fine:
Oh! Then I guess I should add them to the Grafana dashboard.
they are probably the most important metric
Oh! Then I guess I should add them to the Grafana dashboard!
Yes, I had a nag in the back of my head to nag you on the other issue. By naming convention, substitute http with kafka (sqs, scribe) etc., as not everyone sends spans via the http transport.
Messages, spans and bytes are interesting metrics, especially if coordinated client-side. Drop metrics are something that one would alert on at some point.
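Important note, just realized: this change does actually break the currently used metric names. The Zipkin implementation trims the counter_ and gauge_ prefixes (and sets the Prometheus metric type accordingly), while the upstream one does not trim those prefixes, and makes everything a gauge in Prometheus (see #1609 (comment)).
Assumption: not a lot of people are currently using these metrics.
Proposal: let's drop as much custom logic as we can, and break the metric names now, before publishing the dashboard using the metric names.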
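To make the naming difference concrete, here is a minimal sketch of the two behaviors discussed in this thread: trimming the counter_ / gauge_ prefix and deriving the type from it, versus keeping the name untouched and exposing everything as a gauge. The class, method, and metric names are hypothetical illustrations, not code from Zipkin or from the upstream Prometheus client, and the fallback for unprefixed names is an assumption.

```java
/** Hypothetical helper for illustration only; not Zipkin or client_java code. */
final class SpringMetricNames {

    /** The two Prometheus metric types relevant to this discussion. */
    enum Type { COUNTER, GAUGE }

    /** A metric name paired with the type it would be exposed as. */
    static final class Mapped {
        final String name;
        final Type type;

        Mapped(String name, Type type) {
            this.name = name;
            this.type = type;
        }
    }

    /** Prefix-trimming behavior: strip the prefix and derive the type from it. */
    static Mapped trimPrefix(String springName) {
        if (springName.startsWith("counter_")) {
            return new Mapped(springName.substring("counter_".length()), Type.COUNTER);
        }
        if (springName.startsWith("gauge_")) {
            return new Mapped(springName.substring("gauge_".length()), Type.GAUGE);
        }
        // Assumption: names without a known prefix are treated as gauges.
        return new Mapped(springName, Type.GAUGE);
    }

    /** Prefix-keeping behavior: name is unchanged and everything is a gauge. */
    static Mapped keepPrefix(String springName) {
        return new Mapped(springName, Type.GAUGE);
    }
}
```

With a made-up name such as counter_example_requests, trimPrefix yields example_requests as a counter, while keepPrefix yields counter_example_requests as a gauge. Any dashboard query written against the first form would stop matching after switching to the second.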
Assumption: not a lot of people are currently using these metrics.
Proposal: let's drop as much custom logic as we can, and break the metric
names now, before publishing the dashboard using the metric names.
I'd look at the changelog for the directory and ping anyone implicated. After that, go for it.
On Mon, Jun 12, 2017 at 5:40 PM, Zoltán Nagy ***@***.***> wrote:
Important note, just realized: this change does actually break the currently used metric names. The Zipkin implementation trims the counter_ and gauge_ prefixes (and sets the Prometheus metric type accordingly), while the upstream one does not trim those prefixes, and makes everything a gauge in Prometheus (see #1609 (comment)).
Assumption: not a lot of people are currently using these metrics.
Proposal: let's drop as much custom logic as we can, and break the metric names now, before publishing the dashboard using the metric names.
@klette Having implemented the original Zipkin Prometheus exporter, what are your thoughts on this PR? Especially #1609 (comment)? (That exhausts the list of people who contributed to the Prometheus exporter 😄)
@@ -27,6 +27,7 @@

<properties>
  <main.basedir>${project.basedir}/../..</main.basedir>
  <prometheus_simpleclient.version>0.0.23</prometheus_simpleclient.version>
the fix to the aforementioned bug is in master now, so guessing 0.0.24 will sort it out
0.0.26 is out, so probably time to see if we're sorted?
😨 Sad response time is sad. Yep, I'll check it out today.
Confirmed, Prometheus can successfully scrape data with 0.0.26. Pushed a commit; I'll start updating the Grafana dashboard for the new metric names and upload it once this is merged. Then we can hopefully also merge openzipkin-attic/docker-zipkin#135
@klette nagtime!
Will merge after testing against dashboard
1f91ddc to 0104dde
gracias!
This change replaces our custom logic for exposing metrics with the implementation in the upstream Prometheus client, as noted in #1144 (comment). The notable improvement (other than less code) is the addition of histogram data of response times, using a servlet filter (the data looks something like this: https://gist.github.com/abesto/46aeb18ab45e5126ccebc515eb3fc99f)
Metrics comparison: before / after
There were two issues that blocked us from merging this for a while:
1. Prometheus could not scrape /prometheus (even though the endpoint loaded from a browser just fine).
2. The upstream client does not use the counter. and gauge. prefixes to figure out the right metric type; it always defaults to gauge. Spring Boot counter metrics are now counters in Prometheus. prometheus/client_java#254 aims to fix this.