-
Notifications
You must be signed in to change notification settings - Fork 1.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Handle overlapping metrics from different jobs in prometheus exporter #1096
Conversation
|
Codecov Report
@@ Coverage Diff @@
## master #1096 +/- ##
==========================================
+ Coverage 86.43% 86.51% +0.07%
==========================================
Files 198 198
Lines 14172 14189 +17
==========================================
+ Hits 12250 12275 +25
+ Misses 1469 1464 -5
+ Partials 453 450 -3
Continue to review full report at Codecov.
|
Test failure looks unrelated? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The test failure looks unrelated. Unfortunately, I do not have permissions to re-trigger the test. You could push an amended commit to re-trigger the test or the code owner merging the PR can re-trigger it. But changes LGTM.
Signed-off-by: Liam White <[email protected]>
Signed-off-by: Liam White <[email protected]>
Signed-off-by: Liam White <[email protected]>
Signed-off-by: Liam White <[email protected]>
Signed-off-by: Liam White <[email protected]>
Did a rebase 🤞. |
buf.WriteString(metric.GetMetricDescriptor().GetName()) | ||
labelKeys := metric.GetMetricDescriptor().GetLabelKeys() | ||
for _, labelKey := range labelKeys { | ||
buf.WriteString("-" + labelKey.Key) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are dashes allowed in metric names and label keys? If yes then this can result in the same metric signature for different metrics. The following 2 different metrics produce the same signature:
- Name: abc
Label key: d-ef
Signature: abc-d-ef - Name: abc-d
Label key: ef
Signature: abc-d-ef
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think its underscores only? This signature function is a copy-paste from the exporter. I guess it doesn't hurt to change it though, would you like me to do so?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doc seems to indicate dashes aren't allowed in metric names or labels in prometheus.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@tigrannajaryan I think we're good to merge then 🙂
…open-telemetry#1096) This commit fixes the following bug: **The bug** if otel-collector/agent scrapes from two endpoints which emits metrics of the same name(even with different label), it seems to only export metrics from one of the endpoints in a random fashion. **Steps to reproduce** 1. modified the [demo example](https://github.com/open-telemetry/opentelemetry-collector/tree/master/examples/demo) in [this commit](jhengy@933679d) - create a cloned `metrics-load-generator2` which emits exactly the same metrics as `metrics-load-generator` except the `source` label 2. use [an older image](jhengy@bad29d9) (from 28 May 2020) due to the problem encountered with the latest otel-collector docker image, i.e. details can be found in [this issue](open-telemetry#1075) 3. run the modified demo example - `cd exaples/demo` - `docker-compose up` - `curl localhost:8889/metrics` **What did you expect to see?** Will see metrics from both metrics-load-generator(source=source1) and metrics-load-generator2( source=source2) **What did you see instead?** At anytime, only see metrics from one of the sources. Sometimes seeing this (only observe metrics from the metrics-load-generator service): ``` # HELP promexample_opdemo_latency The various latencies of the methods # TYPE promexample_opdemo_latency histogram promexample_opdemo_latency_bucket{client="cli",label1="value1",method="repl",source="source1",le="10"} 86 promexample_opdemo_latency_bucket{client="cli",label1="value1",method="repl",source="source1",le="50"} 448 promexample_opdemo_latency_bucket{client="cli",label1="value1",method="repl",source="source1",le="100"} 783 promexample_opdemo_latency_bucket{client="cli",label1="value1",method="repl",source="source1",le="200"} 802 promexample_opdemo_latency_bucket{client="cli",label1="value1",method="repl",source="source1",le="400"} 846 promexample_opdemo_latency_bucket{client="cli",label1="value1",method="repl",source="source1",le="800"} 939 promexample_opdemo_latency_bucket{client="cli",label1="value1",method="repl",source="source1",le="1000"} 973 promexample_opdemo_latency_bucket{client="cli",label1="value1",method="repl",source="source1",le="1400"} 1000 promexample_opdemo_latency_bucket{client="cli",label1="value1",method="repl",source="source1",le="2000"} 1007 promexample_opdemo_latency_bucket{client="cli",label1="value1",method="repl",source="source1",le="5000"} 1040 promexample_opdemo_latency_bucket{client="cli",label1="value1",method="repl",source="source1",le="10000"} 1088 promexample_opdemo_latency_bucket{client="cli",label1="value1",method="repl",source="source1",le="15000"} 1122 promexample_opdemo_latency_bucket{client="cli",label1="value1",method="repl",source="source1",le="+Inf"} 1132 promexample_opdemo_latency_sum{client="cli",label1="value1",method="repl",source="source1"} 1.2317093098059976e+06 promexample_opdemo_latency_count{client="cli",label1="value1",method="repl",source="source1"} 1132 # HELP promexample_opdemo_line_counts The counts of the lines in # TYPE promexample_opdemo_line_counts counter promexample_opdemo_line_counts{client="cli",label1="value1",method="repl",source="source1"} 3424 # HELP promexample_opdemo_line_lengths The lengths of the various lines in # TYPE promexample_opdemo_line_lengths histogram promexample_opdemo_line_lengths_bucket{client="cli",label1="value1",method="repl",source="source1",le="10"} 27 promexample_opdemo_line_lengths_bucket{client="cli",label1="value1",method="repl",source="source1",le="20"} 61 promexample_opdemo_line_lengths_bucket{client="cli",label1="value1",method="repl",source="source1",le="50"} 155 promexample_opdemo_line_lengths_bucket{client="cli",label1="value1",method="repl",source="source1",le="100"} 324 promexample_opdemo_line_lengths_bucket{client="cli",label1="value1",method="repl",source="source1",le="150"} 481 promexample_opdemo_line_lengths_bucket{client="cli",label1="value1",method="repl",source="source1",le="200"} 662 promexample_opdemo_line_lengths_bucket{client="cli",label1="value1",method="repl",source="source1",le="500"} 1669 promexample_opdemo_line_lengths_bucket{client="cli",label1="value1",method="repl",source="source1",le="800"} 2722 promexample_opdemo_line_lengths_bucket{client="cli",label1="value1",method="repl",source="source1",le="+Inf"} 3424 promexample_opdemo_line_lengths_sum{client="cli",label1="value1",method="repl",source="source1"} 1.7351559999999993e+06 promexample_opdemo_line_lengths_count{client="cli",label1="value1",method="repl",source="source1"} 3424 # HELP promexample_opdemo_process_counts The various counts # TYPE promexample_opdemo_process_counts counter promexample_opdemo_process_counts{client="cli",label1="value1",method="repl",source="source1"} 1132 ``` sometimes seeing this(only see metrics from the metrics-load-generator2 service): ``` # HELP promexample_opdemo_latency The various latencies of the methods # TYPE promexample_opdemo_latency histogram promexample_opdemo_latency_bucket{client="cli",label1="value1",method="repl",source="source2",le="10"} 100 promexample_opdemo_latency_bucket{client="cli",label1="value1",method="repl",source="source2",le="50"} 526 promexample_opdemo_latency_bucket{client="cli",label1="value1",method="repl",source="source2",le="100"} 937 promexample_opdemo_latency_bucket{client="cli",label1="value1",method="repl",source="source2",le="200"} 960 promexample_opdemo_latency_bucket{client="cli",label1="value1",method="repl",source="source2",le="400"} 1013 promexample_opdemo_latency_bucket{client="cli",label1="value1",method="repl",source="source2",le="800"} 1122 promexample_opdemo_latency_bucket{client="cli",label1="value1",method="repl",source="source2",le="1000"} 1171 promexample_opdemo_latency_bucket{client="cli",label1="value1",method="repl",source="source2",le="1400"} 1206 promexample_opdemo_latency_bucket{client="cli",label1="value1",method="repl",source="source2",le="2000"} 1214 promexample_opdemo_latency_bucket{client="cli",label1="value1",method="repl",source="source2",le="5000"} 1257 promexample_opdemo_latency_bucket{client="cli",label1="value1",method="repl",source="source2",le="10000"} 1308 promexample_opdemo_latency_bucket{client="cli",label1="value1",method="repl",source="source2",le="15000"} 1343 promexample_opdemo_latency_bucket{client="cli",label1="value1",method="repl",source="source2",le="+Inf"} 1352 promexample_opdemo_latency_sum{client="cli",label1="value1",method="repl",source="source2"} 1.3510892105500018e+06 promexample_opdemo_latency_count{client="cli",label1="value1",method="repl",source="source2"} 1352 # HELP promexample_opdemo_line_counts The counts of the lines in # TYPE promexample_opdemo_line_counts counter promexample_opdemo_line_counts{client="cli",label1="value1",method="repl",source="source2"} 4113 # HELP promexample_opdemo_line_lengths The lengths of the various lines in # TYPE promexample_opdemo_line_lengths histogram promexample_opdemo_line_lengths_bucket{client="cli",label1="value1",method="repl",source="source2",le="10"} 38 promexample_opdemo_line_lengths_bucket{client="cli",label1="value1",method="repl",source="source2",le="20"} 92 promexample_opdemo_line_lengths_bucket{client="cli",label1="value1",method="repl",source="source2",le="50"} 211 promexample_opdemo_line_lengths_bucket{client="cli",label1="value1",method="repl",source="source2",le="100"} 419 promexample_opdemo_line_lengths_bucket{client="cli",label1="value1",method="repl",source="source2",le="150"} 626 promexample_opdemo_line_lengths_bucket{client="cli",label1="value1",method="repl",source="source2",le="200"} 814 promexample_opdemo_line_lengths_bucket{client="cli",label1="value1",method="repl",source="source2",le="500"} 2025 promexample_opdemo_line_lengths_bucket{client="cli",label1="value1",method="repl",source="source2",le="800"} 3270 promexample_opdemo_line_lengths_bucket{client="cli",label1="value1",method="repl",source="source2",le="+Inf"} 4113 promexample_opdemo_line_lengths_sum{client="cli",label1="value1",method="repl",source="source2"} 2.0698130000000026e+06 promexample_opdemo_line_lengths_count{client="cli",label1="value1",method="repl",source="source2"} 4113 # HELP promexample_opdemo_process_counts The various counts # TYPE promexample_opdemo_process_counts counter promexample_opdemo_process_counts{client="cli",label1="value1",method="repl",source="source2"} 1352 ```
…/zipkin (open-telemetry#1096) * Bump google.golang.org/grpc in /exporters/trace/zipkin Bumps [google.golang.org/grpc](https://github.com/grpc/grpc-go) from 1.31.0 to 1.31.1. - [Release notes](https://github.com/grpc/grpc-go/releases) - [Commits](grpc/grpc-go@v1.31.0...v1.31.1) Signed-off-by: dependabot[bot] <[email protected]> * Auto-fix go.sum changes in dependent modules Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: dependabot[bot] <dependabot[bot]@users.noreply.github.com> Co-authored-by: Tyler Yahn <[email protected]>
* fix(opentelemetry-operator): allow fullnameOverride Signed-off-by: skuethe <[email protected]> * docs(opentelemetry-operator): bump to new Chart version Signed-off-by: skuethe <[email protected]> * chore(opentelemetry-operator): bump version Signed-off-by: skuethe <[email protected]> --------- Signed-off-by: skuethe <[email protected]> Co-authored-by: Tyler Helmuth <[email protected]>
Resolves #1076.
Signed-off-by: Liam White [email protected]