Metrics aggregation for same metric with 2 or more nodes #433
I guess I should have looked at this issue first; it seems this is by design. But then I set the client label in my main2.go to "cli2", and now I get alternating metrics for one and then the other:
Again, at http://localhost:8889/metrics:
So this is a bug, right?
Note that in my case it is even the same application; it just uses different labels. Yet at all times only one value appears in the collector output.
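For concreteness, here is a minimal sketch of what the two processes are effectively doing; the instrument and attribute names are hypothetical, and it uses the current otel-go metric API rather than the demo code from the time of this thread. Each process records the same counter, differing only in the `client` label, which one would expect to surface as two separate series on the Prometheus endpoint:

```go
package main

import (
	"context"
	"time"

	"go.opentelemetry.io/otel"
	"go.opentelemetry.io/otel/attribute"
	"go.opentelemetry.io/otel/metric"
)

func main() {
	ctx := context.Background()

	// Assumes a MeterProvider exporting to the collector has already been
	// registered globally; that setup is omitted here.
	meter := otel.Meter("load-generator")

	counter, err := meter.Int64Counter("demo_requests_total")
	if err != nil {
		panic(err)
	}

	for {
		// The second process runs the same loop with client="cli2".
		counter.Add(ctx, 1, metric.WithAttributes(attribute.String("client", "cli")))
		time.Sleep(time.Second)
	}
}
```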
I understand that the OTel collector fills a role where you can imagine it aggregating host metrics together, but that would require a substantial redesign: it would turn the collector from a manager of pipelines into a manager of transient state. There is a proposal to add metrics transformation (open-telemetry/opentelemetry-collector-contrib#332), but it would not fill this function, because managing state is hard.
I think it should be more difficult to run into the kind of trouble you're seeing, where metrics are overwritten because they appear to be identical. It suggests that either (1) there are no host-specific resources, or (2) the Prometheus exporter is ignoring resources.
In the OTel-Go export pipeline, I handled this problem by merging instead of overwriting, but in that situation we have state for every metric over an interval of time, which the collector does not. (There, synchronous measurements are combined by merging and asynchronous measurements overwrite, as the instrument semantics prescribe: open-telemetry/opentelemetry-go#840.)
I would like to propose that every OTel client automatically generate a resource label that distinguishes it from similar processes. Then the collector would work as intended. Maybe there is already a suitable semantic convention, and we should specify that it MUST be included in the resource.
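As an illustration of that proposal, here is a sketch of stamping each process with a unique resource attribute, written against today's otel-go SDK (so the exact API postdates this thread); `service.instance.id` is the semantic convention that seems closest to what is being described:

```go
package main

import (
	"context"

	"github.com/google/uuid"
	sdkmetric "go.opentelemetry.io/otel/sdk/metric"
	"go.opentelemetry.io/otel/sdk/resource"
	semconv "go.opentelemetry.io/otel/semconv/v1.21.0"
)

// newMeterProvider gives every process a unique service.instance.id, so two
// otherwise-identical load generators no longer produce metrics with the
// same identity, and the collector has no reason to overwrite one stream
// with the other.
func newMeterProvider(ctx context.Context, reader sdkmetric.Reader) (*sdkmetric.MeterProvider, error) {
	res, err := resource.New(ctx,
		resource.WithAttributes(
			semconv.ServiceName("metrics-load-generator"),
			semconv.ServiceInstanceID(uuid.NewString()),
		),
	)
	if err != nil {
		return nil, err
	}
	return sdkmetric.NewMeterProvider(
		sdkmetric.WithResource(res),
		sdkmetric.WithReader(reader),
	), nil
}
```

Note that if the Prometheus exporter ignores resources (possibility (2) above), the unique attribute alone won't show up there, so the two fixes are complementary.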
I am currently observing this:
This results in the new metric values being sent only once. If I set it to sleep 0 seconds between them, it seems to push the correct accumulated values and both metrics (with different label sets):
The Prometheus exporter seems to show only the most recently sent metric.
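The snippet being referenced was not captured above; a hypothetical reconstruction of the pattern described (recording the same counter under two label sets with a pause in between) might look like the following, where all names are invented for illustration:

```go
package demo

import (
	"context"
	"time"

	"go.opentelemetry.io/otel/attribute"
	"go.opentelemetry.io/otel/metric"
)

// pushTwice records the same counter under two label sets with a pause in
// between. Per the report above: with sleep == 0 both series reach the
// Prometheus exporter with correct accumulated values, while with a longer
// sleep only the most recently sent one remains visible.
func pushTwice(ctx context.Context, meter metric.Meter, sleep time.Duration) error {
	counter, err := meter.Int64Counter("demo_requests_total")
	if err != nil {
		return err
	}
	counter.Add(ctx, 1, metric.WithAttributes(attribute.String("client", "cli1")))
	time.Sleep(sleep)
	counter.Add(ctx, 1, metric.WithAttributes(attribute.String("client", "cli2")))
	return nil
}
```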
@firefixmaarten This sounds like an OTel-Go issue already under investigation: open-telemetry/opentelemetry-go#875. Note that part of the problem being investigated there is a collector issue: #1255
That does indeed seem to be the case; I will follow up on those issues. Thanks.
@bogdandrutu assigning this to you for triaging. |
Wouldn't it be nice if the collector could also receive metrics from browser clients, e.g. Angular applications, and merge them across all clients? That way we could better understand what's going on (or going wrong) for our users. I understand this would require a shift in thinking, but for client/server architectures it would be worth the work.
Bug or feature?
Using the demo in example/demo runs fine.
I launched a second metrics-load-generator, hoping it would double the traffic and the metrics on http://localhost:8889/metrics. Instead it resets the metrics, and the first metrics-load-generator's metrics stop being taken into account, as if the last node to arrive wins the metric.
Then I thought, "OK, it looks like the node name needs to be unique", so I used a main2.go with a different service name and the same metrics as the other one running concurrently, and got the same "last one launched wins" result: the Prometheus graph and /metrics are reset to 0 as the new metrics-load-generator is launched.
Is this by design? If so, why?
I am trying to find the best way to expose Prometheus metrics from a C++ Apache module. Since Apache is multi-process, I was hoping the OpenTelemetry agent would let me aggregate the metrics of the different processes. I saw that PHP has an intermediate unix-socket aggregator, but I don't think an equivalent exporter exists for C++.
The opencensus-cpp metrics exporter for ocagent doesn't exist either, but since the trace one exists, it's probably simple enough to write.
Thank you for your time.
Regards.