StackdriverExporter is broken #490

Closed
ocervell opened this issue Jul 22, 2020 · 5 comments
Labels
bug Something isn't working

ocervell commented Jul 22, 2020

Describe the bug
Deployed on Kubernetes using the following setup:
[OT SDK + OpenCensusMetricsExporter] --> [OT Collector + Cloud Monitoring Exporter] --> Cloud Monitoring API
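
For reference, a rough sketch of the application side of that pipeline, assuming the ~v0.6-era metrics API (module paths, constructor arguments, metric name, and the collector endpoint are assumptions, not taken from the linked repo):

```python
from opentelemetry import metrics
from opentelemetry.sdk.metrics import Counter, MeterProvider
from opentelemetry.sdk.metrics.export.controller import PushController
from opentelemetry.ext.opencensusexporter.metrics_exporter import (
    OpenCensusMetricsExporter,
)

metrics.set_meter_provider(MeterProvider())
meter = metrics.get_meter(__name__)

# Push metrics over the OpenCensus protocol to the collector's opencensus receiver.
exporter = OpenCensusMetricsExporter(
    service_name="custom-metrics-example",
    endpoint="otel-collector:55678",
)
controller = PushController(meter, exporter, 5)  # export every 5 seconds

custom_metric = meter.create_metric(
    name="custom_metric_example",
    description="example cumulative metric",
    unit="1",
    value_type=int,
    metric_type=Counter,
    label_keys=("environment",),
)
custom_metric.add(1, {"environment": "staging"})
```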

Steps to reproduce
https://github.com/ocervell/gunicorn-opentelemetry-poc
Deploy the custom-metrics-example/ and the OT agent in ops/opentelemetry on GKE. No need to deploy the whole Flask application.

What did you expect to see?
Timeseries populated in Cloud Monitoring UI.

What did you see instead?
The metric descriptor is created correctly in Cloud Monitoring API, but there are errors while writing timeseries to Cloud Monitoring API:

Field timeSeries[0].points[0].interval.start_time had an invalid value of "2020-07-22T09:46:15.328184-07:00": The start time must be before the end time (2020-07-22T09:46:15.328184-07:00) for the non-gauge metric 'custom.googleapis.com/opencensus/custom_metric_example'.

What version did you use?
v0.6.0

What config did you use?
https://github.com/ocervell/gunicorn-opentelemetry-poc/blob/master/ops/opentelemetry/ot-agent.yaml

Environment
GKE

Additional context
It seems like certain API calls for writing timeseries are going through, but there is no data in Cloud Monitoring Metrics Explorer.
I tried adding a batch processor but was running into another issue.

ocervell added the bug label Jul 22, 2020
mxiamxia referenced this issue in mxiamxia/opentelemetry-collector-contrib ("Fixes #445, #158") Jul 22, 2020

aabmass (Member) commented Jul 23, 2020

@ocervell, I took a look at the Python OpenCensusMetricsExporter and I think the bug may lie there. It is not setting the start_timestamp here. I think that is why your start and end time are the same in the bug report. I will share a fix you can try in a second 🙂
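
For context, a minimal sketch (not the actual patch; field usage based on the public opencensus-proto definitions) of what a well-formed cumulative TimeSeries needs, namely a start_timestamp strictly earlier than the point's timestamp:

```python
import time

from google.protobuf.timestamp_pb2 import Timestamp
from opencensus.proto.metrics.v1 import metrics_pb2

start_ts = Timestamp()
start_ts.FromSeconds(int(time.time()) - 60)  # when the cumulative sum began

point_ts = Timestamp()
point_ts.FromSeconds(int(time.time()))       # when this value was captured

series = metrics_pb2.TimeSeries(
    start_timestamp=start_ts,  # the field that was being left unset
    label_values=[metrics_pb2.LabelValue(value="staging", has_value=True)],
    points=[metrics_pb2.Point(timestamp=point_ts, int64_value=42)],
)
```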

aabmass (Member) commented Jul 23, 2020

@ocervell can you try using the package from my branch https://github.com/aabmass/opentelemetry-python/tree/fix-oc-exporter and see if that fixes it? It will log the whole metric proto it is sending for debugging, including the new start_timestamp.

index 840e74b..647274f 100644
--- a/custom-metrics-example/requirements.txt
+++ b/custom-metrics-example/requirements.txt
@@ -1,3 +1,3 @@
 opentelemetry-api
 opentelemetry-sdk
-opentelemetry-ext-opencensusexporter
+-e git+https://github.com/aabmass/opentelemetry-python.git@fix-oc-exporter#egg=opentelemetry-ext-opencensusexporter&subdirectory=ext/opentelemetry-ext-opencensusexporter
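
The `-e git+…` requirement installs the exporter in editable mode straight from the `fix-oc-exporter` branch; the `subdirectory` fragment points pip at the `ext/opentelemetry-ext-opencensusexporter` package inside the monorepo, so the fix can be tried without waiting for a release.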

ocervell (Author) commented Jul 24, 2020

Sure, will try this and let you know! Thanks for finding the problem 👍

ocervell (Author) commented Jul 24, 2020

@aabmass it's working with the custom-metrics-example now! Your fix works: no more API errors when writing timeseries to Cloud Monitoring.

In the Flask context, though, the OpenCensusMetricsExporter does not seem to be sending any metrics to the collector (I can't see them in the collector debug logs). When I drop in the CloudMonitoringMetricsExporter instead (see here), it works correctly.
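
For completeness, a hedged sketch of that drop-in swap (the import path follows the opentelemetry-exporter-cloud-monitoring package and is an assumption; `meter` is the one from the earlier sketch):

```python
from opentelemetry.sdk.metrics.export.controller import PushController
from opentelemetry.exporter.cloud_monitoring import CloudMonitoringMetricsExporter

# Write directly to the Cloud Monitoring API from the app instead of going
# through the collector's OpenCensus receiver.
exporter = CloudMonitoringMetricsExporter()
controller = PushController(meter, exporter, 5)  # `meter` as configured earlier
```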

aabmass (Member) commented Jul 24, 2020

@ocervell I took a look: the issue is that the pid label you are passing is an int, but the exporter expects strings (you also probably don't need pid at all if you are using the unique-identifier option). The OT Python SDK likewise only expects strings for label values, but I do think it would be reasonable for the exporter, if not the SDK, to do this conversion for you.
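
A minimal sketch of the fix on the application side (variable names are assumed; `custom_metric` refers to the counter from the earlier sketch):

```python
import os

# Label values must be strings for both the exporter and the OT Python SDK,
# so coerce the pid (or drop it if the unique-identifier option already
# disambiguates worker processes).
labels = {"pid": str(os.getpid()), "environment": "staging"}
custom_metric.add(1, labels)
```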

Since these are all python bugs, can you open an issue in https://github.com/open-telemetry/opentelemetry-python and tag me, then close this one?
