Prometheus exporter - Histograms are disappearing. #3089
Comments
I was able to duplicate this, and I also found this related discussion: open-telemetry/opentelemetry-collector-contrib#13443. It appears to happen when no data is collected for the histogram during an export interval. Here's a minimal reproduction:

    from fastapi import FastAPI
    from opentelemetry.exporter.otlp.proto.grpc.metric_exporter import OTLPMetricExporter
    from opentelemetry.instrumentation.fastapi import FastAPIInstrumentor
    from opentelemetry.metrics import set_meter_provider
    from opentelemetry.sdk.metrics import MeterProvider
    from opentelemetry.sdk.metrics.export import (
        ConsoleMetricExporter,
        PeriodicExportingMetricReader,
    )
    from opentelemetry.sdk.resources import SERVICE_NAME, Resource

    app = FastAPI()


    @app.get("/")
    def read_root():
        return {"Hello": "World"}


    # Export over OTLP/gRPC to the collector every 5 seconds.
    exporter = OTLPMetricExporter(endpoint="http://otel-collector:4317")
    console = ConsoleMetricExporter()  # created but not attached to a reader
    reader = PeriodicExportingMetricReader(exporter, export_interval_millis=5_000)
    provider = MeterProvider(
        resource=Resource.create({SERVICE_NAME: "fastapi-test-app"}),
        metric_readers=[reader],
    )
    set_meter_provider(provider)
    FastAPIInstrumentor.instrument_app(app)

and the collector configuration:

    receivers:
      otlp:
        protocols:
          grpc:
    exporters:
      prometheus:
        endpoint: 0.0.0.0:8889
    extensions:
      health_check:
    service:
      extensions: [health_check]
      pipelines:
        metrics:
          receivers: [otlp]
          processors: []
          exporters: [prometheus]

Running this app, the first metric is exported successfully after the first web request. After that, every export (one every 5 seconds) emits the error message and traceback:
|
Hey folks, thanks for raising this issue and debugging. Would one of you be willing to send a PR to fix this? Longer term, I'm wondering if we should move away from using the Prometheus client library altogether and just generate prometheus text format on our own. This is what the JS SIG went with. |
My test showed this failing with OTLP/gRPC export to an OpenTelemetry Collector that exports to Prometheus, not with the built-in Prometheus exporter. So there may be two separate issues, or the underlying cause may lie deeper in the metrics generation rather than inside the exporter. In the related discussion they called out the duplicate |
I think I isolated it a little further. In my testing this only occurs if there is a period where a previously active histogram has no data points and there is a different metric with data. I dumped some data from the OTLP exporter before it went over the wire. When there is data in the histogram, the request looks like this:
When there is no data, the output looks like this:
So it's not just that I don't know why this only causes an error when the collector receives another metric, but it does seem like if there are no data points the Line 182 in 0f9cfdd
I have no idea if that's a valid solution, but wanted to share what I learned. |
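The condition described in this comment (a previously active histogram with no data points in the same batch as another metric that does have data) can be sketched with plain data classes. This is only an illustration for checking a dumped batch by hand: the `Metric` class, the `empty_histograms` helper, and the metric names are hypothetical stand-ins, not the SDK's or the exporter's own types.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Metric:
    """Simplified stand-in for one exported metric (hypothetical, not an SDK type)."""
    name: str
    kind: str  # e.g. "histogram" or "sum"
    data_points: List[float] = field(default_factory=list)


def empty_histograms(batch: List[Metric]) -> List[str]:
    """Return names of histograms without data points, but only when the
    batch also carries at least one metric that does have data points."""
    if not any(m.data_points for m in batch):
        return []
    return [m.name for m in batch if m.kind == "histogram" and not m.data_points]


# Interval 1: a request was served, so the histogram has a data point.
active = [
    Metric("http.server.duration", "histogram", [12.5]),
    Metric("http.server.active_requests", "sum", [1]),
]
# Interval 2: no request was timed, but the other metric still reports.
idle = [
    Metric("http.server.duration", "histogram", []),
    Metric("http.server.active_requests", "sum", [0]),
]

print(empty_histograms(active))  # []
print(empty_histograms(idle))    # ['http.server.duration']
```

Whether such an empty histogram should be dropped before export or handled on the receiving side is exactly the open question in this thread; the sketch only makes the triggering batch shape explicit.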
Any update on this? |
@robotadam how are you running that script to produce histogram data points? |
@ocelotl I ran it with
docker-compose.yml:
the |
@ocelotl I have some spare time and a strong interest in getting this fixed. If I come up with a fix, would you be able to review it, or will it have to go through a different process? |
@ocelotl to clarify - will this be addressed via #3407 (based on the discussions around #3277 (comment))? |
@crisog I surely would, but there may already be a fix in #3429. @robotadam @mshebeko-twist @bastbu could you try again using the fix in #3429? |
@ocelotl can confirm, the fix solves the issue for me. |
Describe your environment
python 3.7.10
opentelemetry_api-1.14.0
opentelemetry_sdk-1.14.0
opentelemetry_exporter_prometheus-1.12.0rc1
Steps to reproduce
Create a Histogram and expose it with the Prometheus exporter.
What is the expected behavior?
Histograms should consistently appear on the /metrics endpoint.
What is the actual behavior?
Histograms disappear from the /metrics endpoint after a request/scrape. A change to the metrics (in my case, a query to the application itself) makes the histogram appear again until the next request/scrape. This happens neither in prometheus_client nor in the JavaScript OpenTelemetry Prometheus exporter.
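One way to confirm the symptom is to compare two consecutive scrapes and list the histogram families that were declared in the first but are missing from the second. The sketch below works on the Prometheus text format using only `# TYPE` lines; the helper names and the sample metric names are made up for illustration.

```python
from typing import Set


def histogram_families(exposition: str) -> Set[str]:
    """Collect names declared via '# TYPE <name> histogram' lines."""
    names = set()
    for line in exposition.splitlines():
        parts = line.split()
        if len(parts) == 4 and parts[:2] == ["#", "TYPE"] and parts[3] == "histogram":
            names.add(parts[2])
    return names


def vanished(first_scrape: str, second_scrape: str) -> Set[str]:
    """Histogram families present in the first scrape but not in the second."""
    return histogram_families(first_scrape) - histogram_families(second_scrape)


# Sample scrape bodies (illustrative values, not captured from a real run).
FIRST = """\
# TYPE request_duration histogram
request_duration_bucket{le="+Inf"} 1
request_duration_count 1
request_duration_sum 0.25
# TYPE requests_total counter
requests_total 1
"""
SECOND = """\
# TYPE requests_total counter
requests_total 1
"""

print(vanished(FIRST, SECOND))  # {'request_duration'}
```

Against a running application, the two strings would come from fetching the /metrics endpoint twice in a row with no traffic in between.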
Additional context
This behaviour is not optimal when multiple Prometheus servers are scraping metrics.
I have both auto-instrumented and manually instrumented histograms, and both are disappearing.
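To see why this matters with more than one Prometheus server, here is a toy model of the observable behaviour only. It assumes a registry that forgets observations once they have been scraped and contrasts it with one that keeps cumulative state; both classes are hypothetical and do not reflect how the exporter is actually implemented.

```python
class DrainingRegistry:
    """Toy model: observations are handed out once, then forgotten."""

    def __init__(self) -> None:
        self._pending = []

    def record(self, value: float) -> None:
        self._pending.append(value)

    def scrape(self) -> list:
        # Hand out what is pending and reset the internal state.
        out, self._pending = self._pending, []
        return out


class CumulativeRegistry:
    """Toy model: every scrape sees everything recorded so far."""

    def __init__(self) -> None:
        self._values = []

    def record(self, value: float) -> None:
        self._values.append(value)

    def scrape(self) -> list:
        return list(self._values)


draining, cumulative = DrainingRegistry(), CumulativeRegistry()
for registry in (draining, cumulative):
    registry.record(0.2)

# Two Prometheus servers scrape one after the other.
print(draining.scrape(), draining.scrape())      # [0.2] []
print(cumulative.scrape(), cumulative.scrape())  # [0.2] [0.2]
```

In the draining model the second server never sees the histogram until new data arrives, which matches the symptom reported here; the cumulative model matches the expected behaviour.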
Edit:
I don't know if it's actually related, but I hope it may be helpful.
I have an additional observation. When pushing Histograms to the OTel Collector, I sporadically encounter the following error:
But that's not the case with the JS implementation; it just works without any errors.