Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support explicit metrics timestamp #5272

Closed
ThePlenkov opened this issue Dec 16, 2024 · 15 comments
Closed

Support explicit metrics timestamp #5272

ThePlenkov opened this issue Dec 16, 2024 · 15 comments
Labels
feature-request pkg:api triage:rejected This feature has been rejected

Comments

@ThePlenkov
Copy link

Context

We use JS OTel API as the exporter for the metrics. Those metrics are generated with the alternative runtime which has no OTel SDK available. We propagate metrics as dedicated events. Our proxy service is supposed to consume those events and produce OTel events using SDK.

The problem

As we can see it in a code

 this._writableMetricStorage.record(
      value,
      attributes,
      context,
      millisToHrTime(Date.now())
    );

we fill metrics timestamp with Date.now( ) always, but in our case the message with a metric comes from outside. So we would like to propagate this metric with a given timestamp.

Desired solution

Methods add/record of metrics should support optional timestamp(or may be even start/end time) parameter, that's how we can set the the timestamp explicitly.

Describe alternatives you've considered

Alternative is to use a hacky solution like this:
image

@pichlermarc
Copy link
Member

Hi @ThePlenkov, thanks for reaching out.

The OpenTelemetry project implements the OpenTelemetry specification. Since this is a feature request for the API which is a specified component we cannot accept this feature unless it is specified.

I think what you're looking for to address you use-case is implementing a MetricProducer and passing that as a constructor option to a MetricReader

(marking this as rejected since it's missing specification)

@pichlermarc pichlermarc closed this as not planned Won't fix, can't repro, duplicate, stale Dec 16, 2024
@ThePlenkov
Copy link
Author

@pichlermarc But, what is the conflict between the spec and the current proposal? https://opentelemetry.io/docs/specs/otel/metrics/data-model/#gauge let's take one of the example metrics like gauge.

It says that datapoint may contain timestamp:

(optional) A timestamp (start_time_unix_nano) which best represents the first possible moment a measurement could be recorded. This is commonly set to the timestamp when a metric collection system started.

Now let's assume something like this: we know that metric collection started earlier that the moment we record a value into the metric with SDK - what is wrong here if we just give this value?

I don't see a conflict.

@ThePlenkov
Copy link
Author

Created follow-up discussion here: open-telemetry/opentelemetry-specification#4342

@pichlermarc
Copy link
Member

that's the specification for the Data Model, which is not what's asked to be modified here.
The metrics part of the @opentelemetry/api package implements this specification.

@pichlermarc
Copy link
Member

pichlermarc commented Dec 17, 2024

Apologies if I came across a bit harsh yesterday - It's the is the kind of thing that we cannot/should not decide on our own and such a feature has massive implications downstream of the API the SDK implementation.

Currently our SDK implementation relies on timing being like that and my guess is many other SDK implementations do too, since suddenly there's metrics from the past that can be recorded this can cause problems. It works fine for delta aggregations, but it is (I think) it's impossible to do for cumulative aggregations, which is why this was a clear rejection reason for me.

@ThePlenkov
Copy link
Author

@pichlermarc no but you are right, we need to make it part of the spec of course. I think otel sdk is supposed to serve for different purposes. Indeed main telemetry pipeline is written in Go language- but implements same spec, right? So for example in open telemetry we have some plugins which allow to import OTLP json events into pipeline. But now is a question- is it possible to take any OTLP metric message and convert it to telemetry using SDK? What I wanted to highlight that is only partially possible. So I hope to start some discussion about this at least

@pichlermarc
Copy link
Member

pichlermarc commented Dec 17, 2024

The API for OTel looks very similar for 7+ languages, yes. That's since all of them implement the same specification. We usually don't deviate from it since a later specification version may address this use-case and then we'd end up with the same feature twice.

is it possible to take any OTLP metric message and convert it to telemetry using SDK? What I wanted to highlight that is only partially possible. So I hope to start some discussion about this at least

There's no such functionality to turn OTLP back to an internal representation, no. As mentioned eariler, there is the MetricProducer interface (spec, JavaScript interface) but one has to re-aggregate metrics themselves. That is the specified workaround for the OpenCensus Shim which ran into similar limitations IIRC (their metrics are already pre-aggregated so it could not be mapped 1:1 to API calls) - so my guess is that the Specification SIGs recommendation will be to do the same for your use-case.

With MetricProducer one can at least re-use the Exporter and MetricReader implementations, but re-aggregation logic has to be implemented separately, if needed.

@ThePlenkov
Copy link
Author

OK I see. it explains now why metrics api is different from same traces https://opentelemetry.io/docs/specs/otel/trace/api/#span-creation or logs https://opentelemetry.io/docs/specs/otel/logs/data-model/#field-timestamp which both support custom timestamps and allow to generate telemetry data "from the past".

Do you know - is there a way to delegate this re-aggregation logic to the collector pipeline, that's how our process will just stream metrics as they are recorded one to one?

Thanks!

@pichlermarc
Copy link
Member

I think this is the part that has to be implemented individually today.

There's this issue which comes close but it seems to ask that feature for cumulative metrics only but it's not implemented yet.

@ThePlenkov
Copy link
Author

Indeed - that could be a good starting point. Because if we could somehow force the library not to cumulate metrics, then it's not dangerous anymore to provide metrics timestamp, because it will be just exported to the pipeline "as is"

However, I think i just found the alternative way to do this. By use of
https://open-telemetry.github.io/opentelemetry-js/interfaces/_opentelemetry_sdk_metrics.PushMetricExporter.html#export

So if we create an own exporter - then we can export there resource and scope metrics directly.

Do you think it may work?

@pichlermarc
Copy link
Member

pichlermarc commented Dec 17, 2024

Because if we could somehow force the library not to cumulate metrics, then it's not dangerous anymore to provide metrics timestamp, because it will be just exported to the pipeline "as is"

Haha, I think we may be talking past each other here 😅

I think I stand by the initial assessment that if you're exporting metrics as seperate events, you can have a receiver that takes those metrics, passes them to a custom MetricProducer, that MetricProducer re-aggregates them if needed (need to be careful with memory, but that's doable), and then let metrics be collected through a PeriodicExportingMetricReader that has a producer and OTLP Exporter associated with it.

However, I think i just found the alternative way to do this. By use of
https://open-telemetry.github.io/opentelemetry-js/interfaces/_opentelemetry_sdk_metrics.PushMetricExporter.html#export

So if we create an own exporter - then we can export there resource and scope metrics directly.

Do you think it may work?

IIUC, then yes, you can use an exporter like that - but I believe at this point it's easier to just do what I wrote above, then you don't need to re-implement an exporter. But be aware of the overhead. OTLP endpoints usually have a payload size limit to protect the infrastructure of the receiver. If you export every metric un-aggregated, you may end up with a huge OTLP message that will likely be rejected by the server you're sending it to.

@ThePlenkov
Copy link
Author

Kind of. The problem is - I still do not understand how I can use MetricProducer . It has the only method collect but this method doesn't accept any other inputs except timeout. Do you mean - I still need to use SDK to produce metrics, then collect them from MetricProducer and then export manually via exporter?

@ThePlenkov
Copy link
Author

oh wait - now I get it, you probably mean that I need to have an own MetricProducer ( like extended from a strandard one ) and then there modify timestamps already in the collected metrics?

@pichlermarc
Copy link
Member

pichlermarc commented Dec 18, 2024

Yes, MetricProducer is a user-implementable interface. MetricProducer#collect() is called by PeriodicExportingMetricReader (the MetricProducer can be passed to the constructor), and then passed to a PushMetricExporter (like the OTLP exporters, for example).

@ThePlenkov
Copy link
Author

it worked, thanks! We inherited from PeriodicExportingMetricReader and redefined collect method. So it helped to change timestamps.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature-request pkg:api triage:rejected This feature has been rejected
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants