Metric (observer) collections should be independent from metric export #1432

Oberon00 · 2021-02-11T15:01:20Z

(the metric spec is still very unfinished, so this might already be included in one of the more general metric spec issues)

Consider this use case:

I want to collect JVM memory metrics (e.g. currently committed memory bytes) every 10 seconds.
I want to export this (and all other metrics) aggregated (min/max/avg) every six collections, i.e. every minute.

I think the spec currently assumes (or assumed before it was deleted) that metric collection and export happen in the same step, but this seems overly restrictive.
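A minimal sketch of the use case above, assuming collection and export are two separate calls on a shared buffer: each collection records one sample, and export reduces the buffered samples to min/max/avg and starts a new window. `WindowAggregator` and `Summary` are hypothetical names made up for this illustration; they are not part of any OpenTelemetry API.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Summary:
    count: int
    minimum: float
    maximum: float
    average: float


@dataclass
class WindowAggregator:
    """Hypothetical helper that buffers samples between two exports."""
    samples: List[float] = field(default_factory=list)

    def collect(self, value: float) -> None:
        # Called on the collection schedule (e.g. every 10 seconds).
        self.samples.append(value)

    def export(self) -> Optional[Summary]:
        # Called on the export schedule (e.g. every 60 seconds).
        if not self.samples:
            return None  # nothing was collected in this window
        summary = Summary(
            count=len(self.samples),
            minimum=min(self.samples),
            maximum=max(self.samples),
            average=sum(self.samples) / len(self.samples),
        )
        self.samples = []  # start a fresh window
        return summary


# Six collections (one per 10 seconds) followed by one export.
agg = WindowAggregator()
for committed_bytes in [100.0, 120.0, 110.0, 130.0, 90.0, 110.0]:
    agg.collect(committed_bytes)
result = agg.export()
```

Here the two calls would be driven by independent timers; the sketch leaves the timers out and only shows that the export sees all samples collected since the previous export.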

What is more, it would make sense to let each observer instrument potentially run on its own schedule. E.g. some metrics might benefit more from finer granularity than others; some might be expensive to retrieve, others cheap.

(This paragraph might warrant its own issue:) A good implementation of this would also ensure that collections whose intervals are a proper fraction or multiple of the export interval do not move nondeterministically between exports. For example, in the above JVM use case, an implementation that does not synchronize export and collection will have a race between every 6th collection of metrics and the export, which are both supposed to happen at T=60. Such an implementation would sometimes send an aggregate of only 5 collections and the next time up to 7. This might be undesirable and should be avoidable without much performance overhead.
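One way to avoid that race, sketched here under the assumption that collection and export are driven from a single logical clock: at every tick, all due collections run first and the export runs afterwards, so a collection scheduled for T=60 always lands in the export at T=60. The function `simulate` and the instrument names are hypothetical and only serve the illustration; a real SDK would use timers rather than a loop over seconds.

```python
from typing import Dict, List, Tuple


def simulate(collect_intervals: Dict[str, int],
             export_interval: int,
             duration: int) -> List[Tuple[int, Dict[str, int]]]:
    """Return (export_time, collections_per_instrument) for each export.

    Time is a logical clock in whole seconds. Each instrument has its own
    collection interval, and the ordering within a tick is fixed.
    """
    counts = {name: 0 for name in collect_intervals}
    exports = []
    for now in range(1, duration + 1):
        # Step 1: run every collection that is due at this tick.
        for name, interval in collect_intervals.items():
            if now % interval == 0:
                counts[name] += 1
        # Step 2: only then export, so the 6th collection is never late.
        if now % export_interval == 0:
            exports.append((now, dict(counts)))
            counts = {name: 0 for name in collect_intervals}
    return exports


# Illustrative instrument names with per-instrument collection intervals.
exports = simulate(
    {"jvm.memory.committed": 10, "jvm.classes.loaded": 30},
    export_interval=60,
    duration=180,
)
```

With this ordering every export contains exactly six collections of the 10-second instrument and two of the 30-second instrument, never five or seven.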

bogdandrutu · 2021-02-11T16:37:23Z

@Oberon00 who is "I" in this example: the "instrumentation author" or the "application owner"?

Oberon00 · 2021-02-11T16:57:27Z

@bogdandrutu very good question actually 🤔 The two "I" might not even be the same. For collection interval, we are interested in:

How fluctuating is this metric actually (CPU usage vs number of loaded classes) -- Usually the instrumentation author has a good grasp on this, but the application owner might want to influence this e.g. to debug specific problems.
How expensive is it to collect the metric -- Usually the instrumentation author has a good estimate here too, but it might depend on the deployment situation, etc. (e.g. what if we are deployed to a machine with 256 cores -- collecting CPU usage might become unexpectedly expensive)
How interesting is this metric for the use case of the application owner

For export interval, the instrumentation author has no clue. I think this is more for the application owner to decide (that would be a typical environment variable setting). The backend's current/usual load factor could also be a factor here, as you can't export so often that you overwhelm it.

A recommended minimum export interval can also be calculated as the greatest common divisor of all collection intervals by the SDK which knows all registered observers. However, that would often be 1, if we do not provide some guidance on selecting the intervals.
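A small sketch of the calculation described above, assuming the SDK knows every registered collection interval as a whole number of seconds. The function name `recommended_min_export_interval` is hypothetical.

```python
from functools import reduce
from math import gcd
from typing import Iterable


def recommended_min_export_interval(collection_intervals_s: Iterable[int]) -> int:
    """Greatest common divisor of all collection intervals, in seconds.

    Raises TypeError if no intervals are given (reduce on an empty input).
    """
    return reduce(gcd, collection_intervals_s)


# Intervals that are multiples of each other give a useful result.
aligned = recommended_min_export_interval([10, 30, 60])

# Freely chosen intervals quickly degrade to 1 second.
unaligned = recommended_min_export_interval([10, 15, 7])
```

The second call illustrates the concern: without guidance on how intervals are chosen, the result collapses to 1.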

One way might be to allow instrumentation authors only coarse control like HIGH_FREQUENCY, FREQUENT, NORMAL, INFREQUENT, SEMI_STATIC.
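Such coarse control could look roughly like the enum below. Only the level names come from the comment above; the interval values in seconds are assumptions picked for illustration, chosen so that each interval divides the next one.

```python
from enum import Enum


class CollectionFrequency(Enum):
    """Hypothetical coarse collection classes for instrumentation authors."""
    # The second values are illustrative assumptions, not proposed defaults.
    HIGH_FREQUENCY = 1
    FREQUENT = 10
    NORMAL = 60
    INFREQUENT = 600
    SEMI_STATIC = 3600

    @property
    def interval_seconds(self) -> int:
        return self.value


# Check that every level's interval is a multiple of the previous level's.
levels = list(CollectionFrequency)
nested = all(
    slower.interval_seconds % faster.interval_seconds == 0
    for faster, slower in zip(levels, levels[1:])
)
```

Because the intervals are nested, the greatest common divisor of any selection of levels is simply the smallest interval selected.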

victlu · 2021-02-17T18:36:03Z

I think we have/want the ability to create multiple instances of a "pipeline" (aka SDK instance), so that each pipeline can be configured differently. This includes setting the SetPushIntervals (currently one setting for both collection and export), aggregators, exporters, etc. We should also allow pipelines to filter (include/exclude) for specific metrics.

I think this issue may be related to #1437 as well. We need the ability to let each Library set its "scope" so that it can be "attached" to an SDK / pipeline instance.

Oberon00 · 2021-02-17T18:41:55Z

I think we have/want the ability to create multiple instances of a "pipeline" (aka SDK instance), so that each pipeline can be configured differently

This is something different from this issue. In this issue I propose to decouple the first pipeline stage (collection) from the second (export).

jmacd · 2021-02-24T08:23:32Z

I know that @bogdandrutu himself has called for making the collection interval variable on a per-instrument basis. As you say @Oberon00, I believe this is about configuration in the "Accumulator", which flushes synchronous and asynchronous data into a processor and then exporter. I think you'll find 100% agreement from the group, it's just more work to spec out how this configuration should be done.

Oberon00 · 2021-02-24T08:29:05Z

I agree that it is more spec work, that's why I also filed #1433 😃

reyang · 2021-09-03T00:02:06Z

We've discussed this during the 9/2/2021 Metrics SIG meeting and decided to move this out of the initial metrics spec release scope.
This can be supported with #1888 (comment).

pellared · 2023-08-02T08:00:07Z

Some thoughts...

Having "collect + export" done at the same time covers most of the use cases and should be the behavior of a periodic exporting reader. I would avoid adding separately scheduled collection to a periodic exporting reader.

Could such functionality be provided via another reader implementation?
Maybe the SDK can provide an additional reader implementation which could handle this use case?
If so, do we need to specify it?

Leonardo-Ferreira · 2024-04-30T16:26:25Z

Having "collect + export" done at the same time covers most of the use cases and should be the behavior of a periodic exporting reader.

I strongly disagree here. As an app owner responsible for several critical applications subject to request bursts, I can tell you that collecting CPU/MEM once a minute is not good enough for any of my scenarios.

For new applications, or applications whose functionality changed recently or the like, once-a-minute collection is too little because A LOT can happen in that one minute. E.g. my app reports ZERO HTTP queue length, but at the WAF I see several 503s because a burst of requests came in at a very unfortunate time and the APIM has a hard timeout of 10 sec.

For old applications, where nothing has changed in the past 2 years or more and all else is stable, once a minute is basically a waste. You could capture 99% of the metrics once every 10 min or more without any actual impact.

In my view, the person who says "this one-size-fits-all metrics collection every X amount of time for all my applications is ok" is not really hands-on involved in any kind of critical/relevant service. I've seen this behavior again and again in IT managers who will never actually have to troubleshoot anything again.

jack-berg · 2024-07-31T16:02:21Z

Is this issue a duplicate of #3617? If so, can we consolidate?

jsuereth · 2024-08-07T15:08:31Z

Duplicate of #3617

Oberon00 added the area:sdk (Related to the SDK) and spec:metrics (Related to the specification/metrics directory) labels Feb 11, 2021

Oberon00 mentioned this issue Feb 11, 2021

Consider removing Async Instruments (Observers) from first Metric version #1433

Closed

Oberon00 mentioned this issue Feb 15, 2021

Adding resource attributes post-creation (e.g. via auto-discovery) #1298

Open

reyang modified the milestones: Metrics API/SDK Stable Release, Metrics API/SDK Feature Freeze Mar 3, 2021

reyang self-assigned this Mar 3, 2021

reyang modified the milestones: Metrics API/SDK Feature Freeze, Metrics API/SDK Experimental Release May 6, 2021

reyang mentioned this issue May 18, 2021

Add details to Asynchronous Gauge API #1703

Merged

reyang mentioned this issue Aug 26, 2021

Add MetricReader interface #1888

Merged

reyang removed their assignment Sep 3, 2021

reyang removed this from the Metrics API/SDK Experimental Release milestone Sep 3, 2021

reyang added the release:after-ga (Not required before GA release, and not going to work on before GA) label Sep 3, 2021

reyang mentioned this issue Nov 15, 2021

Allow instruments to be collected on different intervals #2120

Closed

jack-berg mentioned this issue Dec 6, 2021

Configurable measuring interval by counter (instrument) #2200

Open

jack-berg mentioned this issue Apr 14, 2022

Add batch callback API open-telemetry/opentelemetry-java#4376

Merged

reyang mentioned this issue Jul 28, 2023

Do not require MetricReader to have ForceFlush #3609

Closed

pellared mentioned this issue Aug 2, 2023

Refine MetricProvider.ForceFlush and define ForceFlush for periodic exporting MetricReader #3563

Merged

dmarar mentioned this issue Dec 8, 2020

Fix to run the Auto instrumentation example in the docs open-telemetry/opentelemetry-python#1435

Merged

8 tasks

austinlparker added the triage:deciding:tc-inbox (Needs attention from the TC in order to move forward) label Jun 18, 2024

jack-berg mentioned this issue Jul 31, 2024

Metrics SDK: allow metric readers to filter Meters during Collect() #3617

Open

jsuereth marked this as a duplicate of #3617 Aug 7, 2024

jsuereth closed this as completed Aug 7, 2024

cijothomas mentioned this issue Sep 25, 2024

Frequency of Metrics report dotnet/runtime#108152

Open
