Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Metrics SDK: Accumulator and Processor state requirements #1198

Closed
wants to merge 7 commits into from
Closed
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
78 changes: 76 additions & 2 deletions specification/metrics/sdk.md
Original file line number Diff line number Diff line change
Expand Up @@ -250,6 +250,14 @@ synchronous instrument updates. The Accumulator SHOULD NOT hold an
exclusive lock while calling an Aggregator (see below), since some
Aggregators may have higher concurrency expectations.

State managed in the Accumulator MUST be transient. This requirement
ensures that export pipelines written constructed for stateless
exporters (e.g. Statsd, OTLP with a stateless ExportKindSelector) are
not penalized by permanent state in the Accumulator. This implies
that the use of long-term state in a Metrics export pipeline should be
elective, and such state if present should be managed in the Processor
component.
jmacd marked this conversation as resolved.
Show resolved Hide resolved

#### Accumulator: Collect() function

The Accumulator MUST implement a Collect method that builds and
Expand All @@ -268,9 +276,75 @@ Label Set, Resource, and metric Descriptor.

TODO: _Are there more Accumulator functional requirements?_

### Processor
### Processor: Component interface

The Processor component interface supports interaction from the
Controller and Accumulator. The Controller, responsible for
initiating a new round of collection, informs the Processor when a
collection interval starts and finishes. After finishing the
collection interval, the Controller gets the ExportRecordSet before
calling the Exporter to export data.

The Accumulator interacts with the Processor during the call to its
`Collect()` method, during which it calls `Process()` once per
ExportRecord on the Processor.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason why the Controller doesn't mediate this? Couldn't the controller ask the accumulator for the data, then pass it to the Processor, rather than have the Accumulator know about the processor directly? What do we gain from the extra coupling between Accumulator and Processor?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Controller does mediate this--it's the exclusive "owner" of the pipeline and mediates everything, I'd say. Why ask the Accumulator to pass data to the Processor? The Processor has lots of flexibility this way, e.g., to simply ignore Accumulations.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My question is...why not have the controller pass this data to the processor, and decouple the accumulator from the processor?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After thinking about this, I guess there's no reason to require the Accumulator to call the Processor directly. The Processor could instead implement an iterator pattern, to let the Processor scan through a collection representing the Accumulator state.

Is that a better design? I'm not sure the code will be cleaner for me in Go--if I'm to avoid allocating an intermediate data structure--but it sounds logically cleaner.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll ponder in my nascent java implementation what the appropriate degree of coupling might be. @breedx-splk this is something we should think about.


The Processor component is meant to be used for managing long-term
state; it is also one of the locations in the Metrics export pipeline
where we can impemlement control over cardinality. There are two
reasons that long-term state is typically required in a Metric export
pipeline:

1. Because the Exporter requests Cumulative aggregation temporality for Sum and/or Histogram data points
2. Because the Exporter requests keeping Memory about all metric label sets, regardless of the requested aggregation temporality.

Note that both of these behaviors are typically required for a
Prometheus exporter and that when none of these behaviors are
configured, the Metrics export pipeline can be expected not to develop
long-term state.

#### Basic Processor

The basic Processor supports two standard ExportKindSelectors and the
independent Memory behavior described above. The default
OpenTelemetry Metrics SDK MUST provide a basic Processor meeting these
requirements.

##### Basic Processor: CumulativeExportKindSelector

CumulativeExportKindSelector is the default behavior, which requests
exporting Cumulative aggregation temporality for Sums and Histograms
and implies that label sets used with synchronous instruments will be
remembered indefinitely in the SDK. This ExportKindSelector is the
default in order support downstream Prometheus exporters "out of the
box".

##### Basic Processor: StatelessExportKindSelector

The StatelessExportKindSelector configures a Metric export pipeline
with no long-term memory requirements. In this selector, the Counter,
UpDownCounter, ValueRecorder, and ValueObserver instruments are
configured for Delta aggregation temporality while SumObserver and
UpDownSumObserver instruments are configured for Cumulative
aggregation temporality. This basic Processor configuration has no
long-term memory requirements because each instrument is
jmacd marked this conversation as resolved.
Show resolved Hide resolved
passed-through without any conversion.

##### Basic Processor: Memory

Some metrics exporter configurations request that the Metric export
pipeline maintain long-term state about historically reported Metric
timeseries. This option is a simple boolean that, when set, requires
the Processor to retain memory about all timeseries it has ever
exported. This option is only meaningful when reporting Cumulative
aggregation temporality.

#### Reducing Processor

TODO _Processor functional requirements_
The reducing Processor is a Processor interface implementation used in
conjunction with another (e.g., basic) Processor to drop labels in a
Metric export pipeline. The default OpenTelemetry SDK SHOULD provide
a Reducing Processor implementation.

### Controller

Expand Down