Change Metric Processor to merge multiple observations #1024

jmacd · 2020-08-03T21:37:36Z

This is a prerequisite to any change that introduces dimensionality reduction. It fixes a bug in handling multiple observations (i.e., measurements from callbacks), that will occur when multiple label sets are reduced (i.e., have key-values removed) into identical label sets with fewer keys than the originals.

Any duplicate metric (i.e., instrument+labelset) from a single Accumulator during a single collection round must be the result of dimensionality reduction; it could also be explained by multiple Accumulators submitting data to a single Processor. In either case, the former behavior is incorrect. The former behavior implements an Accumulator responsibility inside the Processor--it was incorrect behavior.

Resolves #862 by simplifying the relationship between several variables in the processor's state value.

jmacd · 2020-08-04T07:34:09Z

@bogdandrutu please review the approach in #1023 (part of which has been factored into this PR). My goals:

To support a fast / inexpensive way to handle large label sets when we know a fixed set of keys will be aggregated. Putting this functionality in the Accumulator saves work, since the Accumulator builds fewer records this way. This only works when all records that the Processor sees can accept the reduction of dimensionality, which ought to be the common case.
To outline the point where generic metric event sampling may occur, where the Accumulator has access to the excluded labels from filtering. These events (e.g., https://github.com/open-telemetry/opentelemetry-go/pull/1023/files#diff-4c3de179542bac3c5ceacd2090215160R225) can be sampled to output probabilistic raw value exemplars, without requiring high-cardinality labels to pass through the Accumulator.

jmacd · 2020-08-08T03:39:03Z

In the Metrics SIG we discussed two ways that dimensionality reduction can take place, one in the Accumulator and one in a Processor. These two approaches have different performance characteristics and ultimately the best choice will depend on workload. If there is high cardinality in the input, reducing dimensionality will lower the size and cost of maintaining Accumulator state, but the key filter will be applied to every synchronous measurement, which means more filtering cost. If there is not high cardinality in the input, lowering dimensionality after the Accumulator makes more sense, because the filters are applied once per collection interval to every distinct label set instead of once per synchronous event.

When it comes to sampling, these approaches support different sampling schemes, and there are meaningful sampling schemes in both cases, but if we reduce dimensionality in the Accumulator then we are able to sample individual events and if we reduce dimensionality in the Processor then we are able to sample over aggregated measurements.

I want to continue to press forward with this PR because it addresses a problem in both cases and resolves #862.

If reducing dimensionality in the Accumulator: this PR ensures that multiple observations will be combined, not overwrite each other.

If reducing dimensionality in a Processor, this PR assumes that a separate Processor implementation would wrap around this basic Processor implementation. The "reducer" Processor would filter the label set just as the Accumulator would, and again the basic Processor would see multiple observations w/ identical label sets. Again, the correct output is to combine these.

@open-telemetry/specs-metrics-approvers please consider.

jmacd · 2020-08-10T22:03:55Z

@open-telemetry/go-approvers I believe I have demonstrated in two ways that this is a fix for undesirable behavior: #1023 and #1048 both depend on this fix. Please review.

sdk/metric/processor/basic/basic.go

Co-authored-by: Anthony Mirabella <[email protected]>

sdk/metric/processor/basic/basic.go

Co-authored-by: Tyler Yahn <[email protected]>

…y#1024) * Add regexp filter in api/label, test * Add regexp option to sdk.Config * Return indistinct values only when keyRe != nil * Filter in sdk * Add an accumulator filter test * SDK tests pass * Precommit * Undo set filters * Backout related filter changes * Add a new test * Fix build * Apply suggestions from code review Co-authored-by: Anthony Mirabella <[email protected]> * Update comments * Apply suggestions from code review Co-authored-by: Tyler Yahn <[email protected]> Co-authored-by: Anthony Mirabella <[email protected]> Co-authored-by: Tyler Yahn <[email protected]>

jmacd added 10 commits August 2, 2020 00:18

Add regexp filter in api/label, test

2b5e05e

Add regexp option to sdk.Config

f28519c

Return indistinct values only when keyRe != nil

e5c198b

Filter in sdk

00fad70

Add an accumulator filter test

40eaa62

SDK tests pass

ff248f5

Precommit

01720a8

Undo set filters

45746bf

Backout related filter changes

8674206

Add a new test

d8c5809

jmacd added the area:metrics Part of OpenTelemetry Metrics label Aug 3, 2020

jmacd requested review from Aneurysm9, evantorrie, lizthegrey and MrAlias as code owners August 3, 2020 21:37

jmacd mentioned this pull request Aug 5, 2020

Move content length out of basic attributes #1031

Merged

Upstream

8d9dfda

jmacd mentioned this pull request Aug 9, 2020

Add support for filtering metric label sets #1047

Merged

Upstream

e6f2cf0

jmacd mentioned this pull request Aug 9, 2020

Metric label.Set dimensionality reducer Processor #1048

Closed

Upstream

be69113

Fix build

1b2fc09

Aneurysm9 reviewed Aug 11, 2020

View reviewed changes

jmacd and others added 2 commits August 10, 2020 21:46

Apply suggestions from code review

ffcef4d

Co-authored-by: Anthony Mirabella <[email protected]>

Update comments

c89cf0a

MrAlias approved these changes Aug 11, 2020

View reviewed changes

sdk/metric/processor/basic/basic.go Outdated Show resolved Hide resolved

sdk/metric/processor/basic/basic.go Outdated Show resolved Hide resolved

Aneurysm9 approved these changes Aug 11, 2020

View reviewed changes

jmacd and others added 2 commits August 11, 2020 10:16

Apply suggestions from code review

0d83a7d

Co-authored-by: Tyler Yahn <[email protected]>

Merge branch 'master' into jmacd/862

db0571c

jmacd merged commit 3a05cd9 into open-telemetry:master Aug 11, 2020

jmacd deleted the jmacd/862 branch August 11, 2020 17:27

jmacd mentioned this pull request Aug 11, 2020

Prototype metric Views functionality: Export labels configuration #1054

Closed

Aneurysm9 mentioned this pull request Aug 24, 2020

Release v0.11.0 #1089

Merged

pellared added this to the untracked milestone Nov 8, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Change Metric Processor to merge multiple observations #1024

Change Metric Processor to merge multiple observations #1024

jmacd commented Aug 3, 2020 •

edited

Loading

jmacd commented Aug 4, 2020 •

edited

Loading

jmacd commented Aug 8, 2020

jmacd commented Aug 10, 2020

Change Metric Processor to merge multiple observations #1024

Change Metric Processor to merge multiple observations #1024

Conversation

jmacd commented Aug 3, 2020 • edited Loading

jmacd commented Aug 4, 2020 • edited Loading

jmacd commented Aug 8, 2020

jmacd commented Aug 10, 2020

jmacd commented Aug 3, 2020 •

edited

Loading

jmacd commented Aug 4, 2020 •

edited

Loading