Update 0003-measure-metric-type to match current spec #61

Conversation
I'm not sure what good it does to update OTEPs. Do you see an OTEP as a source of valuable information beyond the fact that the feature was approved for implementation? In my mind OTEPs are proposals/discussions of features, and the actual spec is in the specs repository, so a name mismatch is a totally normal situation. I'm happy to approve this PR if needed.
@bogdandrutu asked me to update the OTEPs.
| Field | Description |
|------|-----------|
| Name | A string. |
| Kind | One of Cumulative, Gauge, or Measure. |
| Required Keys | List of always-defined keys in handles for this metric. |
| Recommended Keys | Default aggregation keys. |
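For reference, a minimal Go sketch of a descriptor carrying these fields might look like the following; the type and field names are illustrative only and are not the actual OpenTelemetry API:

```go
// Illustrative only: a minimal descriptor shape matching the table above.
// These names are not taken from the OpenTelemetry metrics API.
type Kind int

const (
	Cumulative Kind = iota
	Gauge
	Measure
)

type Descriptor struct {
	Name            string   // the metric name
	Kind            Kind     // Cumulative, Gauge, or Measure
	RequiredKeys    []string // always defined in handles for this metric
	RecommendedKeys []string // default aggregation keys
}
```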
I need to better understand the difference between recommended vs required keys. It feels to me that the whole metrics API deviated a bit more than expected with the LabelSet and got pushed more towards statsd than I feel comfortable with (this is mostly coming from the Go implementation that I looked at).

Allowing users to call `getHandle` and `add`/`set`/`record` with a `LabelSet` that is larger than the "required keys" removes the optimization where this operation can be turned into a simple atomic increment for the most common case (default aggregation using recommended/required keys).
As an example:

- I want a counter with "recommended keys" -> `queue_topic` (which has 3 possible values: `value_1`, `value_2`, `value_3`).
- Let's assume the user wants to build the metric with only the "recommended keys" - which I expect to be the common case.
- If I call `add` with a `LabelSet` that includes `queue_topic` and `queue_name`, then I need to do an expensive operation to extract the `queue_topic` from the `LabelSet`, then look for the Handle, then update the atomic variable (even if the `LabelSet` includes only the `queue_topic` I need to look for the value and compare strings). This will definitely be extra overhead for the common case (see the sketch after this list).
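To make the concern concrete, here is a rough, hypothetical sketch of the per-call work being described; the function and variable names are made up for illustration and are not the actual SDK code:

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// addNaively models the feared per-call cost: on every add, the recommended
// key's value is extracted from the label set (string lookup and comparison),
// then the matching record is found, and only then the atomic increment runs.
// This is a hypothetical sketch, not the opentelemetry-go implementation.
func addNaively(agg map[string]*int64, labels map[string]string, recommendedKey string, delta int64) {
	topic := labels[recommendedKey] // per-call extraction of queue_topic
	record, ok := agg[topic]        // per-call lookup of the handle/record
	if !ok {
		record = new(int64)
		agg[topic] = record
	}
	atomic.AddInt64(record, delta) // the cheap part comes last
}

func main() {
	agg := map[string]*int64{}
	labels := map[string]string{"queue_topic": "value_1", "queue_name": "orders"}
	addNaively(agg, labels, "queue_topic", 1)
	fmt.Println(*agg["value_1"]) // 1
}
```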
I would like to confirm, as I mentioned before, that Prometheus-like performance can be achieved for the "required/recommended keys" and default aggregation with our current API, while allowing more extensibility and custom aggregations.

This comment probably applies to the whole `LabelSet` concept. I feel that the current API spec is focused on the fact that the entire `LabelSet` will just be serialized on the wire, which is different from what I think an optimized metrics implementation should do, which is to pre-aggregate metrics and drop all the extra labels that are not necessary.

Anyway, I don't want to block this PR with this comment, but I want to have this documented and addressed before we move forward with the `LabelSet` idea (which I do like from the perspective of having an implementation that is performant for statsd, but I do not like that it adds unnecessary overhead for something like Prometheus).
> If I call `add` with a `LabelSet` that includes `queue_topic` and `queue_name`, then I need to do an expensive operation to extract the `queue_topic` from the `LabelSet`, then look for the Handle, then update the atomic variable (even if the `LabelSet` includes only the `queue_topic` I need to look for the value and compare strings). This will definitely be extra overhead for the common case.
If you add/set/record with a handle that has more labels than the recommended labels for the instrument in question, you still get an atomic operation (assuming the aggregation supports it; some measure aggregations do not). The atomic operation updates a record that is bound to the pair of (descriptor, labelset). There is no additional computation in the instrumentation code path.

On the collect/export code path, the implementation "groups" all the records for a particular descriptor, meaning it folds results from several label sets into one aggregate result. So you do have additional cost, but it is not in the fast path, it's in the export path. There's a possibility to compute and cache a mapping from (Descriptor, LabelSet) to ordinal recommended-key positions, which would mostly alleviate the computation here. It's a simple matter of running through the label sets, mapping into the unique, ordered recommended-key labels, and then merging aggregates. [I haven't implemented this cache; it's a NOTE in the code.]
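A toy sketch of that collect/export grouping, using made-up `Record` and key-encoding types rather than the actual exporter API: records keyed by their full label sets are projected onto a chosen set of group keys and their sums merged.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// Record is a toy stand-in for an exported record: the full label set the
// instrument was updated with, plus its accumulated sum. Not the actual API.
type Record struct {
	Labels map[string]string
	Sum    int64
}

// groupBy folds records into aggregates keyed only by the given group keys
// (e.g. the descriptor's recommended keys). Extra labels are dropped here,
// on the export path, not in the instrumentation fast path.
func groupBy(records []Record, groupKeys []string) map[string]int64 {
	sort.Strings(groupKeys)
	out := map[string]int64{}
	for _, r := range records {
		parts := make([]string, 0, len(groupKeys))
		for _, k := range groupKeys {
			parts = append(parts, k+"="+r.Labels[k])
		}
		out[strings.Join(parts, ",")] += r.Sum
	}
	return out
}

func main() {
	records := []Record{
		{Labels: map[string]string{"queue_topic": "value_1", "queue_name": "orders"}, Sum: 3},
		{Labels: map[string]string{"queue_topic": "value_1", "queue_name": "billing"}, Sum: 2},
		{Labels: map[string]string{"queue_topic": "value_2", "queue_name": "orders"}, Sum: 5},
	}
	// Grouping by the recommended key merges the two value_1 records.
	fmt.Println(groupBy(records, []string{"queue_topic"}))
	// map[queue_topic=value_1:5 queue_topic=value_2:5]
}
```

The choice of group keys is essentially what distinguishes the two `export.Batcher` strategies mentioned later in this thread: grouping by the recommended keys folds extra labels away, while grouping by every key in the label set keeps them.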
Now consider the benefit. You suggested a `LabelSet` that includes both `queue_topic` (the recommended key) and `queue_name` (an additional label). During the collect/export pass, the implementation will merge records from different label sets according to their value for `queue_topic`, but we have new information that could be very useful to the user. We're able to derive, for each of the aggregate results, the distribution of `queue_name`. This could be used to automatically ensure that a unique exemplar was supplied for each distinct `queue_name`, for example. This is a huge benefit, IMO.
Note that the cost of a handle `Add()` in the Go benchmarks stands at ~15ns (regardless of how many labels there are, compared with recommended labels), whereas a direct call to `Add()` with a label set stands at ~200ns. So using a handle is an order of magnitude cheaper than a direct call (also note that a direct call is much cheaper than constructing a label set, for realistic label sets).
It remains to be seen how large the impact of the additional CPU at collection time will be, but we cannot let the future metrics world be constrained by Prometheus, which has a very limited view of how we can use labels.
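For illustration, a toy model of the two call paths being compared, with made-up types rather than the opentelemetry-go API, showing why the bound-handle path can stay a single atomic increment:

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// Toy model only, not the opentelemetry-go API: a counter keeps one record
// per canonicalized label set.
type counter struct {
	records map[string]*int64 // key: canonicalized label set
}

// Direct call: pays a record lookup (or creation) on every Add.
func (c *counter) add(labelSet string, delta int64) {
	r, ok := c.records[labelSet]
	if !ok {
		r = new(int64)
		c.records[labelSet] = r
	}
	atomic.AddInt64(r, delta)
}

// A handle binds the record once; each Add is then just an atomic increment,
// no matter how many labels the bound label set contains.
type handle struct{ record *int64 }

func (c *counter) acquireHandle(labelSet string) handle {
	c.add(labelSet, 0) // ensure the record exists (toy shortcut)
	return handle{record: c.records[labelSet]}
}

func (h handle) add(delta int64) { atomic.AddInt64(h.record, delta) }

func main() {
	c := &counter{records: map[string]*int64{}}
	ls := "queue_topic=value_1,queue_name=orders"

	c.add(ls, 1) // direct call: lookup every time (~200ns in the benchmark)

	h := c.acquireHandle(ls) // bind once...
	for i := 0; i < 1000; i++ {
		h.add(1) // ...then only the atomic increment (~15ns in the benchmark)
	}
	fmt.Println(*c.records[ls]) // 1001
}
```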
Can I resolve this?
I need to see an implementation (e2e) before convincing myself that we are not making a wrong API decision. But I think we should close this for the moment; we have documented the concern and will revisit once we have an e2e implementation.
Maybe just open an issue with the conversation to track this.
Here's an e2e implementation. There are two `export.Batcher` implementations: one groups by recommended keys, the other groups by all keys in the label set.
open-telemetry/opentelemetry-go#265
@bogdandrutu I'm not sure I could summarize the issue you intend to track. Is it that LabelSets permit additional labels and you're worried about performance?
My 2 cents on this: Sometimes OTEPs contain comprehensive, summarized information in a form that does not exist in the specification. Applying OTEPs to the spec sometimes results in modifications in multiple places in multiple files using many PRs, and the big picture is not visible that way. For such cases I believe it is valuable to have the OTEP as one document and even reference it where appropriate from the spec, so that a particular small detail in an obscure corner of the spec becomes clearer. This is for example the case with my OTLP proposal, where protocol elements will be scattered among many .proto files, plus an explanation will be added in a separate doc in the proto repository (and the big picture will be more difficult to see). For some other OTEPs the entire contents of the OTEP is copied verbatim to the spec, in which case I agree the OTEP should be considered of historical interest only and probably even marked so somehow, with a note saying that the source of truth is now the spec. Updating the OTEP in such cases is, I think, unnecessary. I am not sure which of these cases applies to this particular OTEP; I haven't searched the spec thoroughly to see if the entire content of this OTEP also exists in the spec.
This PR has been open for a long time and it has 3 official approver LGTMs plus a couple of others coming from people interested in metrics. Also, this PR just clarifies some changes we agreed on in the specs. Entering God mode and merging this.
…/oteps#61)

* Propose approval / start round 3 of discussion
* Use Monotonic
* Updates to use Monotonic/NonMonotonic, and NonNegative/Signed
* Update 0003
* Minor
* Revert
* Revert
There were a few changes made to the current metrics spec that should be reflected here, since this OTEP was never set to approved.
The significant changes in this text to match the current spec are:
Meter
I believe this puts the OTEP in line with the spec; there is nothing else new here.