Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Specify the treatment of duplicate instruments and observations #2270

Closed
wants to merge 4 commits into from
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
50 changes: 42 additions & 8 deletions specification/metrics/api.md
Original file line number Diff line number Diff line change
Expand Up @@ -180,14 +180,33 @@ will have the following information:
* An optional `unit` of measure
* An optional `description`

Instruments are associated with the Meter during creation, and are identified by
the name:

* Meter implementations MUST return an error when multiple Instruments are
registered under the same Meter instance using the same name.
* Different Meters MUST be treated as separate namespaces. The names of the
Instruments under one Meter SHOULD NOT interfere with Instruments under
another Meter.
<a name="duplicate-instrument-handling"></a>

Instruments are associated with the Meter during creation and are
identified by instrument name. Duplicate registration of
identically-named instruments within a Meter is treated as follows:

* If the registration is semantically identical, meaning to have the
same `kind` and `unit` as a previously registered instrument of the
same `name`, the implementation MUST return a valid instrument.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is a non-valid instrument?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed in #2317

Copy link
Member

@reyang reyang Jan 29, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"the implementation MUST return a valid instrument" would this be a breaking change given the API spec is already stable?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Take this to #2317
I believe we can change a condition formerly specified as error into correct behavior. To me, the error state was effectively unspecified.

It is unspecified whether or under which conditions the same or different
instrument instance will be returned as a result of duplicate registration.
* If the registration is semantically not identical, the
anuraaga marked this conversation as resolved.
Show resolved Hide resolved
implementation MUST return a registration error to the user when the
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am wondering if instrument registration is considered part of the initialization. 🤔
I am worried that this might be a bit unwieldy to implement with the requirement not to throw after initialization in this spec.

The API or SDK may fail fast and cause the application to fail on initialization, e.g. because of a bad user config or environment, but MUST NOT cause the application to fail later at run time, e.g. due to dynamic config settings received from the Collector.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instrument registration is not part of initialization because libraries can get loaded after the initialization is done, and the library being loaded could register more instruments.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Following the discussion in #2226, I've updated this part of the proposal in a new PR please see #2317.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

... the implementation MUST return a registration error...

Is the use of "return an error" language-specific? I know that there are languages whose error handling consists of returning errors, but others raise exceptions.

Is the intention of this part of the specification to indicate something must me returned or that the error can be handled in a way that better suits every implementation?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See #2317. There's still an error, it's just a question of where it gets surfaced.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is a registration error?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Eliminated in #2317 😀

instrument is constructed.

When determining whether a duplicate registration is valid,
language-level features such the distinction between integer and
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
language-level features such the distinction between integer and
language-level features such as the distinction between integer and

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in #2317

floating point numbers SHOULD be considered. Equivalently,
implementations SHOULD NOT support an application in emitting both
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps "MUST NOT"?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See #2317, major changes.

integer and floating point numbers for the same metric inside the same
instrumentation library.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

under (belonging to?) the same Meter? See #2276


<a name="instrument-namespace"></a>

Different Meters MUST be treated as separate namespaces. The names of the
Instruments under one Meter SHOULD NOT interfere with Instruments under
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This recommendation would be better declared as a requirement. There does not seem to be a situation where there is a valid reason to ignore this recommendation and doing so would present compatibility issues. If one OTel implementation allows this interference a user will not be able to implement an application in that language the a same as another because they cannot use the same instruments.

Suggested change
Instruments under one Meter SHOULD NOT interfere with Instruments under
Instruments under one Meter MUST NOT interfere with Instruments under

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in #2317

another Meter.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this compatible with the following phrasing describing meters?

It is unspecified whether or under which conditions the same or different Meter instances are returned from this functions.

If its undefined whether or not a meter provider must return the same or a separate instance of a meter for the same name, then an implementation might end up with two different meters for the same name. This would allow instruments to have the same name, but different type and unit under meters which have the same name.

We could resolve this by adding language that clarified that the identity of the meter is tied to the meter name, similar to the new phrasing here "Instruments are associated with the Meter during creation and are
identified by instrument name".

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does this mean?

the identity of the meter is tied to the meter name

If 2 or more different meters may have the same name then how can the meter name be used to identify a meter?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If 2 or more different meters may have the same name then how can the meter name be used to identify a meter?

I should be able to obtain a meter by the same name multiple times. And from the meter (or meters - it shouldn't matter whether the same exact instance is returned) I should be able to obtain the same instrument multiple times.

Using an example, the following should work without any errors, and with all calls to add(..) recording against the same metric stream:

meterProvider.get("my-meter").counterBuilder("my-counter").build().add(10);
meterProvider.get("my-meter").counterBuilder("my-counter").build().add(10);

Meter myMeter =  meterProvider.get("my-meter");
myMeter.counterBuilder("my-counter").build().add(10);
myMeter.counterBuilder("my-counter").build().add(10);


<a name="instrument-naming-rule"></a>

Expand Down Expand Up @@ -239,6 +258,11 @@ instrument. It MUST be treated as an opaque string from the API and SDK.
* It MUST support at least 1023 characters. [OpenTelemetry
API](../overview.md#api) authors MAY decide if they want to support more.

Note the `description` property of an instrument is explicitly
disregarded when considering duplicate registration, because it is not
semantic in nature. Implementations SHOULD use any of the provided
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
semantic in nature. Implementations SHOULD use any of the provided
semantic in nature. Implementations MAY use any of the provided

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've changed this detail in #2317. I think it's fine to be strict about description, as long as we're tolerating duplicate identities.

`description` values when emitting metrics that had duplicate registrations.

Instruments can be categorized based on whether they are synchronous or
asynchronous:

Expand All @@ -264,6 +288,16 @@ Please note that the term _synchronous_ and _asynchronous_ have nothing to do
with the [asynchronous
pattern](https://en.wikipedia.org/wiki/Asynchronous_method_invocation).

When asynchronous instruments are used with duplicate registrations,
meaning to have more than one callback provided, the order of callback
execution is unspecified. Callers SHOULD avoid making duplicate
Comment on lines +291 to +293
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this mean that registering multiple callbacks in a single async instrument is officially supported? This is related to issue #2249 - it is currently unclear whether this behavior is allowed; and the Java SDK rejects duplicate callbacks and only calls the first one.

Copy link
Contributor Author

@jmacd jmacd Jan 19, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That was my intention, yes. There's a historical perspective on this: OpenCensus permitted duplicate instrument registration and it was part of the early thinking for synchronous instruments. The only reason duplicate-identical instruments ever seemed contentious to me is due to asynchronous instruments. The question is whether users are harmed or helped by allowing this, and I didn't have an answer to this question.

I believe #2249 is suggesting the answer ought to favor allowing duplicate registration and multiple callbacks. I have come to adopt this position as well, but it is entirely driven by thinking this is best for users, i.e., that it creates less confusion. This is independent of the question of whether and how an SDK is required to treat duplicate observations. That is, even without duplicate callbacks the SDK should be doing something to prevent duplicate observations from being double-counted.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe #2249 is suggesting the answer ought to favor allowing duplicate registration and multiple callbacks. I have come to adopt this position as well, but it is entirely driven by thinking this is best for users, i.e., that it creates less confusion.

Yes, I can agree with that -- having recently implemented a bridge from another metrics API to OTel, I can confirm that not being able to register multiple callbacks just forces the instrumentation author to manage that themselves. And that can get unwieldy and hacky.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the "official" way of registering multiple callbacks going to be multiple registration of asynchronous instruments? If multiple callbacks are going to be supported and the user wants them, shouldn't the user be allowed to simply to add more than one callback when creating an asynchronous instrument?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ocelotl I agree this would be awkward. I think it is not meant to be the official way, but is just how multiple registrations should be handled as a specific case. See #2281, specifically "The API MUST provide a way to associate instruments with callback functions responsible for reporting the associated".

observations from asynchronous instrument callbacks. However, if a
user accidentally makes duplicate observations, meaning to provide
more than one value for a given asynchronous instrument and attribute
set, the caller SHOULD expect the last-observed value to be the one
recorded. This is sometimes referred to as "last-value wins" for
Comment on lines +297 to +298
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make the normative statement be a positive one about OTel implementation behavior, not user expectations.

Suggested change
set, the caller SHOULD expect the last-observed value to be the one
recorded. This is sometimes referred to as "last-value wins" for
set, the last-observed value observed SHOULD be the one
recorded. This is sometimes referred to as "last-value wins" for

Also, is there a valid reason to not use the last value observed? Should this be a requirement and not a recommendation? Not using the last value observed would lead to inconsistent behavior across OTel implementations. Meaning having this as a requirement would make sense.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On this note please see #2318. I removed this paragraph in a future iteration, simply stating at this point in the document that the behavior is unspecified. I view this being unspecified is a problem, but I don't want to fix it here.

asynchronous instruments.

### Counter

`Counter` is a [synchronous Instrument](#synchronous-instrument) which supports
Expand Down
26 changes: 26 additions & 0 deletions specification/metrics/sdk.md
Original file line number Diff line number Diff line change
Expand Up @@ -952,3 +952,29 @@ called concurrently.

**MetricExporter** - `ForceFlush` and `Shutdown` are safe to be called
concurrently.

## Data model requirements
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You'll need to update the ToC to make the check pass (make markdown-toc).


The implementation is required to respect the OpenTelemetry Metrics
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
The implementation is required to respect the OpenTelemetry Metrics
The implementation is REQUIRED to respect the OpenTelemetry Metrics

data model [Single Writer](datamodel.md#single-writer) requirement.
This rule stipulates that the implementation MUST avoid creating
duplicate output streams from a given SDK.

This rule is the basis of the output-name uniqueness check in for
[Views](#view) above, and it also constrains how duplicate instrument
registration is handled.

For example, the implementation is not required to return the
identical instrument when a duplicate instrument is registered, but
assuming it does allow separate instances to co-exist, the
implementation MUST eliminate the duplication at a later point using
the [natural merge function](#opentelemetry-protocol-data-model) for
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
the [natural merge function](#opentelemetry-protocol-data-model) for
the [natural merge function](datamodel.md#opentelemetry-protocol-data-model) for

This is in the other file.

those data points, as otherwise it would risk violating the
single-writer rule.

As another example, users are encouraged not to make duplicate
observations from asynchronous instrument callbacks. However,
implementations MUST NOT violate the single-writer rule even when
users make duplicate observations. This is also covered in the
[supplemental guidelines for handling asynchronous instrument
views](#asynchronous-example-attribute-removal-in-a-view).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
views](#asynchronous-example-attribute-removal-in-a-view).
views](supplementary-guidelines.md#asynchronous-example-attribute-removal-in-a-view).