Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Specify the treatment of duplicate instruments and observations #2270

Closed
wants to merge 4 commits into from

Conversation

jmacd
Copy link
Contributor

@jmacd jmacd commented Jan 18, 2022

Fixes #2226
Fixes #2249

Changes

This changes the recommendation to return an error for duplicate instrument registration into a case-by-case which permits identically-registered instruments and specifies which fields are considered. Implementations are not required to return the same instances in these cases, only to return valid instruments.

Note that this changes a specified error into a non-error condition. We view this as OK because the semantics are preserved and the user should not be able to detect the difference.

This adds notes to the SDK specification to remind the implementation that despite this flexibility, they are still required to follow the single-writer rule.

Additional info

This was discussed in the 1/18/22 OTel Specification SIG.

@jmacd jmacd requested review from a team January 18, 2022 20:37
@jmacd jmacd assigned bogdandrutu and unassigned SergeyKanzhelev Jan 18, 2022
@jmacd jmacd added spec:metrics Related to the specification/metrics directory release:required-for-ga Must be resolved before GA release, or nice to have before GA labels Jan 18, 2022
Comment on lines +291 to +293
When asynchronous instruments are used with duplicate registrations,
meaning to have more than one callback provided, the order of callback
execution is unspecified. Callers SHOULD avoid making duplicate
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this mean that registering multiple callbacks in a single async instrument is officially supported? This is related to issue #2249 - it is currently unclear whether this behavior is allowed; and the Java SDK rejects duplicate callbacks and only calls the first one.

Copy link
Contributor Author

@jmacd jmacd Jan 19, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That was my intention, yes. There's a historical perspective on this: OpenCensus permitted duplicate instrument registration and it was part of the early thinking for synchronous instruments. The only reason duplicate-identical instruments ever seemed contentious to me is due to asynchronous instruments. The question is whether users are harmed or helped by allowing this, and I didn't have an answer to this question.

I believe #2249 is suggesting the answer ought to favor allowing duplicate registration and multiple callbacks. I have come to adopt this position as well, but it is entirely driven by thinking this is best for users, i.e., that it creates less confusion. This is independent of the question of whether and how an SDK is required to treat duplicate observations. That is, even without duplicate callbacks the SDK should be doing something to prevent duplicate observations from being double-counted.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe #2249 is suggesting the answer ought to favor allowing duplicate registration and multiple callbacks. I have come to adopt this position as well, but it is entirely driven by thinking this is best for users, i.e., that it creates less confusion.

Yes, I can agree with that -- having recently implemented a bridge from another metrics API to OTel, I can confirm that not being able to register multiple callbacks just forces the instrumentation author to manage that themselves. And that can get unwieldy and hacky.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the "official" way of registering multiple callbacks going to be multiple registration of asynchronous instruments? If multiple callbacks are going to be supported and the user wants them, shouldn't the user be allowed to simply to add more than one callback when creating an asynchronous instrument?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ocelotl I agree this would be awkward. I think it is not meant to be the official way, but is just how multiple registrations should be handled as a specific case. See #2281, specifically "The API MUST provide a way to associate instruments with callback functions responsible for reporting the associated".

floating point numbers SHOULD be considered. Equivalently,
implementations SHOULD NOT support an application in emitting both
integer and floating point numbers for the same metric inside the same
instrumentation library.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

under (belonging to?) the same Meter? See #2276


Different Meters MUST be treated as separate namespaces. The names of the
Instruments under one Meter SHOULD NOT interfere with Instruments under
another Meter.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this compatible with the following phrasing describing meters?

It is unspecified whether or under which conditions the same or different Meter instances are returned from this functions.

If its undefined whether or not a meter provider must return the same or a separate instance of a meter for the same name, then an implementation might end up with two different meters for the same name. This would allow instruments to have the same name, but different type and unit under meters which have the same name.

We could resolve this by adding language that clarified that the identity of the meter is tied to the meter name, similar to the new phrasing here "Instruments are associated with the Meter during creation and are
identified by instrument name".

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does this mean?

the identity of the meter is tied to the meter name

If 2 or more different meters may have the same name then how can the meter name be used to identify a meter?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If 2 or more different meters may have the same name then how can the meter name be used to identify a meter?

I should be able to obtain a meter by the same name multiple times. And from the meter (or meters - it shouldn't matter whether the same exact instance is returned) I should be able to obtain the same instrument multiple times.

Using an example, the following should work without any errors, and with all calls to add(..) recording against the same metric stream:

meterProvider.get("my-meter").counterBuilder("my-counter").build().add(10);
meterProvider.get("my-meter").counterBuilder("my-counter").build().add(10);

Meter myMeter =  meterProvider.get("my-meter");
myMeter.counterBuilder("my-counter").build().add(10);
myMeter.counterBuilder("my-counter").build().add(10);

identical instrument when a duplicate instrument is registered, but
assuming it does allow separate instances to co-exist, the
implementation MUST eliminate the duplication at a later point using
the [natural merge function](#opentelemetry-protocol-data-model) for
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
the [natural merge function](#opentelemetry-protocol-data-model) for
the [natural merge function](datamodel.md#opentelemetry-protocol-data-model) for

This is in the other file.

implementations MUST NOT violate the single-writer rule even when
users make duplicate observations. This is also covered in the
[supplemental guidelines for handling asynchronous instrument
views](#asynchronous-example-attribute-removal-in-a-view).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
views](#asynchronous-example-attribute-removal-in-a-view).
views](supplementary-guidelines.md#asynchronous-example-attribute-removal-in-a-view).

@@ -952,3 +952,29 @@ called concurrently.

**MetricExporter** - `ForceFlush` and `Shutdown` are safe to be called
concurrently.

## Data model requirements
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You'll need to update the ToC to make the check pass (make markdown-toc).

@jmacd
Copy link
Contributor Author

jmacd commented Jan 25, 2022

From zoom chat during the 1/25/22 Spec SIG:

Jack Berg to Everyone (8:37 AM)

Couldn’t find a good moment to interject, but the duplicate callbacks change actually resolves a recurring outstanding question with the java SDK. Java allows you to obtain the same instrument from the same meter multiple times, provided that the instrument name, type, unit, are compatible. This tracks and makes sense for synchronous instruments, but doesn’t work for async unless multiple callbacks are allowed.

Daniel Dyla (Dynatrace) to Everyone (8:38 AM)

Ditto JavaScript ^

When determining whether a duplicate registration is valid,
language-level features such the distinction between integer and
floating point numbers SHOULD be considered. Equivalently,
implementations SHOULD NOT support an application in emitting both
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps "MUST NOT"?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See #2317, major changes.

@@ -239,6 +258,11 @@ instrument. It MUST be treated as an opaque string from the API and SDK.
* It MUST support at least 1023 characters. [OpenTelemetry
API](../overview.md#api) authors MAY decide if they want to support more.

Note the `description` property of an instrument is explicitly
disregarded when considering duplicate registration, because it is not
semantic in nature. Implementations SHOULD use any of the provided
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
semantic in nature. Implementations SHOULD use any of the provided
semantic in nature. Implementations MAY use any of the provided

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've changed this detail in #2317. I think it's fine to be strict about description, as long as we're tolerating duplicate identities.

It is unspecified whether or under which conditions the same or different
instrument instance will be returned as a result of duplicate registration.
* If the registration is semantically not identical, the
implementation MUST return a registration error to the user when the
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am wondering if instrument registration is considered part of the initialization. 🤔
I am worried that this might be a bit unwieldy to implement with the requirement not to throw after initialization in this spec.

The API or SDK may fail fast and cause the application to fail on initialization, e.g. because of a bad user config or environment, but MUST NOT cause the application to fail later at run time, e.g. due to dynamic config settings received from the Collector.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instrument registration is not part of initialization because libraries can get loaded after the initialization is done, and the library being loaded could register more instruments.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Following the discussion in #2226, I've updated this part of the proposal in a new PR please see #2317.

<a name="instrument-namespace"></a>

Different Meters MUST be treated as separate namespaces. The names of the
Instruments under one Meter SHOULD NOT interfere with Instruments under
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This recommendation would be better declared as a requirement. There does not seem to be a situation where there is a valid reason to ignore this recommendation and doing so would present compatibility issues. If one OTel implementation allows this interference a user will not be able to implement an application in that language the a same as another because they cannot use the same instruments.

Suggested change
Instruments under one Meter SHOULD NOT interfere with Instruments under
Instruments under one Meter MUST NOT interfere with Instruments under

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in #2317

Comment on lines +297 to +298
set, the caller SHOULD expect the last-observed value to be the one
recorded. This is sometimes referred to as "last-value wins" for
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make the normative statement be a positive one about OTel implementation behavior, not user expectations.

Suggested change
set, the caller SHOULD expect the last-observed value to be the one
recorded. This is sometimes referred to as "last-value wins" for
set, the last-observed value observed SHOULD be the one
recorded. This is sometimes referred to as "last-value wins" for

Also, is there a valid reason to not use the last value observed? Should this be a requirement and not a recommendation? Not using the last value observed would lead to inconsistent behavior across OTel implementations. Meaning having this as a requirement would make sense.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On this note please see #2318. I removed this paragraph in a future iteration, simply stating at this point in the document that the behavior is unspecified. I view this being unspecified is a problem, but I don't want to fix it here.


## Data model requirements

The implementation is required to respect the OpenTelemetry Metrics
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
The implementation is required to respect the OpenTelemetry Metrics
The implementation is REQUIRED to respect the OpenTelemetry Metrics

It is unspecified whether or under which conditions the same or different
instrument instance will be returned as a result of duplicate registration.
* If the registration is semantically not identical, the
implementation MUST return a registration error to the user when the
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

... the implementation MUST return a registration error...

Is the use of "return an error" language-specific? I know that there are languages whose error handling consists of returning errors, but others raise exceptions.

Is the intention of this part of the specification to indicate something must me returned or that the error can be handled in a way that better suits every implementation?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See #2317. There's still an error, it's just a question of where it gets surfaced.


* If the registration is semantically identical, meaning to have the
same `kind` and `unit` as a previously registered instrument of the
same `name`, the implementation MUST return a valid instrument.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is a non-valid instrument?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed in #2317

It is unspecified whether or under which conditions the same or different
instrument instance will be returned as a result of duplicate registration.
* If the registration is semantically not identical, the
implementation MUST return a registration error to the user when the
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is a registration error?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Eliminated in #2317 😀

instrument is constructed.

When determining whether a duplicate registration is valid,
language-level features such the distinction between integer and
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
language-level features such the distinction between integer and
language-level features such as the distinction between integer and

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in #2317

Copy link
Contributor

@ocelotl ocelotl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left some questions ✌️


* If the registration is semantically identical, meaning to have the
same `kind` and `unit` as a previously registered instrument of the
same `name`, the implementation MUST return a valid instrument.
Copy link
Member

@reyang reyang Jan 29, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"the implementation MUST return a valid instrument" would this be a breaking change given the API spec is already stable?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Take this to #2317
I believe we can change a condition formerly specified as error into correct behavior. To me, the error state was effectively unspecified.

@jmacd
Copy link
Contributor Author

jmacd commented Feb 2, 2022

All, I've read through the feedback above and will work to clarify. Note that in #2226 there is now a developing discussion about how to handle duplicate-conflicting instrument registration. I'm willing to adjust the SDK specification to "tolerate" duplicate-conflicting instrument registration by treating instrument kind as part of metric identify for the purposes of the SDK; the specification would be to emit a warning and pass-through streams of both type. Please continue this discussion in #2226 until the objection raised by @anuraaga is settled. Thank you.

@jmacd
Copy link
Contributor Author

jmacd commented Feb 2, 2022

Reviewers, I am closing this as noted #2226 (comment)

The feedback has been well taken and will be addressed. I will RE-OPEN this PR when the new draft is ready.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
release:required-for-ga Must be resolved before GA release, or nice to have before GA spec:metrics Related to the specification/metrics directory
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Clarify whether multiple callbacks are allowed for async instruments Meter and instrument identity