Clarification required for Span.setStatus #1584

iNikem · 2021-03-29T12:40:46Z

One specific situation is currently absent from https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/trace/api.md#set-status. Namely, imagine we have a web application running with Otel auto-instrumentation. Instrumentation creates a SERVER span for an incoming request. During this request, some exception is thrown, which sets response code to 400. But framework provides an exception callback, during which application developer retrieves the SERVER span from the currently active context and sets its status to OK, based on some application logic.

Now, when auto-instrumentation is going to end this SERVER span, then based on http semantic convention it MUST set spans status to ERROR. And spec referenced above allows it to do so. But I think the intention was to allow application developer or operator to override any automatic decision by manually setting span status to OK. Should this mean that auto-instrumentation MUST NOT override span status OK? Please note also, that currently there is no way to read span status from Otel API.

The text was updated successfully, but these errors were encountered:

carlosalberto · 2021-03-29T14:46:31Z

But I think the intention was to allow application developer or operator to override any automatic decision by manually setting span status to OK.

That's correct, based on my understanding. @tedsuo ?

lkrzyzanek · 2021-03-31T08:26:03Z

It's quite common practice to use 400 (bad request) on REST API to indicate that client provided wrong input parameters.

We use it for any kind of validation of input parameters including also some business logic like to find appropriate records in DB. That's why it's quite important to have possibility to indicate that 400 is not an error.

pyohannes · 2021-03-31T22:59:49Z

I am confused by this too. I have an use case where HTTP auto-instrumentation returns a 404, but the application can handle that gracefully (by hitting a fallback endpoint), or in some cases even expects a 404.

I was wondering about this sentence in the spec:

Analysis tools SHOULD respond to an Ok status by suppressing any errors they would otherwise generate. For example, to suppress noisy errors such as 404s.

To me it's not clear whether that sentence refers to the HTTP span itself, to its parent span, or to any span in the trace.

So, given a trace with two spans:

Span 1: Unset, Span 2: Error. The backend/analysis tool should treat this as an error.
Span 1: OK, Span 2: Error. The backend/analysis tool should not treat this as an error, but suppress the error status on Span 2?

I also posted a related question on slack.

reyang · 2021-04-02T15:39:59Z

Discussed during the Friday issue triage meeting, assigned this to @carlosalberto, it might take some time since this seems to be a challenging topic.

Oberon00 · 2021-04-06T12:02:30Z

Please also consider the original OTEP https://github.com/open-telemetry/oteps/blob/main/text/trace/0136-error_flagging.md which has suggestions on how to handle this with the Status Source attribute of the status.

iNikem · 2021-04-06T16:47:49Z

If Span's status were readable attribute then we could have a very simple solution: "if span has OK status set, then auto-instrumentations MUST NOT override it". That's it, isn't it?

Oberon00 · 2021-04-07T15:36:41Z

How would one put the OK status in the same span that the auto-instrumentation would set the status of?

iNikem · 2021-04-07T16:09:29Z

How would one put the OK status in the same span that the auto-instrumentation would set the status of?

My original example describes exactly this situation. Auto-instrumentation sets the status of the span, which corresponds to the incoming http request, after/during the response is being sent out. Application developer can grab the current span in their code while processing that request and set span's status. Way before auto-instrumentation attempts to do so.

Oberon00 · 2021-04-07T16:28:11Z

Sorry for not reading properly 🙈

I think this example is perfectly solved in the original, accepted but only partially spec'd out OTEP: The user would call SetStatus(Status.OK, StatusSource.USER) and the instrumenation would then later call SetStatus(Status.ERROR, StatusSource.INSTRUMENTATION). Then the backend should normally prefer the USER status "as it is a communication from the application developer or operator and contains valuable information".

CC @tedsuo

iNikem · 2021-04-07T17:19:28Z

Status source will work as well, certainly. I still think that my proposal is easier :)

Oberon00 · 2021-04-08T08:14:00Z

Reading the status would violate the design principle that spans are write-only (except for SpanContext which is immutable, fixed from creation, and required for propagation). Also, every instrumentation would need to check the status and if it forgets, the user-status is overwritten.

A solution somewhere in the middle would be to provide a SetStatusIfUnset API.

iNikem · 2021-04-08T08:45:53Z

All three solutions work for me :)

carlosalberto · 2021-04-13T14:00:28Z

Status source will work as well, certainly. I still think that my proposal is easier :)

I also like your solution more, but I'm afraid that providing a getter for Status may be hard (impossible), e.g. a streaming implementation (besides what @Oberon00 already mentioned regarding the write-only design).

I'd be up for adding the Instrumentation part to if languages can offer this without breaking their APIs, which I'm not totally sure it's feasible (might need to ask maintainers).

SetStatusIfUnset looks not as clean but it might be a good trade off definitely. I will bring this up in our SIG call today ;)

tedsuo · 2021-04-13T16:00:12Z

From the spec meeting today:

We discussed the options. There is general agreement that adding state to the API (via a status getter) is not a feasible solution.

We discussed adding a StatusSource field with two states, USER and INSTRUMENTATION. The USER status always wins. In practice, we add this feature by changing the current SetStatus function to set the source as USER, to avoid breaking user code. An additional SetStatus method or option would be added to set INSTRUMENTATION as the source.

Adding StatusSource has been approved as a solution to this problem, but we are reluctant to add this complexity and extra state until it is proven that we need it.

As a simpler solution, we are proposing that we add a SetStatus option to End. This status option would only be applied when the status is currently UNSET. All instrumentation would then be required to use this option. This allows the user to override instrumentation when setting the status. This incremental approach would be compatible with adding StatusSource at a later date, if we determine that we need it.

@iNikem volunteered to investigate whether adding a SetStatus option on End is a feasible solution.

Investigate whether the current Java instrumentation could meet the requirement of only setting the status when the span ends.
Investigate whether exception handling works with this solution.

It would be helpful to check other languages as well, and confirm that this solution is feasible before we try it. Please chime in if you, dear reader, can spend some time this month investigating this issue.

pyohannes · 2021-04-13T22:59:04Z

@tedsuo I have use cases where StatusSource would help, but the simpler solution (overriding status on End) wouldn't.

One use case is overriding error status of HTTP client instrumentations. With several HTTP client instrumentations I worked with there is no way to access the span that is created by the instrumentation. Furthermore there might be scenarios where application developers use libraries which in turn use instrumented HTTP clients - it would be very hard for them to directly access those spans across libraries.

StatusSource provides a way to override status without having direct access to spans created by instrumentations, and this is a need users have already expressed to me.

iNikem · 2021-04-14T04:50:11Z

StatusSource provides a way to override status without having direct access to spans created by instrumentations

@pyohannes Can you elaborate on this, please? StatutSource still needs to be passed to a span, how will it help you if you don't access to spans?

pyohannes · 2021-04-14T23:38:00Z

Can you elaborate on this, please? StatutSource still needs to be passed to a span, how will it help you if you don't access to spans?

@iNikem On backend or exporter side (maybe triggered by configuration) one could say: ignore error status values when StatusSource is INSTRUMENTATION, but honor the error status when the StatusSource is USER.

That is how I interpret this sentence in the OTEP:

Analysis tools MAY disregard status codes, in favor of their own approach to error analysis. However, it is strongly suggested that analysis tools SHOULD pay attention to the status codes when set by USER, as it is a communication from the application developer or operator and contains valuable information.

Then any instrumentation might set error status codes, but the backend configured in that way will only mark traces as failed when there's an error status code where StatusSource=USER.

Not the most elegant. But I think it can get very tedious for application developers to re-set status codes of instrumentations.

iNikem · 2021-04-15T06:45:32Z

If backend "ignore error status values when StatusSource is INSTRUMENTATION", then it means all application developer has to set status code whenever they expect the span to be an error. This is even more work than re-setting status when there is a conflict between developer expectation and auto-instrumentation decision.

pyohannes · 2021-04-15T17:46:03Z

If backend "ignore error status values when StatusSource is INSTRUMENTATION", then it means all application developer has to set status code whenever they expect the span to be an error.

The backend could "ignore error status values when StatusSource is INSTRUMENTATION only if there is a span in the trace with StatusSource=USER and Status=OK".

However, another possibility without the need for StatusSource would be to "ignore error status values if there is a span in the trace with Status=OK". I wonder if this section of the current specification could be interpreted in this way; I tried to get clarification on this, with no success yet:

Analysis tools SHOULD respond to an Ok status by suppressing any errors they would otherwise generate. For example, to suppress noisy errors such as 404s.

@iNikem Do you think a solution in those lines could solve your problem? Otherwise I'll open another issue to not mix up different problems and requirements.

iNikem · 2021-04-15T19:03:00Z

I think that ignoring status of one span based on the status of another span is a VERY special case and certainly is not something that Otel specification can recommend.

iNikem · 2021-05-04T13:11:03Z

As a simpler solution, we are proposing that we add a SetStatus option to End. This status option would only be applied when the status is currently UNSET. All instrumentation would then be required to use this option. This allows the user to override instrumentation when setting the status. This incremental approach would be compatible with adding StatusSource at a later date, if we determine that we need it.

@iNikem volunteered to investigate whether adding a SetStatus option on End is a feasible solution.

Investigate whether the current Java instrumentation could meet the requirement of only setting the status when the span ends.

Investigate whether exception handling works with this solution.

I have a POC changes as described above in Otel Java API/SDK and Java instrumentation repo. They demonstrate that the approach above can work. Only one or two instrumentations out of many in Java instrumentation repo may require somewhat elaborate changes, in the vast majority of cases changes are trivial.

During implementation and following discussion with @jkwatson we found that:

Having Span.end(Status) in API may confuse application developers into thinking that they should use this method to end a span, instead of Span.end() which should be used in the majority of cases.
A simpler approach may be to change Span.setStatus method behaviour to "freeze" status if it was called with OK and ignore any later calls to setStatus.

In both cases the behaviour of API methods may be surprising to the end-users, because in some cases provided Status will be ignored. In both cases we have to rely on documentation to clear confusion.

carlosalberto · 2021-05-04T17:14:51Z

From the SIG meeting: a take on option B (freezing Span.setStatus) will be done in the DotNet SIG, and after that the change will be proposed against the Specification.

iNikem · 2021-05-05T09:00:29Z

For the record, my POC of option A above:
API/SDK
Instrumentations

reyang · 2021-05-06T07:12:54Z

.NET PoC of Option B is here.
We've discussed during the OpenTelemetry .NET SIG meeting and folks are good with this approach.

jkwatson · 2021-05-07T21:54:16Z

I'm wondering, if we go with option B and the SDK is now going to start enforcing these ideas, should the SDK also ignore any explicit call to set the status to UNSET?

iNikem added the spec:trace Related to the specification/trace directory label Mar 29, 2021

reyang assigned reyang and carlosalberto and unassigned reyang Apr 2, 2021

Oberon00 added area:api Cross language API specification issue area:error-reporting Related to error reporting labels Apr 7, 2021

reyang mentioned this issue May 4, 2021

POC - ignore SetStatus is the status is already set to Ok open-telemetry/opentelemetry-dotnet#2032

Closed

3 tasks

This was referenced May 8, 2021

Servlet instrumentation overwrites setStatus that was set manually earlier open-telemetry/opentelemetry-java-instrumentation#2929

Closed

Add clarifications to SetStatus #1685

Merged

SergeyKanzhelev closed this as completed in #1685 Jun 8, 2021

BenHewins mentioned this issue Jun 12, 2023

SetStatus will overwrite OK with Error, which it should not do according to the spec open-telemetry/opentelemetry-dotnet#4582

Closed

svrnm mentioned this issue Dec 14, 2023

Specification & OTEP relationship: Provide references to implemented OTEPs #3792

Open

46 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Clarification required for Span.setStatus #1584

Clarification required for Span.setStatus #1584

iNikem commented Mar 29, 2021

carlosalberto commented Mar 29, 2021

lkrzyzanek commented Mar 31, 2021

pyohannes commented Mar 31, 2021

reyang commented Apr 2, 2021

Oberon00 commented Apr 6, 2021

iNikem commented Apr 6, 2021 •

edited

Loading

Oberon00 commented Apr 7, 2021

iNikem commented Apr 7, 2021

Oberon00 commented Apr 7, 2021 •

edited

Loading

iNikem commented Apr 7, 2021

Oberon00 commented Apr 8, 2021

iNikem commented Apr 8, 2021

carlosalberto commented Apr 13, 2021

tedsuo commented Apr 13, 2021 •

edited

Loading

pyohannes commented Apr 13, 2021

iNikem commented Apr 14, 2021

pyohannes commented Apr 14, 2021

iNikem commented Apr 15, 2021

pyohannes commented Apr 15, 2021

iNikem commented Apr 15, 2021

iNikem commented May 4, 2021

carlosalberto commented May 4, 2021 •

edited

Loading

iNikem commented May 5, 2021

reyang commented May 6, 2021

jkwatson commented May 7, 2021

Clarification required for Span.setStatus #1584

Clarification required for Span.setStatus #1584

Comments

iNikem commented Mar 29, 2021

carlosalberto commented Mar 29, 2021

lkrzyzanek commented Mar 31, 2021

pyohannes commented Mar 31, 2021

reyang commented Apr 2, 2021

Oberon00 commented Apr 6, 2021

iNikem commented Apr 6, 2021 • edited Loading

Oberon00 commented Apr 7, 2021

iNikem commented Apr 7, 2021

Oberon00 commented Apr 7, 2021 • edited Loading

iNikem commented Apr 7, 2021

Oberon00 commented Apr 8, 2021

iNikem commented Apr 8, 2021

carlosalberto commented Apr 13, 2021

tedsuo commented Apr 13, 2021 • edited Loading

pyohannes commented Apr 13, 2021

iNikem commented Apr 14, 2021

pyohannes commented Apr 14, 2021

iNikem commented Apr 15, 2021

pyohannes commented Apr 15, 2021

iNikem commented Apr 15, 2021

iNikem commented May 4, 2021

carlosalberto commented May 4, 2021 • edited Loading

iNikem commented May 5, 2021

reyang commented May 6, 2021

jkwatson commented May 7, 2021

iNikem commented Apr 6, 2021 •

edited

Loading

Oberon00 commented Apr 7, 2021 •

edited

Loading

tedsuo commented Apr 13, 2021 •

edited

Loading

carlosalberto commented May 4, 2021 •

edited

Loading