Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HTTP span status: use SHOULD instead of MUST for errors #1167

Merged
merged 8 commits into from
Jul 9, 2024

Conversation

lmolkova
Copy link
Contributor

@lmolkova lmolkova commented Jun 19, 2024

Fixes #1003

Changes

HTTP status codes are not always sufficient to classify response as an error (or no error).
It depends on the context.

For example:

  • when client checks if resource on the server exists and gets 404 back, it's not an error
  • when server throttles and returns 429 it's neither client nor server error
  • 401 are part of typical authentication flow
  • 418 (I'm a teapot) - ?
  • even codes in 5xx range are controversial. E.g. 501 Not Implemented doesn't necessarily mean a server error.
  • request cancellation on the client side could be an error (timeout) or an optimization if client already got what it needed

HTTP instrumentation libraries don't usually have enough context, but in some cases they do:

  • client library is instrumented and can use the context it has to set proper status
  • instrumentation may provide some way to interact with user application to classify errors in a different way.

Current HTTP semconv REQUIRE to set status to Error under specific conditions, this PR relaxes it and uses SHOULD instead of MUST.

Merge requirement checklist

@lmolkova lmolkova requested review from a team June 19, 2024 19:24
@lmolkova lmolkova requested a review from a team June 19, 2024 19:34
@gregkalapos
Copy link
Member

I expressed my concerns on the issue, adding it here as well for reference.

I think relaxing this part of the spec will open up the door for very different implementations. All the things mentioned in the PR description are valid and I agree the spec could accommodate those things. But none of that information will be part of the spec, instead what the PR does is that it just relaxes the spec without adding any guidelines to the implementations.

I personally would take a different approach and add the specific cases into the spec (from the PR description) and explicitly allow skipping error reporting in those cases. But then I'd still have the existing language with the MUST for the general case, when the HTTP library doesn't have any information regarding the error vs. non-error case.

To take a concrete example: assume there is a genral HTTP library A and another one B - both are general libraries without context about specific errors. Both are instrumented by developers who read the spec. Based on the text proposed in this PR, let's say A decides to never report error for 4xx responses and B decides to always report errors (that's the current expected behaviour). From what I understand, that'd still be compliant to the spec. I fear people will read the spec and they'll jump to different conclusions even for the general case.

Admittedly, based on #1003 it seems I'm in the minority with this opinion. Just wanted to raise it here as well before this gets merged.

@dmathieu
Copy link
Member

Based on the current implementations I have seen, it's often a dedicated HTTP library which creates the spans and marks them as errors.
That library doesn't know what's happening with the request, nor whether a 404 can be expected or not.
In that case, the application would know that.

How about adding a mention inviting libraries to let users define their own error behavior?

@reyang
Copy link
Member

reyang commented Jun 20, 2024

I think relaxing this part of the spec will open up the door for very different implementations. All the things mentioned in the PR description are valid and I agree the spec could accommodate those things. But none of that information will be part of the spec, instead what the PR does is that it just relaxes the spec without adding any guidelines to the implementations.

+1, would this change generate more clarity, or it'll actually make things more confusing?

@lmolkova
Copy link
Contributor Author

How about adding a mention inviting libraries to let users define their own error behavior?

The semantic conventions don't specify how libraries provide configuration or extensibility. With configuration it's being unified, but dynamic enrichment/customization is not (yet). While I want instrumentations to let users customize it, I don't want to encourage it without providing any common mechanism.

@lmolkova
Copy link
Contributor Author

+1, would this change generate more clarity, or it'll actually make things more confusing?

I'm open to specific suggestions, but I don't see how it can be clarified without being confusing. There is still a discussion going on here #1003 (comment)

@lmolkova
Copy link
Contributor Author

lmolkova commented Jun 20, 2024

I added the following note in attempt to provide clarity, LMK if it helps

Note:

The classification of an HTTP status code as an error depends on the context.
For example, a 404 "Not Found" status code indicates an error if the application
expected the resource to be available. However, it is not an error when the
application is simply checking whether the resource exists.

Instrumentations that have additional context about a specific request MAY use
this context to set the span status more precisely.
Instrumentations that don't have any additional context MUST follow the
guidelines in this section.

@lmolkova lmolkova force-pushed the http-error-should branch from 63928c1 to 2d098c4 Compare June 20, 2024 17:37
Copy link
Member

@gregkalapos gregkalapos left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The added note addresses my concerns - thanks for that.

Copy link

github-actions bot commented Jul 7, 2024

This PR was marked stale due to lack of activity. It will be closed in 7 days.

@github-actions github-actions bot added the Stale label Jul 7, 2024
@joaopgrassi joaopgrassi removed the Stale label Jul 8, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

HTTP Client spans - Span Status for 4xx errors - Change from MUST to SHOULD
8 participants