Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ignore context cancel error in otelgrpc #2644

Open
mostafafarzaneh opened this issue Aug 10, 2022 · 2 comments
Open

Ignore context cancel error in otelgrpc #2644

mostafafarzaneh opened this issue Aug 10, 2022 · 2 comments
Labels
instrumentation: otelgrpc question Further information is requested

Comments

@mostafafarzaneh
Copy link

We usually close gRPC streams by canceling the context. The problem is the otelgrpc fill the error part of the span(context canceled) and the backend(X-Ray in my case) renders it as an error while it is just a simple close.
One may argue that you should close the stream gracefully by sending CloseSend to prevent that. While it is true, it complicates the closing process and I think it is much easier to cancel the context to close the stream.

Is there any way to ignore the context cancel error being reported?

@tommie
Copy link

tommie commented Aug 12, 2022

Re CloseSend, it will make the client code much more complicated, since any layer above the gRPC may cancel the request used for the call. To protect against that, you'd have to invent a new context just for the gRPC, and then "proxy" the cancelation into a CloseSend. You'd also lose all information in that context chain, e.g. loggers and authentication details. I don't think this is viable.

I'm in the same position, where I'd like to ignore [codes](https://pkg.go.dev/google.golang.org/[email protected]/codes).Canceled. However, I think it should only be ignored if the client's context is also canceled (errors.Is(ctx.Err(), context.Canceled)). There may have been something canceled on the server side that caused propagation to the client, and that is useful to differentiate.

Likewise, the server interceptor should ignore codes.Canceled and context.Canceled if the server context is canceled.

Since this logic is not trivial, I'd suggest making it optional to avoid hiding exceptional cases.

In the rest of my code, whenever I see a cancellation, I set a canceled=true attribute, and the span's status is codes.Ok. I'd like something similar for otelgrpc.

tommie added a commit to tommie/opentelemetry-go-contrib that referenced this issue Aug 12, 2022
With this enabled, context.Canceled and gprc_codes.Canceled are
considered OK. A separate attribute is set to true if a cancellation
was detected.

To avoid conflating situations where some downstream server component
issued a cancel of a child context, and that is now propagated over
gRPC, we first check if the parent context is canceled.

It could be argued that status.Convert/FromError should use
errors.Unwrap, since the cancelation that is being returned to a
server interceptor was received from a downstream gRPC client. For
the simplicity of this commit, this is left unchanged from before.

Closes open-telemetry#2644.
@MrAlias MrAlias added question Further information is requested instrumentation: otelgrpc labels Nov 2, 2022
@jmacd
Copy link
Contributor

jmacd commented Jul 19, 2024

This is still a real issue. It is very difficult to arrange a gRPC stream without making use of cancelation. The cancelation support works fine, but the resulting spans should not register as errors.

Related to open-telemetry/otel-arrow#227.

codeboten pushed a commit to open-telemetry/opentelemetry-collector-contrib that referenced this issue Jul 22, 2024
…34175)

**Description:** We have been using the OTel-Arrow components in
production and begun to monitor the component's health using span data.
There are spans coming from otelgrpc instrumentation and there are
explicit spans in both components.

This adds a new test that exercises multiple streams between the
exporter and receiver components. At the end of the test, it checks for
no span errors, with one exception discussed in
open-telemetry/opentelemetry-go-contrib#2644.
We will not modify the logic of these components to avoid stream
cancelation, we will lobby for stream cancelation not to register as
span errors.

The receiver is modified to avoid setting Span errors for EOF and
Cancelation following its existing logic for setting log severity for
the same condition. As a result, the `otel_arrow_stream_inflight` span
will not show errors for the final `Recv()` as a stream shuts down.

**Link to tracking Issue:** Part of
open-telemetry/otel-arrow#227.

**Testing:** A new test is added.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
instrumentation: otelgrpc question Further information is requested
Projects
None yet
Development

No branches or pull requests

4 participants