-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Document that baggage is sent to external APIs by automatic instrumetation #3530
Conversation
@swar8080 that's new to me that automatic instrumentation is doing that, can you point me to an example where this is the case? |
@svrnm i've tested this with http requests to localhost URLs with java auto-instrumentation. Also, i'm assuming the auto-instrumentation can't distinguish between Here's a sanity test where i've added baggage to a span making a |
hm... I start to understand what you are trying to say: you may (independent of automatic instrumentation or not) expose the baggage to a service outside of your network (I guess you could safe-guard yourself even in your code example by inspecting |
Yep, and with automatic instrumentation this isn't something you opt-in to - Baggage and trace id is sent in all HTTP requests by default. The above sample code is actually from the java auto-instrumentation repo. This is what it looks like with manual instrumentation: https://opentelemetry.io/docs/instrumentation/java/manual/#context-propagation Feel free to change the documentation wording if you think it could be clearer |
Hrm. I'm a little confused about why this clarification is needed? Autoinstrumentation, by design, propagates this information, which can be exported to another backend. I don't think there's anywhere we claim that baggage is a way to propagate internal-only information. Is it the case that you and/or others may expect baggage wouldn't propagate? FWIW I think the changes are harmless, I'm just worried that others may get an impression about how data propagates that isn't true. |
@cartermp At least for me, the risk wasn't immediately obvious. Only when writing integration tests to verify baggage propagation to an internal service did I realize that third-party APIs would get the data as well. It's implied with the current wording but seems too subtle given the consequences of leaking sensitive data:
That assumes the reader understands context, propagation, and that auto-instrumentation will auto-propagate context.
Is this a concern about the wording or our understanding of how data actually propagates? |
This may be true with your tracing data in general not only baggage, i.e. your service instrumented with OpenTelemetry may send trace context to a service outside your boundaries: open-telemetry/opentelemetry-specification#1633
See my comment above, this is not only true for baggage, so this may be something worth commenting on but the place here is complicated, for @cartermp's worry that others may get an impression about how data propagates that isn't true: a big misconception about baggage is that it will be automatically set as span attributes and send to a backend, we try to clarify this in the document, but this is an issue that persists. The wording "and may be sent to external APIs by automatic instrumentation" will make people think "ah, the data is send to MY external API = observability backend" So, to summarize: I am not against that change, I think we should phrase it differently, or even rethink that whole paragraph:
|
@svrnm got it, so I tried changing the wording so it's clearer that baggage could be sent to unintended resources like third-party APIs. Also, being specific about baggage as an HTTP header hopefully help distinguish it from span attributes |
@swar8080 thanks for the updates. I think this is pretty good. If you comment |
/fix:format |
that didn't work because |
With the current wording, it might not be clear that baggage is sent to external APIs. Calling this out directly seems worth the extra words because of the risk of leaking sensitive data