AWS Lambda spec inappropriately prioritizes x-ray propagation #3060
Comments
This is a non-starter from my perspective as it inappropriately conflates the propagator interface (extract context from/inject context to a carrier) with the instrumentation scope of responsibility (determine what to use as the carrier for a propagator and what to do with the result). The choice to prefer an environment variable over the provided carrier is only relevant to this particular instrumentation and the general-purpose X-Ray propagator should not be burdened with knowledge of its existence.
I think these are all variations on the same theme and overindex on the propagator as the locus of the problem. In addition to suffering from the above-mentioned interface conflation, each brings additional complexity and adds potential for confusion in selecting which propagator to use in which circumstances. This is not something, I don't think, that needs to be made the user's responsibility.
I worry that this may lead to confusion for users if it is not explicitly opt-in. It could result in the context being extracted from the environment when the user expected it to be extracted from the event. I think a minor tweak to this would result in the
All of the options discussed to this point have looked at extracting a single trace context to use as the parent context for new spans created by the instrumentation. An alternative we should consider is giving an option to use the trace context extracted from the Lambda execution environment 1) as the parent context for new spans, 2) as a link added to new spans with a parent context extracted from the incoming event, or 3) not at all.
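A minimal sketch of the three choices above, in plain Go with no OpenTelemetry dependency. `EnvContextMode`, `TraceRef`, `SpanStart`, and `ResolveSpanStart` are hypothetical names used only for illustration; they are not part of any SDK or of the spec.

```go
package main

// EnvContextMode is a hypothetical setting that says what to do with the
// trace context found in the Lambda execution environment.
type EnvContextMode int

const (
	EnvAsParent EnvContextMode = iota // 1) use it as the parent of new spans
	EnvAsLink                         // 2) add it as a link; the parent comes from the event
	EnvIgnored                        // 3) do not use it at all
)

// TraceRef is a simplified stand-in for a span context.
type TraceRef struct {
	TraceID string
	SpanID  string
}

// Valid reports whether the reference identifies a span.
func (r TraceRef) Valid() bool { return r.TraceID != "" && r.SpanID != "" }

// SpanStart describes the parent and links chosen for a new span.
type SpanStart struct {
	Parent TraceRef
	Links  []TraceRef
}

// ResolveSpanStart applies the mode to the two extracted contexts.
// The event context is the parent unless the mode says otherwise.
func ResolveSpanStart(mode EnvContextMode, fromEvent, fromEnv TraceRef) SpanStart {
	start := SpanStart{Parent: fromEvent}
	if !fromEnv.Valid() {
		return start
	}
	switch mode {
	case EnvAsParent:
		start.Parent = fromEnv
	case EnvAsLink:
		start.Links = append(start.Links, fromEnv)
	case EnvIgnored:
		// Leave the event context as the only input.
	}
	return start
}
```

With a setting like this the choice is explicit and opt-in, so enabling tracing in the environment would not silently change which context becomes the parent.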
I don't understand how a propagator that returns a composite parent context would work with the existing API design. Perhaps you could elaborate on how you expect this to function?
I still think doing this inside the propagator system is the best option. The only other option not listed above is to formalize what some languages have already done and add a setting to enable/disable prioritizing the environment variable, but I don't love that and if we did add it I would say the default should be disabled.
There is no such thing as a "composite parent context". Trace contexts are scalar values. Which part of my response did you feel was suggesting a composite context? I may have been unclear with referents.
That is correct, and that's because this isn't a propagator issue but an instrumentation issue where the instrumentation needs to know how to provide an appropriate carrier to the configured propagator such that the desired behavior is achieved. The
Imagine, for instance, an
```go
// eventToCarrier builds a carrier from the event and the execution environment.
func eventToCarrier(eventJSON []byte) propagation.TextMapCarrier {
	// propagation.MapCarrier is a map[string]string that implements
	// TextMapCarrier, so this straw man assumes a flat event of string values.
	res := propagation.MapCarrier{}
	if err := json.Unmarshal(eventJSON, &res); err != nil {
		return propagation.MapCarrier{}
	}
	if xrayctx, ok := os.LookupEnv("_X_AMZN_TRACE_ID"); ok {
		res["_X_AMZN_TRACE_ID"] = xrayctx
	}
	return res
}
```
If I use this with the X-Ray propagator then it will discover the
More interesting would be a composite that could augment any
```go
// CompositeEventToCarrier wraps any EventToCarrier and adds the X-Ray
// context from the execution environment to the carrier it returns.
func CompositeEventToCarrier(base EventToCarrier) EventToCarrier {
	return func(eventJSON []byte) propagation.TextMapCarrier {
		res := base(eventJSON)
		if xrayctx, ok := os.LookupEnv("_X_AMZN_TRACE_ID"); ok {
			res.Set("_X_AMZN_TRACE_ID", xrayctx)
		}
		return res
	}
}
```
Then, as a user with a custom event structure that I have my own
```go
handler := otellambda.Wrap(origHandler, otellambda.WithEventToCarrier(otellambda.CompositeEventToCarrier(myE2C)))
```
I think you think that the "propagator system" is more than it actually is and we're talking past each other. The propagators are, at heart, nothing more than a pair of pure functions:
Propagators are able to function effectively across context encoding types and carrier types because they are so simple and depend on simple interfaces. Anything more complicated, such as "which carrier should be given to a propagator", is a concern for the instrumentation. HTTP instrumentation can know to make a carrier from the HTTP headers. Kafka instrumentation can know to make a carrier from Kafka message headers. gRPC instrumentation can know to create a carrier from message metadata. And the Lambda instrumentation can know to create a carrier from the Lambda execution environment and incoming event. Doing so will allow the instrumentation to function effectively without regard to which propagators are in use.
Where the current spec goes wrong is in requiring that the X-Ray propagator be used if the
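To make the split of responsibilities in this comment concrete, here is a simplified sketch in plain Go. It is not the OpenTelemetry API: `Carrier`, `Propagator`, `KeyPropagator`, `MessageHeader`, and the two `CarrierFrom...` helpers are hypothetical stand-ins, and the trace context is modeled as an opaque string. The propagator only reads and writes keys; each instrumentation decides how to build the carrier.

```go
package main

import "strings"

// Carrier is a simplified stand-in for a text map carrier.
type Carrier interface {
	Get(key string) string
	Set(key, value string)
}

// MapCarrier is the simplest possible Carrier.
type MapCarrier map[string]string

func (c MapCarrier) Get(key string) string { return c[key] }
func (c MapCarrier) Set(key, value string) { c[key] = value }

// Propagator is the pair of functions: read a trace context out of a
// carrier, or write one into it. It knows nothing about where the
// carrier came from.
type Propagator interface {
	Extract(c Carrier) (traceContext string, ok bool)
	Inject(traceContext string, c Carrier)
}

// KeyPropagator is a toy propagator that treats the raw value stored
// under one key as the encoded trace context.
type KeyPropagator struct{ Key string }

var _ Propagator = KeyPropagator{}

func (p KeyPropagator) Extract(c Carrier) (string, bool) {
	v := c.Get(p.Key)
	return v, v != ""
}

func (p KeyPropagator) Inject(traceContext string, c Carrier) {
	c.Set(p.Key, traceContext)
}

// CarrierFromHeaders is the kind of adapter HTTP-style instrumentation
// owns: multi-valued headers with case-insensitive names.
func CarrierFromHeaders(headers map[string][]string) Carrier {
	c := MapCarrier{}
	for name, values := range headers {
		if len(values) > 0 {
			c[strings.ToLower(name)] = values[0]
		}
	}
	return c
}

// MessageHeader models a messaging-style header with a byte-slice value.
type MessageHeader struct {
	Key   string
	Value []byte
}

// CarrierFromMessage is the kind of adapter messaging instrumentation owns.
func CarrierFromMessage(headers []MessageHeader) Carrier {
	c := MapCarrier{}
	for _, h := range headers {
		c[strings.ToLower(h.Key)] = string(h.Value)
	}
	return c
}
```

The same `KeyPropagator` value works unchanged against both adapters, which is the property the comment relies on: carrier construction can vary per instrumentation without the propagator knowing.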
I don't necessarily agree with that, but could see value in an environment variable to disable X-Ray propagation even if
An additional option, something I've been mulling a bit with respect to non-lambda issues similar to this, is a configuration option to say that the Trace from X-Ray should be used as a Link instead of a parent.
@Aneurysm9 sorry, instead of saying In the Java API, the extract method looks like this:
The instrumentation chooses which carrier to provide (for the case of lambda instrumentation, it's the Instead of (One issue I noticed in your proposal for
@tsloughter I like that idea but I would suggest it should always be the case.
This will be one of the first items on the agenda for the SIG, then. I think it is important to provide customization opportunities at that interface.
And this is why it is important to have an
I'm not sure a custom getter implementation is the answer. As the getter/setter are optional and not implemented in all languages, it wouldn't provide a consistent solution. I also don't think it necessarily helps solve the custom event interaction use case. There would still need to be something that can underlie that getter and get fields from the incoming event, which starts to look a whole lot like
The example code I provided above is a straw man for discussion, not a complete implementation. If we were to take this route then the
This issue raised two concerns in my mind.
First, I want to say that the concerns on propagators that @Aneurysm9 raises are similar to discussions we've been having around the OpenCensus binary propagation format and gRPC instrumentation. Specifically, the current specification around propagators (in OpenCensus) for binary propagation is a bit awkward. There never seems to be a case where you'd use a different binary format instead of the OC binary format. In this case it seems the instrumentation (gRPC) is tied to the propagation format. However, the key here is that we want users to be able to prioritize which propagation format they want via configuration. The fact that the semconv tries to undo this (vs. providing a default) may, I feel, be an overextension of the semconv specification. Specifically, should the existence of X-Ray tracing always overrule other propagation? I'm OK if that's the default, but I'd be concerned if there were no ability to configure it differently.
My second concern here is around the propagation specification itself. As @Aneurysm9 says, we conflate "how to fill out key-value pairs" with the prioritization of which instrumentation to attach to. However, unlike the conclusion in this thread, I actually think the ability for users to prioritize (even instrumentation-specific propagation formats) is important. I think we should do something here, possibly with the Propagation specification going forward. I see this as necessary for gRPC + OpenCensus binary propagation in addition to what I'm seeing in AWS X-Ray, both of which don't fit the current TextMapPropagator model.
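One way to picture user-controlled prioritization is an ordered list where the first source that yields a context wins. This is only a sketch of that idea in plain Go; `Extractor`, `keyExtractor`, and `FirstMatch` are hypothetical names, the carrier keys are illustrative, and nothing here reflects how any existing composite propagator behaves.

```go
package main

// Extractor is a hypothetical named source of trace context.
type Extractor struct {
	Name    string
	Extract func(carrier map[string]string) (string, bool)
}

// keyExtractor builds an Extractor that reads a single carrier key.
func keyExtractor(name, key string) Extractor {
	return Extractor{
		Name: name,
		Extract: func(carrier map[string]string) (string, bool) {
			v, ok := carrier[key]
			return v, ok && v != ""
		},
	}
}

// FirstMatch walks the extractors in the order the user configured and
// returns the first context found, along with the name of its source.
func FirstMatch(order []Extractor, carrier map[string]string) (name, traceContext string, ok bool) {
	for _, e := range order {
		if v, found := e.Extract(carrier); found {
			return e.Name, v, true
		}
	}
	return "", "", false
}
```

Under this model, "X-Ray first" is just one ordering a default could pick, and a user who wants the event context to win reorders the list instead of working around the instrumentation.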
We discussed this in the FAAS SIG meeting today and came to the following conclusion:
Because there has been confusion about my final comment above, I want to point out that what was implemented in the spec in #3166 is different. That is not reflected in this issue because the SIG decided to move forward with span links at the Jan 31 meeting (but failed to note it here), which is why #3166 focused on span links instead.
What are you trying to achieve?
Propagation should follow the configuration provided to OpenTelemetry, not be dynamically influenced by external state.
Currently, the spec states that instrumentation should first evaluate X-Ray propagation from the `_X_AMZN_TRACE_ID` environment variable before propagating the context from the Lambda event.
This means that propagation could be working fine as configured, but if someone enables X-Ray tracing, the resulting spans will be broken into separate traces (inconsistently, depending on the X-Ray sampling rate).
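The ordering described above can be modeled in a few lines of plain Go. This is a sketch under assumptions, not the spec text: it assumes the environment context only takes priority when it is marked as sampled (one reading of "inconsistently, depending on the X-Ray sampling rate"), and `ParentSource`, `xraySampled`, and `ChooseParent` are hypothetical names.

```go
package main

import "strings"

// xraySampled reports whether an X-Ray trace header value contains the
// field Sampled=1. Fields in the header are separated by semicolons.
func xraySampled(header string) bool {
	for _, part := range strings.Split(header, ";") {
		if strings.TrimSpace(part) == "Sampled=1" {
			return true
		}
	}
	return false
}

// ParentSource names where the parent context was taken from.
type ParentSource string

const (
	FromEnvironment ParentSource = "environment"
	FromEvent       ParentSource = "event"
	NoParent        ParentSource = "none"
)

// ChooseParent models the precedence the issue objects to: the
// environment variable is consulted first, the event only second.
// Assumption: the environment context wins only when it is sampled.
func ChooseParent(envHeader string, eventHasContext bool) ParentSource {
	if envHeader != "" && xraySampled(envHeader) {
		return FromEnvironment
	}
	if eventHasContext {
		return FromEvent
	}
	return NoParent
}
```

In this model, two invocations that receive the same event context can end up with different parents purely because of the environment's sampling decision, which is how one logical trace gets split.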
Options to resolve this:
I'm sure there are other ideas and look forward to discussing in the SIG.
Additional context.
https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/trace/semantic_conventions/instrumentation/aws-lambda.md#determining-the-parent-of-a-span