-
Notifications
You must be signed in to change notification settings - Fork 589
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
End to end tracing support - Lambda #983
End to end tracing support - Lambda #983
Conversation
Co-authored-by: garrettwegan <[email protected]>
@Aneurysm9 @MrAlias Please take a look whenever you get a chance. This PR is derived from #882 |
tp := sdktrace.NewTracerProvider( | ||
sdktrace.WithResource(res), | ||
) | ||
lambda.Start(<some lambda handler>) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does the example need either otel.SetTracerProvider(tp)
or passing tp
explicitly into the Lambda handler? Otherwise the variable is unused.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch, made a suggestion
|`faas.name` | MyLambdaFunction | ||
|`faas.version` | $LATEST | ||
|
||
Of note, `faas.id` and `cloud.account.id` are not set by the Lambda resource detector because they are not available outside a Lambda invocation. For this reason, when using the AWS Lambda Instrumentation these attributes are set as additional span attributes. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could this be solved through lazy initialization?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not without asking the user to add more code. The faas.id
and cloud.account.id
attributes are only available inside of the handler, and our solution is meant to be strictly a wrapper of the handler such that we can't actually do anything inside the wrapper. So we could ask the user to lazy-initialize the tracer provider in their handler, but that's a much worse experience.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why would the user have to do that lazy-initialization, rather than the framework? E.g., at this point it seems like you'd have access to those invocation-scoped variables. It's true that you'd have to do separately for each possible reflective type of the handler function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the problem is that we will then couple the Lambda resource detector and the Lambda instrumentation, which ought to be two separate components. The Lambda instrumentation does indeed have access to those invocation-scoped attributes, and is able to initialize the tracer provider. But the Lambda resource detector must operate only with what is available in the static environment, so if a user wants to initialize a custom tracer provider outside their handler as seen in the example it will work the same.
I agree that it's unfortunate the lambda resource detector doesn't capture these attributes, but I believe this is a limitation of Lambda itself more than any individual instrumentation, which is why the span attribute is called out in the spec and there's talks of not even marking faas.id as a resource attribute at all.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for explaining. I was commenting based on "what would an operator find most useful" vs "what is a clean way to factor the code." Given the ambiguity / ongoing discussion around faas.id, I think this code makes a reasonable choice.
|`faas.name` | MyLambdaFunction | ||
|`faas.version` | $LATEST | ||
|
||
Of note, `faas.id` and `cloud.account.id` are not set by the Lambda resource detector because they are not available outside a Lambda invocation. For this reason, when using the AWS Lambda Instrumentation these attributes are set as additional span attributes. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not without asking the user to add more code. The faas.id
and cloud.account.id
attributes are only available inside of the handler, and our solution is meant to be strictly a wrapper of the handler such that we can't actually do anything inside the wrapper. So we could ask the user to lazy-initialize the tracer provider in their handler, but that's a much worse experience.
tp := sdktrace.NewTracerProvider( | ||
sdktrace.WithResource(res), | ||
) | ||
lambda.Start(<some lambda handler>) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lambda.Start(<some lambda handler>) | |
otel.SetTracerProvider(tp) | |
lambda.Start(myHandler) |
tp := sdktrace.NewTracerProvider( | ||
sdktrace.WithResource(res), | ||
) | ||
lambda.Start(<some lambda handler>) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch, made a suggestion
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I realize this PR has been needing reviews for some time, it might be possible to get the code here merged quicker if we break up the PR.
go.opentelemetry.io/contrib/detectors/aws/lambda v0.22.0 | ||
go.opentelemetry.io/contrib/instrumentation/github.com/aws/aws-lambda-go/otellambda v0.22.0 | ||
go.opentelemetry.io/contrib/propagators/aws v0.22.0 | ||
go.opentelemetry.io/otel v1.0.0-RC2 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should these be updated to 1.0.0 or 1.0.1?
github.com/aws/aws-lambda-go v1.24.0 | ||
github.com/stretchr/testify v1.7.0 | ||
go.opentelemetry.io/contrib v0.22.0 | ||
go.opentelemetry.io/otel v1.0.0-RC2 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should these be updated to 1.0.0 or 1.0.1?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like it still needs to be updated?
detectors/aws/lambda/go.mod
Outdated
require ( | ||
github.com/davecgh/go-spew v1.1.1 // indirect | ||
github.com/stretchr/testify v1.7.0 | ||
go.opentelemetry.io/otel v1.0.0-RC2 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please update the versions
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@bhautikpip if you have a minute would you be able to update these versions to stable and rebase the PR so conflicts/CI checks are resolved?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@codeboten @willarmiros updated the version and resolved the conflicts. It should pass the CI checks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
I agree and wish it were made in a more modular fashion, but as it stands there are only 2 components in this PR - the Lambda instrumentation and the Lambda resource detector. We could separate them but the detector is only 5 files, and so I'm not sure a 25 file vs. 30 file PR would be more digestible & worth the effort :( Open to other suggestions though if I'm not realizing something. |
@dashpole can you please review and approve. We are blocked on this PR from being merged due to inadequate number of approver reviews. 🙏 |
| Options | Input Type | Description | Default | | ||
| --- | --- | --- | --- | | ||
| `WithTracerProvider` | `trace.TracerProvider` | Provide a custom `TracerProvider` for creating spans. Consider using the [AWS Lambda Resource Detector][lambda-detector-url] with your tracer provider to improve tracing information. | `otel.GetTracerProvider()` | ||
| `WithFlusher` | `otellambda.Flusher` | This instrumentation will call the `ForceFlush` method of its `Flusher` at the end of each invocation. Should you be using asynchronous logic (such as `sddktrace's BatchSpanProcessor`) it is very import for spans to be `ForceFlush`'ed before [Lambda freezes](https://docs.aws.amazon.com/lambda/latest/dg/runtimes-context.html) to avoid data delays. | `Flusher` with noop `ForceFlush` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would anyone ever not want to flush at the end of an invocation?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IIRC we didn't want to force Lambda customers using async patterns to incur additional execution time from this instrumentation, but I'm not 100% sure what the reasoning behind making the default no-op was. Perhaps @Aneurysm9 remembers better.
|
||
// wrap lambda handler function | ||
func main() { | ||
lambda.Start(otellambda.WrapHandlerFunction(HandleRequest)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps InstrumentHandler()
is a better description of what this does? Or NewHandler()
would match otelhttp's naming.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure about NewHandler
since we're not instantiating a new function handler, we are indeed just wrapping it. I think InstrumentHandler()
makes sense but would like to have @bhautikpip weigh in.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yep InstrumentHandler()
looks better to me.
// init aws config | ||
cfg, err := awsConfig.LoadDefaultConfig(ctx) | ||
if err != nil { | ||
panic("configuration error, " + err.Error()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we return this error instead of panicing?
input := &s3.ListBucketsInput{} | ||
result, err := s3Client.ListBuckets(ctx, input) | ||
if err != nil { | ||
fmt.Printf("Got an error retrieving buckets, %v", err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we return the error here?
fmt.Printf("Got an error retrieving buckets, %v", err) | ||
} | ||
|
||
fmt.Println("Buckets:") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Probably better to use log.Info
, rather than fmt for logging. Here and elsewhere in this file.
} | ||
req, err := http.NewRequestWithContext(ctx, http.MethodGet, "https://api.github.com/repos/open-telemetry/opentelemetry-go/releases/latest", nil) | ||
if err != nil { | ||
fmt.Printf("failed to create http request, %v\n", err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Generally, the function handling the error should log, rather than logging and returning the error. Same applies below.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you suggesting to remove error handling? Sorry did not understand your comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm saying you (probably?) don't need the print statement, since you are returning the error.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah that's fair but I think we can keep this as it probably might be useful for debugging or kind of helps understand better where the error is happening.
detector := lambdadetector.NewResourceDetector() | ||
res, err := detector.Detect(ctx) | ||
if err != nil { | ||
fmt.Printf("failed to detect lambda resources: %v\n", err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should we return (ideally with log.Fatalf) here?
github.com/aws/aws-lambda-go v1.24.0 | ||
github.com/stretchr/testify v1.7.0 | ||
go.opentelemetry.io/contrib v0.22.0 | ||
go.opentelemetry.io/otel v1.0.0-RC2 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like it still needs to be updated?
ctx = i.configuration.Propagator.Extract(ctx, mc) | ||
|
||
var span trace.Span | ||
spanName := os.Getenv("AWS_LAMBDA_FUNCTION_NAME") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should this be configurable through config options, rather than env vars? Or is this guaranteed to be set in lambda?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's guaranteed to be set: https://docs.aws.amazon.com/lambda/latest/dg/configuration-envvars.html#configuration-envvars-runtime
…y-go-contrib into tracing-support-lambda
…y-go-contrib into tracing-support-lambda
@dashpole addressed your comments! Feel free to take a look again. |
@MrAlias @Aneurysm9 now that this is approved any chance we can get it merged? |
See for example use case: https://github.com/garrettwegan/opentelemetry-lambda/blob/main/go/sample-apps/function/function.go