Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Lambda exec wrapper calls upstream OTel Python auto instr script #164

Conversation

NathanielRN
Copy link
Contributor

Description

@wangzlei and I worked together to figure out how to mostly auto-instrument Lambda functions using the upstream opentelemetry-instrument auto-instrumentation package.

In doing so, we found it best to re-write otel-instrument as a bash script.

We made two major updates

Update the PYTHONPATH in otel-instrument so we can call opentelemtry-instrument

The auto-instrumentation package counts on 2 things:

  • the opentelemetry package
  • locating all the python packages the user wants to instrument

Both these 2 things require us modifying the environment variable PYTHONPATH right away in the otel-instrument script. AWS Lambda will add the correct paths but it does it too late. (It only does it once it calls it's originally intended entry point of python3 /var/runtime/bootstrap.py).

Update the otel_wrapper.py to only call the AwsLambdaInstrumentor

All of the instrumentation is done in sitecustomize.py by opentelemetry-instrument. However, the way the Lambda Handler is imported in the AWS Lambda bootstrap.py file which is run after sitecustomize.py CLEARS any instrumentation done on lambda_function.lambda_handler. That's why we keep otel_wrapper.py around. So that bootstrap.py can call it, do any destructive imports it needs to do, and then call AwsLambdaInstrumentor.instrument() explicitly. This way we know the Lambda function is instrumented, we can import, and give bootstrap.py the Lambda Handler that we know is instrumented.

Future Work

We would ideally like to modify AWS Lambda's boostrap.py or investigate how we can get rid of otel_wrapper.py and have all the instrumentation be finished in sitecustomize.py such that bootstrap.py can just import the normal lambda handler at the _HANDLER environment variable without destroying the import at all.

Fixes #152

@NathanielRN NathanielRN force-pushed the exec-script-calls-upstream-auto-instrument branch 2 times, most recently from 4e5fa9a to 5916901 Compare October 25, 2021 22:02
# script can find them (it needs to find `opentelemetry` to find the auto
# instrumentation `run()` method later)

if [ -z ${PYTHONPATH} ]; then
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For any path type variabe it's conventional to not check for presence, empty strings are fine (they should be made fine for the resource attributes variable in the python SDK separately too at some point)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay sounds good! Yeah I didn't know if user would dislike us modifying a variable they already set, but then again they're probably using this script BECAUSE they want us to handle all these little settings 😛

Changed it to this!

export PYTHONPATH="$LAMBDA_LAYER_PKGS_DIR:$PYTHONPATH";

And I guess you mean in the future someone can do this:

export OTEL_RESOURCE_ATTRIBUTES="$LAMBDA_RESOURCE_ATTRIBUTES,$OTEL_RESOURCE_ATTRIBUTES";

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks again for the prompt!

I looked into this and am fairly certain OTel Python SDK currently does handle empty Resource Attributes.

if not (key and isinstance(key, str)):
    _logger.warning("invalid key `%s`. must be non-empty string.", key)
    return None

They even have a test to protect against this:

def test_invalid_resource_attribute_values(self):
    resource = resources.Resource(
        {
            resources.SERVICE_NAME: "test",
            "non-primitive-data-type": {},
            "invalid-byte-type-attribute": b"\xd8\xe1\xb7\xeb\xa8\xe5 \xd2\xb7\xe1",
            "": "empty-key-value",
            None: "null-key-value",
            "another-non-primitive": uuid.uuid4(),
        }
    )
    self.assertEqual(
        resource.attributes,
        {
            resources.SERVICE_NAME: "test",
        },
    )
    self.assertEqual(len(resource.attributes), 1)

So I'll simplify this even further!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I actually was wrong about this... I've made a PR to fix this upstream: open-telemetry/opentelemetry-python#2256

And will follow up with a fix #173

python/src/otel/otel_sdk/otel-instrument Outdated Show resolved Hide resolved
python/src/otel/otel_sdk/otel-instrument Outdated Show resolved Hide resolved
python/src/otel/otel_sdk/otel_wrapper.py Show resolved Hide resolved
python/src/otel/otel_sdk/otel_wrapper.py Show resolved Hide resolved
@NathanielRN NathanielRN force-pushed the exec-script-calls-upstream-auto-instrument branch from 3973c17 to a3badc1 Compare October 26, 2021 22:50
@NathanielRN NathanielRN force-pushed the exec-script-calls-upstream-auto-instrument branch from a3badc1 to f5b9cdf Compare October 26, 2021 22:52
@NathanielRN NathanielRN marked this pull request as ready for review October 26, 2021 22:56
@NathanielRN NathanielRN requested a review from anuraaga October 26, 2021 22:56
python/src/otel/tests/test_otel.py Outdated Show resolved Hide resolved
@NathanielRN NathanielRN force-pushed the exec-script-calls-upstream-auto-instrument branch from de19b2a to a897770 Compare October 27, 2021 14:28
Comment on lines +93 to +98
# NOTE: Because we run as a subprocess, the python packages are NOT patched
# with instrumentation. In this test we just make sure we can complete auto
# instrumentation without error and the correct environment variabels are
# set. A future improvement might have us run `opentelemetry-instrument` in
# this process to imitate `otel-instrument`, but our lambda handler does not
# call other instrumented libraries so we have no use for it for now.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just want to flag this comment, I tested it and the subprocess does not patch the libraries in the parent process as I initially assumed 😞 I think we should merge as is, and the only suggestion I can think of is to call opentelemetry-instrument ourselves in this process if we really want the packages to be instrumented.

Either way the check=True below will help us make sure we can run both otel-instrument and opentelemetry-instrument through to completion without any errors.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's weird about this is that the TracerProvider must be getting initialized by the child opentelemetry-instrument call, but when I make a call to botocore for example I should be seeing more spans and yet I do not...

Copy link
Contributor Author

@NathanielRN NathanielRN Oct 27, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually I understand why now. We use the from opentelemetry.test.test_base import TestBase which initializes the TracerProvider at the setupClass stage.

We can fix this in another PR. See Issue #168.

Copy link
Contributor

@wangzlei wangzlei left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Use opentelemetry-instrument script to invoke python app
3 participants