concurrent asyncio tasks and xray are not working with Lambda #310

Closed
abivolmv opened this issue Jul 29, 2021 · 11 comments · Fixed by #340

@abivolmv

abivolmv commented Jul 29, 2021

Hello,
I am opening this issue at the request of @NathanielRN after a discussion with him in #203.

The problem is that when using aioboto3, asyncio (with gather or wait), and X-Ray in Lambda, we either get an exception or no subsegments showing the calls to S3 or SQS.
Here is the last code snippet from that issue:

import asyncio
from io import BytesIO

import aioboto3
from aws_xray_sdk.core import xray_recorder, patch_all
from aws_xray_sdk.core.async_context import AsyncContext

# here I tried adding your solution (not added when solution to use AsyncContext() is used below)
# xray_recorder.configure(service='repro_xray_issue')  # uncomment this for second solution trial

patch_all()


def lambda_handler(a, b):
    asyncio.run(main())


async def main():
    # here I tried adding your solution (not added when solution to remove AsyncContext() is used above)
    xray_recorder.configure(service='repro_xray_issue', context=AsyncContext())    # comment this for second solution trial
    async with xray_recorder.in_segment_async('my_segment_name') as segment:  # comment this for second solution trial
        filelike1 = BytesIO()
        filelike2 = BytesIO()
        res1, res2 = await asyncio.gather(s3_get(filelike1), s3_get(filelike2))  # .wait() doesn't work either


@xray_recorder.capture_async('s3_get')  # tried with this and without also
async def s3_get(filelike):
    async with aioboto3.Session().client('s3') as s3:
        return await s3.download_fileobj('s3-validation-files-003', 'test.txt', filelike)

@willarmiros added the bug label Aug 2, 2021
@willarmiros
Contributor

Hi @abivolmv,

Thanks for creating this separate issue and posting some reproduction code. We have this as an investigation item in our backlog and will follow up when we have bandwidth.

@NathanielRN
Contributor

Adding some notes I took down during my quick investigation of this issue:

It looks like our AsyncSegmentContextManager can __aexit__ before the S3 tasks initiated by asyncio finish. Maybe it isn't scheduled on the event loop correctly, OR it simply runs faster, so the segment closes before the other tasks finish creating and ending their subsegments. A reminder that this bug is intermittent: it sometimes happens and sometimes does not.

I'm referring specifically to these lines of async_recorder.py for the AsyncSegmentContextManager:

async def __aexit__(self, exc_type, exc_val, exc_tb):
    return self.__exit__(exc_type, exc_val, exc_tb)

PEP 492 says that __aexit__ should return an "awaitable", but I am not sure our code is returning just that.

I managed to avoid the "Traceback Exception" by changing the code like this:

class AsyncSegmentContextManager(SegmentContextManager):
    async def __aenter__(self):
        return self.__enter__()

    async def __aexit__(self, exc_type, exc_val, exc_tb):
        async def nested():
            self.__exit__(exc_type, exc_val, exc_tb)

        task = await asyncio.gather(nested())

        return task

The traces show up in X-Ray, but they are sometimes blank with no data in them!

This issue requires investigating how asyncio works. Specifically, we need to confirm either that we "schedule the self.__exit__ task" on the asyncio loop instead of running it immediately (so that the previous async calls have a chance to finish), OR that the task doesn't get pulled off the loop until the earlier tasks that make the calls to S3 run to completion.
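
As a sanity check on the asyncio side, here is a minimal standalone demo (a hypothetical Probe class, no X-Ray involved) of the ordering in question: when the body of an async with block awaits asyncio.gather(), the gathered tasks must complete before __aexit__ runs, so plain task ordering alone should not end the segment early.

import asyncio


class Probe:
    async def __aenter__(self):
        print("enter")
        return self

    async def __aexit__(self, exc_type, exc_val, exc_tb):
        print("exit")
        return False


async def worker(i):
    # simulate an I/O-bound call like the S3 downloads
    await asyncio.sleep(0.01)
    print(f"worker {i} done")


async def main():
    async with Probe():
        await asyncio.gather(worker(1), worker(2))


asyncio.run(main())
# prints: enter, worker 1 done, worker 2 done, exit

If that ordering holds, the intermittent blank traces are more likely related to how the recorder's context and subsegments are shared across the gathered tasks than to __aexit__ itself running too early.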

@abivolmv
Author

abivolmv commented Feb 1, 2022

Hello guys,
I just wanted to spark some interest in this issue, as we have a Lambda timing out sometimes (very rarely) but we can't see why, since X-Ray won't show much due to the aforementioned issue.

@abivolmv
Author

Thanks for the fix, I will try it and see if it works for me now too.

@abivolmv
Author

Oh, I have to wait for a new release on PyPI. Or how can I test it before that? Also, when is the new release going to happen? I see releases are not so frequent.

@NathanielRN
Contributor

Hey @abivolmv, thanks for your patience! Can you see release 2.10.0 now? https://github.com/aws/aws-xray-sdk-python/releases/tag/2.10.0

Let us know if it works for you 🙂

@abivolmv
Author

abivolmv commented Jul 22, 2022

Hi @NathanielRN, I appreciate your and your colleagues' effort on this.
I tried the new version and I can see the segments properly in the X-Ray console when I test locally or in Lambda.
But I notice it creates a new segment that is not tied to the Lambda one.

So for the use case API GW -> Lambda -> S3 I get 2 traces, which defeats the purpose of X-Ray, as I cannot see the start and end of the same workflow in one trace. Maybe I am not integrating the solution correctly? Here is how it looks in Lambda:

import asyncio
from io import BytesIO
import aioboto3
from aws_xray_sdk.core import xray_recorder, patch_all
from aws_xray_sdk.core.async_context import AsyncContext

patch_all()
aio_session = aioboto3.Session()


def lambda_handler(a, b):
    asyncio.run(main())


async def main():
    # here I tried adding your solution (not added when solution to remove AsyncContext() is used above)
    xray_recorder.configure(context=AsyncContext())
    async with xray_recorder.in_segment_async('my_segment_name') as segment:
        filelike1 = BytesIO()
        filelike2 = BytesIO()
        res1, res2 = await asyncio.gather(s3_get(filelike1), s3_get(filelike2))


@xray_recorder.capture_async('s3_get')
async def s3_get(filelike):
    async with aio_session.client('s3') as s3:
        return await s3.download_fileobj('s3-validation-files-003', 'test.txt', filelike)

@NathanielRN
Contributor

Hi @abivolmv, I believe this may be a propagation issue. You have to instrument the Lambda handler to read the incoming X-Ray tracing header that comes from API GW and is used by Lambda; otherwise the X-Ray SDK in your Lambda will create its own trace.

See more:
https://docs.aws.amazon.com/xray/latest/devguide/xray-concepts.html#xray-concepts-tracingheader
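
As a sketch of that suggestion (assuming the function has active tracing enabled so the _X_AMZN_TRACE_ID environment variable is populated, and using the SDK's TraceHeader helper), the handler could read the incoming trace context and pass it to the custom segment so it joins the existing trace instead of starting a new one:

import os

from aws_xray_sdk.core import xray_recorder
from aws_xray_sdk.core.async_context import AsyncContext
from aws_xray_sdk.core.models.trace_header import TraceHeader


async def main():
    # Lambda exposes the incoming tracing header through this environment variable
    header = TraceHeader.from_header_str(os.environ.get('_X_AMZN_TRACE_ID', ''))

    xray_recorder.configure(context=AsyncContext())
    async with xray_recorder.in_segment_async('my_segment_name',
                                              traceid=header.root,
                                              parent_id=header.parent) as segment:
        ...  # gather the aioboto3 calls here, as in the snippets above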

@abivolmv
Author

abivolmv commented Jul 26, 2022

@NathanielRN this propagation happens automatically when I don't use:

xray_recorder.configure(context=AsyncContext())
async with xray_recorder.in_segment_async('my_segment_name') as segment:

Something happens here that detaches the incoming API trace from the custom Lambda trace.
We can see that the trace is reaching the Lambda handler in the other, parallel trace:

[screenshot: the parallel trace showing the Lambda handler in the X-Ray console]

I ended up doing this; though it feels counter-intuitive, it shows one full trace:

trace_id = xray_recorder.get_trace_entity().trace_id
parent_id = xray_recorder.current_segment().id
# enable xray to work with async
# even though Lambda has its own segment, we need to create one async segment
xray_recorder.configure(context=AsyncContext())
async with xray_recorder.in_segment_async('add_brochure', traceid=trace_id, parent_id=parent_id) as segment:
    ...

@NathanielRN
Contributor

Hm, yeah, I'd expect in_segment_async to do that automatically. Would you mind filing a new issue for this so we can keep track of it? Otherwise, I'm happy you have a workaround, and thank you for being willing to share your findings to make this SDK better! 😄

@abivolmv
Author

Unfortunately, the workaround above only works sometimes. I described it in more detail in the new issue.
