
AI tracking does not work properly in Python's asynchronous generator scenarios. #3823

Closed
uraurora opened this issue Nov 25, 2024 · 7 comments

@uraurora

Environment

SaaS (https://sentry.io/)

Steps to Reproduce

  1. I have an HTTP service that returns a streaming response (SSE) from an upstream interface; my local endpoint mainly relays the data and reports its token consumption.
  2. Locally, I use Python FastAPI and a Python asynchronous generator to yield each event.
  3. Inside the asynchronous generator I created a span with sentry_sdk.start_span(op="ai.chat_completions.create.xxx", name="xxx") as span, and decorated the function with ai_track. I'm not sure whether the op value is set correctly.

Expected Result

I expect LLM Monitoring to work for the streaming API as well, but it seems that only the non-streaming API shows up.

Actual Result

The streaming API does not show anything in LLM Monitoring. I'm not sure whether there's an issue with my configuration or whether this method of invocation is not currently supported.

Product Area

Insights

Link

https://moflow.sentry.io/insights/ai/llm-monitoring/?project=4508239351447552&statsPeriod=24h

DSN

No response

Version

2.19.0

@szokeasaurusrex
Member

szokeasaurusrex commented Nov 26, 2024

Hi @uraurora, thank you for opening this issue.

I am having trouble understanding what you are trying to do, and what the problem is. Could you please provide specific steps on how to reproduce the problem? If possible, please provide a code snippet that we can run, so that we can see what you are trying to do.

@uraurora
Author

uraurora commented Nov 27, 2024

> Hi @uraurora, thank you for opening this issue.
>
> I am having trouble understanding what you are trying to do, and what the problem is. Could you please provide specific steps on how to reproduce the problem? If possible, please provide a code snippet that we can run, so that we can see what you are trying to do.

Hi. In simple terms, I use FastAPI as the backend and want to record token consumption for an LLM interface with streaming responses. After calling the interface, nothing related shows up in the Sentry dashboard (Insights > AI > LLM Monitoring). The code is as follows:

import sentry_sdk
from sentry_sdk.ai.monitoring import ai_track, record_token_usage
from fastapi import APIRouter, status
from sse_starlette.sse import EventSourceResponse

router = APIRouter()


@ai_track("sentry-ai-track-test-pipeline")
async def stream():
    # assume this is an LLM stream call
    with sentry_sdk.start_span(op="ai.chat_completions.create.xxx", name="sentry-ai-track-test") as span:
        token = 0
        for i in range(10):
            token += 1
            yield f"{i}"

        # report the accumulated token count on the span
        record_token_usage(span, total_tokens=token)


@router.post(
    "/xxx/xxx",
    response_class=EventSourceResponse,
    status_code=status.HTTP_200_OK,
)
async def sse_api() -> EventSourceResponse:
    return EventSourceResponse(stream())

[screenshot]

@antonpirker
Member

Can you link us a transaction that is in the "Performance" tab on Sentry.io that contains the spans ai.chat_completions.create.* that you are creating?

In general the spans you create must contain the data described here: https://develop.sentry.dev/sdk/telemetry/traces/modules/llm-monitoring/

If we have a link to a transaction, we can see if the spans in this transaction have the correct format.
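For illustration, a minimal sketch of a span carrying the data those docs describe (the openai op suffix, the pipeline name, and the model id below are assumptions for the sketch, not values from this issue):

import sentry_sdk
from sentry_sdk.ai.monitoring import record_token_usage

with sentry_sdk.start_span(op="ai.chat_completions.create.openai", name="OpenAI Chat Completion") as span:
    span.set_data("ai.pipeline.name", "my-pipeline")  # must match the name passed to ai_track
    span.set_data("ai.model_id", "gpt-4o-mini")       # model id, used for cost calculation
    span.set_data("ai.streaming", True)
    # ... call the model here ...
    record_token_usage(span, prompt_tokens=10, completion_tokens=20, total_tokens=30)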

@uraurora
Author

uraurora commented Nov 28, 2024

> Can you link us a transaction that is in the "Performance" tab on Sentry.io that contains the spans ai.chat_completions.create.* that you are creating?
>
> In general the spans you create must contain the data described here: https://develop.sentry.dev/sdk/telemetry/traces/modules/llm-monitoring/
>
> If we have a link to a transaction, we can see if the spans in this transaction have the correct format.

Yes, here it is. Indeed, I suspect that I used incorrect span information.

[screenshot: trace]

[screenshot: LLM Monitoring]

@antonpirker
Member

Hey @uraurora !

Yes, some of the span data is not correct. I have created a small sample script that creates correct spans: https://github.com/antonpirker/testing-sentry/blob/main/test-llm-manual-instrumentation/main.py

One bigger thing is that "ai.chat_completions.create.xxx" is not allowed; instead of xxx it needs to be one of openai, cohere, langchain, huggingface_hub.

Also notice that the pipeline_name var is used twice in the example. This is how Sentry matches spans together.

If you also want to see the dollar amount spent, then it is important to set ai.model_id.

Hope this helps!
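For completeness, applying those three points to the snippet above yields something like the following sketch (the openai suffix and the model id are illustrative assumptions; substitute your actual provider and model):

import sentry_sdk
from sentry_sdk.ai.monitoring import ai_track, record_token_usage

PIPELINE_NAME = "sentry-ai-track-test-pipeline"


@ai_track(PIPELINE_NAME)
async def stream():
    # op must end in one of: openai, cohere, langchain, huggingface_hub
    with sentry_sdk.start_span(op="ai.chat_completions.create.openai", name="sentry-ai-track-test") as span:
        # the same pipeline name ties this span to the ai_track pipeline
        span.set_data("ai.pipeline.name", PIPELINE_NAME)
        # needed for the dollar-amount calculation
        span.set_data("ai.model_id", "gpt-4o-mini")
        token = 0
        for i in range(10):
            token += 1
            yield f"{i}"

        record_token_usage(span, total_tokens=token)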

@uraurora
Author

uraurora commented Nov 29, 2024

Great! Thank you for your assistance. The token consumption now displays correctly.
