
Do we need to distinguish client side and server side llm call? #1079

Closed
gyliu513 opened this issue May 28, 2024 · 3 comments
Labels
area:gen-ai bug Something isn't working

Comments

@gyliu513
Member

gyliu513 commented May 28, 2024

Area(s)

area:gen-ai

What happened?

Description

There is a PR trying to enable metrics support in vLLM, and it is adopting this semantic convention as well: vllm-project/vllm#4687. The question is that vLLM is server side, but it seems the LLM semantic conventions are mainly aimed at instrumenting client-side code, like https://github.com/traceloop/openllmetry/tree/main/packages. Do we need to distinguish between client-side and server-side semantic conventions? Thanks!

@nirga @lmolkova ^^

Semantic convention version

NA

Additional context

No response

@lmolkova
Contributor

lmolkova commented May 28, 2024

Adding @SergeyKanzhelev who might be interested in GCP server LLM metrics.

The assumption is that client and server would have different information available.

E.g., gen_ai.client.operation.duration is different from gen_ai.server.operation.duration; depending on the client network, the difference can be significant.
The same request that times out on the client could succeed on the server side, resulting in different duration, error rate, usage, and other metrics.

Client and server metrics might also have different attributes. E.g., the server might have information about pricing tier, region, or availability zone that is not available on the client but is very useful to know.

Therefore we either need:

  1. different metric names.
  2. extra attribute(s) that distinguish client from server.

Other OTel semantic conventions use option 1, which is a good reason for the gen_ai semconv to also define different metrics for client and server.
But nothing stops us from adding gen_ai server semantic conventions and reusing whatever we can between client and server.
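As a minimal sketch of option 1, an instrumentation could pick the metric name by which side it runs on, so the name itself carries the client/server distinction and no extra attribute is needed. The `DurationRecorder` class below is a hypothetical stand-in for illustration, not the OpenTelemetry SDK; the two metric names are the ones discussed in this thread.

```python
# Hypothetical illustration of option 1: separate metric names per side.
# DurationRecorder is a stand-in for illustration, NOT an OpenTelemetry API.
from collections import defaultdict

METRIC_NAMES = {
    "client": "gen_ai.client.operation.duration",
    "server": "gen_ai.server.operation.duration",
}

class DurationRecorder:
    def __init__(self):
        # metric name -> list of (duration_seconds, attributes) data points
        self.points = defaultdict(list)

    def record(self, side, duration_s, **attributes):
        # The metric name distinguishes client from server (option 1),
        # so no extra client/server attribute is added to the data point.
        name = METRIC_NAMES[side]
        self.points[name].append((duration_s, attributes))
        return name

recorder = DurationRecorder()
# The same logical request can yield different durations and attributes
# on each side, e.g. the server also knows its region:
recorder.record("client", 2.4, model="gpt-4")
recorder.record("server", 1.1, model="gpt-4", region="us-east1")
```

Note how the server-side point carries a `region` attribute the client never sees, matching the point above that the two sides have different information available.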

@gyliu513
Member Author

gyliu513 commented Jun 4, 2024

@lmolkova
Contributor

lmolkova commented Aug 8, 2024

I believe this can be resolved:

  • spans have client kind (server spans will have server kind and may have different attributes)
  • metrics have client/server in the name

@gyliu513 please comment if you believe there is something else we need to do on this issue (and feel free to reopen it)
