
Do we need to distinguish client side and server side llm call? #1079

Closed
gyliu513 opened this issue May 28, 2024 · 3 comments
Labels
area:gen-ai bug Something isn't working

Comments

@gyliu513
Member

gyliu513 commented May 28, 2024

Area(s)

area:gen-ai

What happened?

Description

There is a PR trying to enable metrics support in vLLM, and it is adopting this semantic convention as well: vllm-project/vllm#4687. The question is that vLLM is server side, but it seems the LLM semantic conventions are mainly aimed at instrumenting client-side code, like https://github.com/traceloop/openllmetry/tree/main/packages. Do we need to distinguish between client-side and server-side semantic conventions? Thanks!

@nirga @lmolkova ^^

Semantic convention version

NA

Additional context

No response

@lmolkova
Contributor

lmolkova commented May 28, 2024

Adding @SergeyKanzhelev who might be interested in GCP server LLM metrics.

The assumption is that client and server would have different information available.

E.g., gen_ai.client.operation.duration is different from gen_ai.server.operation.duration; depending on the client network, the difference can be significant.
The same request that times out on the client could succeed on the server side, resulting in different duration, error rate, usage, and other metrics.

Client and server metrics might also have different attributes. E.g., the server might have information about pricing tier, region, or availability zone that is not available on the client but is very useful to know.

Therefore we either need:

  1. different metric names.
  2. extra attribute(s) that distinguish client from server.

Other OTel semantic conventions use option 1, which is a good reason for the gen_ai semconv to also define different metrics for client and server.
But nothing stops us from adding gen_ai server semantic conventions and reusing whatever we can between client and server.
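As a minimal sketch of option 1, an instrumentation could pick the metric name by which side it runs on, so the name itself carries the client/server distinction and no extra attribute is needed. The `DurationRecorder` class below is a hypothetical stand-in for illustration, not the OpenTelemetry SDK; the two metric names are the ones discussed in this thread.

```python
# Hypothetical illustration of option 1: separate metric names per side.
# DurationRecorder is a stand-in for illustration, NOT an OpenTelemetry API.
from collections import defaultdict

METRIC_NAMES = {
    "client": "gen_ai.client.operation.duration",
    "server": "gen_ai.server.operation.duration",
}

class DurationRecorder:
    def __init__(self):
        # metric name -> list of (duration_seconds, attributes) data points
        self.points = defaultdict(list)

    def record(self, side, duration_s, **attributes):
        # The metric name distinguishes client from server (option 1),
        # so no extra client/server attribute is added to the data point.
        name = METRIC_NAMES[side]
        self.points[name].append((duration_s, attributes))
        return name

recorder = DurationRecorder()
# The same logical request can yield different durations and attributes
# on each side, e.g. the server also knows its region:
recorder.record("client", 2.4, model="gpt-4")
recorder.record("server", 1.1, model="gpt-4", region="us-east1")
```

Note how the server-side point carries a `region` attribute the client never sees, matching the point above that the two sides have different information available.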

@gyliu513
Member Author

gyliu513 commented Jun 4, 2024

@lmolkova
Contributor

lmolkova commented Aug 8, 2024

I believe this can be resolved:

  • spans have client kind (server spans will have server kind and may have different attributes)
  • metrics have client/server in the name

@gyliu513 please comment if you believe there is something else we need to do on this issue (and feel free to reopen it)
