LLM common metrics for Generative AI #955

drewby · 2024-04-24T06:41:52Z

Fixes #811

Changes

This adds initial metric definitions to the current set of gen_ai semantic conventions. These initial two metrics (gen_ai.usage.tokens and gen_ai.request.duration) are a minimal set to get started, and more can be added with future PRs.
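As a rough illustration of how an instrumentation might report these two metrics, here is a minimal in-memory sketch. Only the metric names come from the description above. The `Histogram` class is a hypothetical stand-in rather than a real SDK type, and the units, the `gen_ai.request.model` attribute, and the `token_type` attribute are assumptions made for the example.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Histogram:
    """Tiny in-memory stand-in for a metrics histogram (hypothetical, not an SDK type)."""
    name: str
    unit: str
    # Measurements grouped by their sorted attribute set.
    points: dict = field(default_factory=lambda: defaultdict(list))

    def record(self, value, attributes):
        key = tuple(sorted(attributes.items()))
        self.points[key].append(value)


# Metric names are taken from the PR description; the units are assumptions.
token_usage = Histogram("gen_ai.usage.tokens", unit="{token}")
request_duration = Histogram("gen_ai.request.duration", unit="s")


def record_call(model, prompt_tokens, completion_tokens, seconds):
    """Record one model call against both metrics."""
    base = {"gen_ai.request.model": model}  # assumed attribute name
    # "token_type" is a hypothetical attribute used to split the two counts.
    token_usage.record(prompt_tokens, {**base, "token_type": "prompt"})
    token_usage.record(completion_tokens, {**base, "token_type": "completion"})
    request_duration.record(seconds, base)


record_call("example-model", prompt_tokens=12, completion_tokens=30, seconds=0.8)
```

Recording prompt and completion counts on the same instrument, separated by an attribute, keeps the metric set small, which fits the "minimal set" intent. Whether the final convention splits them this way is not stated here.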

Merge requirement checklist

CONTRIBUTING.md guidelines followed.
Change log entry added, according to the guidelines in When to add a changelog entry.
- If your PR does not need a change log, start the PR title with [chore]
schema-next.yaml updated with changes to existing conventions.

docs/gen-ai/llm-metrics.md

model/metrics/gen-ai.yaml

model/registry/gen-ai.yaml

cartermp

A question/observation:

Should we go through and use "Gen AI Model" instead of "LLM" throughout this document? Naming attributes gen_ai.* while the descriptions talk about an LLM feels a little inconsistent to me now.

I think we're already at the point where the same model supports multi-modal inputs and outputs. Consider the following in Claude's API reference:

Starting with Claude 3 models, you can also send image content blocks:

{"role": "user", "content": [
  {
    "type": "image",
    "source": {
      "type": "base64",
      "media_type": "image/jpeg",
      "data": "/9j/4AAQSkZJRg..."
    }
  },
  {"type": "text", "text": "What is in this image?"}
]}

And with OpenAI it's a little less straightforward, but still possible:

An array of content parts with a defined type, each can be of type `text` or `image_url` when passing in images. You can pass multiple images by adding multiple `image_url` content parts. Image input is only supported when using the `gpt-4-vision-preview` model.

It's orthogonal to this PR, but maybe it's good to start here and not "limit" ourselves by using the LLM terminology, since it's usually associated with just text interpretation and generation?
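To make the multi-modal point concrete, here is a small sketch that inspects a message shaped like the Claude example quoted above and lists the content types it carries. The `content_types` helper is hypothetical and written only for this illustration; the elided base64 data is left exactly as quoted.

```python
def content_types(message):
    """Return the distinct "type" values of a message's content parts, in order."""
    seen = []
    for part in message["content"]:
        if part["type"] not in seen:
            seen.append(part["type"])
    return seen


# Same shape as the quoted example; the image data stays elided.
message = {"role": "user", "content": [
    {
        "type": "image",
        "source": {
            "type": "base64",
            "media_type": "image/jpeg",
            "data": "/9j/4AAQSkZJRg...",
        },
    },
    {"type": "text", "text": "What is in this image?"},
]}

types = content_types(message)
# More than one content type in a single request means it is not text-only.
is_multimodal = len(types) > 1
```

A single request mixing `image` and `text` parts is the kind of case where "LLM" undersells what the model is doing.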

drewby · 2024-04-26T04:30:28Z

I agree with both: addressing this soon and doing it separately from this PR. I've updated everything I could that does not yet impact Spans (keeping this as the metrics PR). Let's create another PR to update the other references to LLM.

docs/gen-ai/gen-ai-metrics.md

model/registry/gen-ai.yaml

model/metrics/gen-ai.yaml

lmolkova

LGTM! Left a few minor-ish comments

model/registry/gen-ai.yaml

model/metrics/gen-ai.yaml

drewby · 2024-05-25T00:09:56Z

Thanks! All resolved.

drewby requested review from a team April 24, 2024 06:41

github-actions bot assigned reyang Apr 24, 2024

nirga reviewed Apr 24, 2024

cartermp reviewed Apr 24, 2024

lmolkova reviewed Apr 25, 2024

cartermp reviewed Apr 25, 2024

drewby commented Apr 26, 2024

nirga reviewed Apr 26, 2024

lmolkova reviewed Apr 26, 2024

joaopgrassi added the area:gen-ai label Apr 30, 2024