Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Introduce per-message structured GenAI events instead of prompt/completion span events #980

Merged
merged 10 commits into from
Oct 4, 2024

Conversation

lmolkova
Copy link
Contributor

@lmolkova lmolkova commented Apr 29, 2024

Fixes #834

Changes

  • Breaks down prompt and completion events to system, user, tool, assistant events as well as the choice event
  • Specifies their structure
  • Switches them to be log-based events instead of span events

Related to #954, #829

Merge requirement checklist

docs/gen-ai/llm-spans.md Outdated Show resolved Hide resolved
docs/gen-ai/llm-spans.md Outdated Show resolved Hide resolved
docs/gen-ai/llm-spans.md Outdated Show resolved Hide resolved
docs/gen-ai/llm-spans.md Outdated Show resolved Hide resolved
docs/gen-ai/llm-spans.md Outdated Show resolved Hide resolved
@lmolkova lmolkova requested a review from a team May 8, 2024 05:26
docs/gen-ai/llm-spans.md Outdated Show resolved Hide resolved
Copy link
Member

@drewby drewby left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think most, if not all, conversations are resolved. There are some branch conflicts and and build issues. Once those are resolved, I think we can merge this.

Copy link

github-actions bot commented Jul 5, 2024

This PR was marked stale due to lack of activity. It will be closed in 7 days.

Copy link

@codefromthecrypt codefromthecrypt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm in favor of progress here, though I might peel off the unrelated change into an easier to merge PR (finish_reason part). My main concern is around events and adoptability.

We are defining a span attribute gen_ai.event.content for a potentially large and sensitive value. I can tell this is compensating for missing language features, but it also seems a reason to push for those to complete. Particularly, there are two top languages today: python and javascript. It is hard to reason with why a small api like event is too much to implement. Having it conditional causes this tech debt and would be better to get rid of that. If it is conditional, I would consider raising a tracking issue and then citing that in the docs, so it is easy to tell when the special case can be removed.

Next is adoptability: even the prior wasn't converged and also I've had feedback that feels like some tools will not change the approach they are using today. Especially from SIG members, but not just SIG members, I would like to see who controls implementations and would actually transition to this format once merged.

If no one will actually use what's defined, and we feel they are compatible even if they don't, then maybe we should consider removing this prompt/completion recording vs porting it to structured events. This piece is already security and operationally sensitive, so we should make sure the effort to maintain the specs around it is worth it (in terms of use).

copying in directly @nirga (openllmetry) @patcher9 (openlit) @karthikscale3 (langtrace) as they are a few implementors who would have to change their code.

docs/gen-ai/llm-spans.md Outdated Show resolved Hide resolved
docs/attributes-registry/gen-ai.md Outdated Show resolved Hide resolved
Copy link
Member

@drewby drewby left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think most feedback is addressed. Unless there is any final critical items I think we should merge this and allow further feedback/updates via new Issues/PRs.

@codefromthecrypt
Copy link

codefromthecrypt commented Jul 24, 2024

there's a semantic release pending (1.27.0) and big q on my mind is if this is blocking the release or not?

For me, I'm not a maintainer so can't make blocking feedback on this PR, but #980 (comment) is still curious and I'm not sure if it is a bug/fuzz or not.

'gen_ai.event.content' moves back into span attribute size/sensitivity and also doesn't indicate what this is about (prompt or completion). It seems to imply it is the input content, so not sure why if this is intentional it isn't named as such.

docs/gen-ai/gen-ai-events.md Outdated Show resolved Hide resolved
@karthikscale3
Copy link
Contributor

Looks like it's going to be Logs API for emitting Events going forward and they are recommending interoperability with other log sinks. Should we note Logs API instead of Event API?
https://github.com/open-telemetry/oteps/pull/265/files

@lmolkova
Copy link
Contributor Author

lmolkova commented Sep 25, 2024

Looks like it's going to be Logs API for emitting Events going forward and they are recommending interoperability with other log sinks. Should we note Logs API instead of Event API? https://github.com/open-telemetry/oteps/pull/265/files

OTel events will remain events, but will likely be emitted by Logger instead of EventLogger class. It does not change anything for semantic conventions or this PR.

@lmolkova lmolkova requested review from a team as code owners September 25, 2024 00:30
Copy link
Contributor

@nirga nirga left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🎉

Copy link
Contributor

@karthikscale3 karthikscale3 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👏🏽

Copy link

@codefromthecrypt codefromthecrypt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW: tested internally with js and soon python

Internal testing at ES had enableContentCapture to capture events or not (regardless of the semantic) and choice of mechanism via eventsSemconv (log vs span or default to whatever semver implies)

@lmolkova lmolkova merged commit 32b75a8 into open-telemetry:main Oct 4, 2024
14 checks passed
ChrsMark pushed a commit to ChrsMark/semantic-conventions that referenced this pull request Oct 18, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

Successfully merging this pull request may close these issues.

LLM: define common/system-specific event structure