Introduce per-message structured GenAI events instead of prompt/completion span events #980

lmolkova · 2024-04-29T18:22:32Z

Fixes #834

Changes

Breaks down prompt and completion events to system, user, tool, assistant events as well as the choice event
Specifies their structure
Switches them to be log-based events instead of span events

Related to #954, #829

Merge requirement checklist

CONTRIBUTING.md guidelines followed.
Change log entry added, according to the guidelines in When to add a changelog entry.
- If your PR does not need a change log, start the PR title with [chore]
schema-next.yaml updated with changes to existing conventions - N/A

docs/gen-ai/llm-spans.md

drewby

I think most, if not all, conversations are resolved. There are some branch conflicts and and build issues. Once those are resolved, I think we can merge this.

github-actions · 2024-07-05T03:20:14Z

This PR was marked stale due to lack of activity. It will be closed in 7 days.

model/registry/gen-ai.yaml

codefromthecrypt

I'm in favor of progress here, though I might peel off the unrelated change into an easier to merge PR (finish_reason part). My main concern is around events and adoptability.

We are defining a span attribute gen_ai.event.content for a potentially large and sensitive value. I can tell this is compensating for missing language features, but it also seems a reason to push for those to complete. Particularly, there are two top languages today: python and javascript. It is hard to reason with why a small api like event is too much to implement. Having it conditional causes this tech debt and would be better to get rid of that. If it is conditional, I would consider raising a tracking issue and then citing that in the docs, so it is easy to tell when the special case can be removed.

Next is adoptability: even the prior wasn't converged and also I've had feedback that feels like some tools will not change the approach they are using today. Especially from SIG members, but not just SIG members, I would like to see who controls implementations and would actually transition to this format once merged.

If no one will actually use what's defined, and we feel they are compatible even if they don't, then maybe we should consider removing this prompt/completion recording vs porting it to structured events. This piece is already security and operationally sensitive, so we should make sure the effort to maintain the specs around it is worth it (in terms of use).

copying in directly @nirga (openllmetry) @patcher9 (openlit) @karthikscale3 (langtrace) as they are a few implementors who would have to change their code.

docs/gen-ai/llm-spans.md

docs/attributes-registry/gen-ai.md

drewby

I think most feedback is addressed. Unless there is any final critical items I think we should merge this and allow further feedback/updates via new Issues/PRs.

codefromthecrypt · 2024-07-24T08:56:03Z

there's a semantic release pending (1.27.0) and big q on my mind is if this is blocking the release or not?

For me, I'm not a maintainer so can't make blocking feedback on this PR, but #980 (comment) is still curious and I'm not sure if it is a bug/fuzz or not.

'gen_ai.event.content' moves back into span attribute size/sensitivity and also doesn't indicate what this is about (prompt or completion). It seems to imply it is the input content, so not sure why if this is intentional it isn't named as such.

docs/gen-ai/gen-ai-events.md

karthikscale3 · 2024-09-16T05:03:19Z

Looks like it's going to be Logs API for emitting Events going forward and they are recommending interoperability with other log sinks. Should we note Logs API instead of Event API?
https://github.com/open-telemetry/oteps/pull/265/files

lmolkova · 2024-09-25T00:24:37Z

Looks like it's going to be Logs API for emitting Events going forward and they are recommending interoperability with other log sinks. Should we note Logs API instead of Event API? https://github.com/open-telemetry/oteps/pull/265/files

OTel events will remain events, but will likely be emitted by Logger instead of EventLogger class. It does not change anything for semantic conventions or this PR.

docs/gen-ai/gen-ai-events.md

nirga

🎉

karthikscale3

👏🏽

docs/gen-ai/gen-ai-events.md

codefromthecrypt

FWIW: tested internally with js and soon python

Internal testing at ES had enableContentCapture to capture events or not (regardless of the semantic) and choice of mechanism via eventsSemconv (log vs span or default to whatever semver implies)

…etion span events (open-telemetry#980)

lmolkova requested review from a team April 29, 2024 18:22

github-actions bot assigned jsuereth Apr 29, 2024

This was referenced Apr 29, 2024

OTel semconv: First stab at events traceloop/semantic-conventions#3

Closed

Define span events to events mapping open-telemetry/opentelemetry-specification#4023

Closed

Sensitive Data Redaction open-telemetry/oteps#255

Closed

TaoChenOSU reviewed May 1, 2024

View reviewed changes