-
Notifications
You must be signed in to change notification settings - Fork 182
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Introduce per-message structured GenAI events instead of prompt/completion span events #980
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think most, if not all, conversations are resolved. There are some branch conflicts and and build issues. Once those are resolved, I think we can merge this.
This PR was marked stale due to lack of activity. It will be closed in 7 days. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm in favor of progress here, though I might peel off the unrelated change into an easier to merge PR (finish_reason part). My main concern is around events and adoptability.
We are defining a span attribute gen_ai.event.content
for a potentially large and sensitive value. I can tell this is compensating for missing language features, but it also seems a reason to push for those to complete. Particularly, there are two top languages today: python and javascript. It is hard to reason with why a small api like event is too much to implement. Having it conditional causes this tech debt and would be better to get rid of that. If it is conditional, I would consider raising a tracking issue and then citing that in the docs, so it is easy to tell when the special case can be removed.
Next is adoptability: even the prior wasn't converged and also I've had feedback that feels like some tools will not change the approach they are using today. Especially from SIG members, but not just SIG members, I would like to see who controls implementations and would actually transition to this format once merged.
If no one will actually use what's defined, and we feel they are compatible even if they don't, then maybe we should consider removing this prompt/completion recording vs porting it to structured events. This piece is already security and operationally sensitive, so we should make sure the effort to maintain the specs around it is worth it (in terms of use).
copying in directly @nirga (openllmetry) @patcher9 (openlit) @karthikscale3 (langtrace) as they are a few implementors who would have to change their code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think most feedback is addressed. Unless there is any final critical items I think we should merge this and allow further feedback/updates via new Issues/PRs.
there's a semantic release pending (1.27.0) and big q on my mind is if this is blocking the release or not? For me, I'm not a maintainer so can't make blocking feedback on this PR, but #980 (comment) is still curious and I'm not sure if it is a bug/fuzz or not. 'gen_ai.event.content' moves back into span attribute size/sensitivity and also doesn't indicate what this is about (prompt or completion). It seems to imply it is the input content, so not sure why if this is intentional it isn't named as such. |
54531dd
to
7f5a4df
Compare
Looks like it's going to be Logs API for emitting Events going forward and they are recommending interoperability with other log sinks. Should we note Logs API instead of Event API? |
OTel events will remain events, but will likely be emitted by |
7acc146
to
3a12546
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🎉
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👏🏽
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FWIW: tested internally with js and soon python
Internal testing at ES had enableContentCapture
to capture events or not (regardless of the semantic) and choice of mechanism via eventsSemconv
(log vs span or default to whatever semver implies)
Fixes #834
Changes
Related to #954, #829
Merge requirement checklist
[chore]