[Observability AI Assistant] Fully migrate to inference client #197630
Pinging @elastic/obs-ai-assistant (Team:Obs AI Assistant)
This is the inference client. You can get one from the inference plugin's start contract. After making this change, we should be able to remove all the adapters: https://github.com/elastic/kibana/tree/main/x-pack/plugins/observability_solution/observability_ai_assistant/server/service/client/adapters
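For reference, getting hold of the client from the plugin's start contract might look something like this. This is a minimal sketch: the `getClient({ request })` method and the `@kbn/inference-plugin` import path are assumptions based on my reading of the inference plugin, not confirmed API.

```ts
import type { KibanaRequest } from '@kbn/core/server';
import type { InferenceServerStart } from '@kbn/inference-plugin/server';

// Sketch: obtain a request-bound inference client inside a route handler.
// Binding to the request means connector access runs as the calling user.
function getInferenceClient(
  inferenceStart: InferenceServerStart,
  request: KibanaRequest
) {
  return inferenceStart.getClient({ request });
}
```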
@dgieselaar Thank you for the detailed information on migrating to the inference client. Our assistant defines its own message types:

kibana/x-pack/plugins/observability_solution/observability_ai_assistant/common/types.ts (Lines 13 to 19 in bde5e11)

kibana/x-pack/plugins/observability_solution/observability_ai_assistant/common/types.ts (Lines 33 to 46 in bde5e11)

with each role having specific types. How should these map onto the inference client's message format?
We have an existing conversion utility for this: Line 16 in 631ccb0 (used in the `query` function).
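To make the mapping concrete, here is a minimal sketch of such a conversion, assuming the assistant's `Message` shape from `common/types.ts` and the `MessageRole` enum from the inference plugin; the import path, the `Tool` message shape, and the helper name are all illustrative assumptions:

```ts
import { MessageRole } from '@kbn/inference-common';

// Hypothetical local shape, loosely following the assistant's Message type.
interface AssistantMessage {
  '@timestamp': string;
  message: {
    role: 'system' | 'user' | 'assistant' | 'function';
    content?: string;
    name?: string;
  };
}

// Sketch: map assistant roles onto the inference client's roles.
// Function results are modeled as tool responses; toolCallId wiring
// is omitted for brevity.
function toInferenceMessages(messages: AssistantMessage[]) {
  return messages.map(({ message }) => {
    switch (message.role) {
      case 'assistant':
        return { role: MessageRole.Assistant, content: message.content ?? '' };
      case 'function':
        return {
          role: MessageRole.Tool,
          name: message.name,
          response: { content: message.content },
        };
      default:
        // System content is passed separately via `system`, so treat the
        // rest as user messages in this sketch.
        return { role: MessageRole.User, content: message.content ?? '' };
    }
  });
}
```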
Not sure if I get this question, WDYM with the “intended design for handling function calls”? Just to clarify, it's just a different way of setting the flag; they use the same mechanism (the implementation in the …).
We can just delete them, I think, with one caveat: I think we hardcode simulated function calling for Bedrock and Gemini, and that should no longer be necessary. The inference plugin has an implementation (thanks to @pgayvallet) that delegates to native function calling for both Bedrock & Gemini.
Yeah, that's correct. The inference APIs have a …
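If you need to pick the mode explicitly, it might look like the following. This is a sketch: the `functionCalling` option and its values are an assumption based on my reading of the inference plugin's chat API.

```ts
// Sketch: explicitly selecting the function-calling mode.
// 'native' delegates to the provider's own tool support; 'simulated'
// emulates it through prompting.
const response = await inferenceClient.chatComplete({
  connectorId: 'my-bedrock-connector', // hypothetical connector id
  functionCalling: 'native', // assumed option name
  messages: [
    {
      role: MessageRole.User,
      content: 'Do something',
    },
  ],
});
```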
Thanks @pgayvallet and @dgieselaar for the clarification. I'm following the documentation to get token count information in streaming mode. This is the example from the documentation:

```ts
const chatResponse = await inferenceClient.chatComplete({
  connectorId: 'some-gen-ai-connector',
  system: `Here is my system message`,
  messages: [
    {
      role: MessageRole.User,
      content: 'Do something',
    },
  ],
});

const { content, tokens } = chatResponse;
// do something with the output
```

Could you clarify whether the token count is accessible directly in streaming mode, or should I handle it differently? If possible, an example of how to access the token count in streaming mode would be helpful.
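In case it helps, here is how I'd expect the streaming consumption to look. A sketch, assuming `stream: true` makes `chatComplete` return an Observable of events with a token-count event emitted near the end of the stream; the event type strings are assumptions based on the inference plugin's event types.

```ts
// Sketch: reading token counts from the streaming events.
const events$ = inferenceClient.chatComplete({
  connectorId: 'some-gen-ai-connector',
  stream: true,
  messages: [
    {
      role: MessageRole.User,
      content: 'Do something',
    },
  ],
});

events$.subscribe((event) => {
  if (event.type === 'chatCompletionChunk') {
    // Incremental content as it arrives.
    process.stdout.write(event.content);
  } else if (event.type === 'chatCompletionTokenCount') {
    // Usage reported once, when the stream completes.
    console.log('tokens:', event.tokens);
  }
});
```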
Additionally, I’d like to know how to properly abort a chat completion request. Is there a recommended approach for canceling an in-progress chat in streaming mode, or a specific method provided by the inference client for handling this?
AFAIK there's no way to abort a streaming request atm. Not at the inference client's level for sure, but most importantly, not at the connector's level. We could eventually expose that at the inference client's level via some abort-controller pattern, but without being able to properly cancel the streamed action at the connector's level, I'm not sure it would be very useful in practice in terms of perf / resource-release gain. Still, if that's something that o11y would want, we could expose that API, even if in practice it would basically just complete the observable on call, and see later if we can wire it properly to perform "real" cancellation at the lower layers.

EDIT: actually, it seems the connector does support passing an abort signal: kibana/x-pack/plugins/stack_connectors/server/connector_types/openai/openai.ts (Lines 276 to 280 in 9372027). So we can probably leverage that.
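If that gets wired up, the consumer side might look something like this. A sketch, assuming `chatComplete` grows an `abortSignal` option that is forwarded to the connector; that option is an assumption, not confirmed above.

```ts
// Sketch: cancelling an in-progress streamed completion.
const controller = new AbortController();

const events$ = inferenceClient.chatComplete({
  connectorId: 'some-gen-ai-connector',
  stream: true,
  abortSignal: controller.signal, // assumed option, forwarded to the connector
  messages: [
    {
      role: MessageRole.User,
      content: 'Do something',
    },
  ],
});

const subscription = events$.subscribe((event) => {
  // handle chunk / token-count events here
});

// Later, e.g. when the user navigates away:
controller.abort();
subscription.unsubscribe();
```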
I opened #200757 to talk about request cancellation. Insights very welcome.
… (elastic#199286)

## Summary

Closes elastic#183245
Closes elastic#197630

[Observability AI Assistant] Partially migrate to inference client: `observabilityAIAssistantClient.chat` now delegates to `inferenceClient.chatComplete`. `observabilityAIAssistantClient.complete` does a bunch of stuff on top of `chat`, so we're keeping `observabilityAIAssistantClient.chat` as a wrapper for now because it also adds instrumentation and logging.

(cherry picked from commit df0dfa5)
#199286) (#203399)

# Backport

This will backport the following commits from `main` to `8.x`:

- [[Observability AI Assistant] migrate to inference client #197630 (#199286)](#199286)

### Questions?

Please refer to the [Backport tool documentation](https://github.com/sqren/backport)

Co-authored-by: Arturo Lidueña <[email protected]>
We currently use the inference client in the NL-to-ESQL task. We should fully migrate to it, which means replacing all instances of `client.chatComplete()` and `client.chat()` with `inferenceClient.chatComplete()` and `inferenceClient.output()`. This would mean less maintenance for us, and a single place in Kibana where we handle, and can improve, LLM interactions. A sketch of the `output` API is below.
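For the `inferenceClient.output()` part, here is what call sites might look like after the migration. A sketch: the parameter names (`id`, `input`, `schema`) follow my understanding of the inference plugin's output API and should be treated as assumptions.

```ts
// Sketch: structured output via the inference client.
const result = await inferenceClient.output({
  id: 'extract_error_summary', // hypothetical task id
  connectorId: 'some-gen-ai-connector',
  input: 'Summarize the root cause of the following stack trace: ...',
  schema: {
    type: 'object',
    properties: {
      summary: { type: 'string' },
    },
    required: ['summary'],
  } as const,
});

// `result.output` is shaped by the schema above.
console.log(result.output.summary);
```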