[RCA] AI-assisted root cause analysis (elastic#197200)
Implements an LLM-based root cause analysis process. At a high level, it works by investigating entities - which means pulling in alerts, SLOs, and log patterns. From there, it can inspect related entities to get to the root cause.

The backend implementation lives in `x-pack/packages/observability_utils-*` (`service_rca`). It can be imported into any server-side plugin and executed from there.

The UI changes are mostly contained to `x-pack/plugins/observability_solution/observability_ai_assistant_app`. This plugin now exports a `RootCauseAnalysisContainer`, which takes the stream of data returned by the root cause analysis process.

The current implementation lives in the Investigate app. There, it calls its own endpoint that kicks off the RCA process and feeds the results into the `RootCauseAnalysisContainer` exposed by the Observability AI Assistant app plugin. I've left it in a route there so the investigation itself can be updated as the process runs - this allows the user to close the browser, come back later, and see a full investigation.

> [!NOTE]
> Notes for reviewing teams
>
> @kbn/es-types:
> - support both types and typesWithBodyKey
> - simplify KeysOfSources type
>
> @kbn/server-route-repository:
> - abortable streamed responses
>
> @kbn/sse-utils*:
> - abortable streamed responses
> - serialize errors in a specific format for more reliable re-hydration of errors
> - keep the connection open with SSE comments
>
> @kbn/inference-*:
> - export *Of variants of types, for easier manual inference
> - add automated retries for the `output` API
> - add `name` to tool responses for type inference (get the type of a tool response via the tool name)
> - add `data` to tool responses for transporting internal data (not sent to the LLM)
> - simplify `chunksIntoMessage`
> - allow consumers of the nlToEsql task to add to the `system` prompt
> - add toolCallId to the validation error message
>
> @kbn/aiops*:
> - export `categorizationAnalyzer` for use in observability-ai*
>
> @kbn/observability-ai-assistant*:
> - configurable limit (tokens or doc count) for knowledge base recall
>
> @kbn/slo*:
> - export a client that returns summary indices

---------

Co-authored-by: kibanamachine <[email protected]>
Co-authored-by: Maryam Saeidi <[email protected]>
Co-authored-by: Bena Kansara <[email protected]>
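A rough sketch of the consumer-side wiring described above; the prop name (`events$`), the event interface, and the component below are assumptions for illustration, not the plugin's actual exports:

```tsx
import React from 'react';
import { Observable } from 'rxjs';

// Hypothetical shape of the events streamed back by the RCA process; the real
// types live next to the service_rca implementation.
interface RootCauseAnalysisEvent {
  type: string;
  [key: string]: unknown;
}

// Assumed props for the container exported by the Observability AI Assistant
// app plugin (illustrative, not the actual interface).
interface RootCauseAnalysisContainerProps {
  events$: Observable<RootCauseAnalysisEvent>;
}

declare const RootCauseAnalysisContainer: React.ComponentType<RootCauseAnalysisContainerProps>;

// In the Investigate app: kick off the RCA process via the investigation's own
// route and feed the streamed events into the container, so the investigation
// keeps updating as the process runs.
export function InvestigationRootCauseAnalysis({
  events$,
}: {
  events$: Observable<RootCauseAnalysisEvent>;
}) {
  return <RootCauseAnalysisContainer events$={events$} />;
}
```

The container only needs a live stream of events; keeping the RCA process behind an investigation route, rather than driving it purely from the client, is what lets the investigation keep updating after the browser is closed.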
1 parent 64e9728, commit fa1998c. Showing 144 changed files with 27,293 additions and 364 deletions.
packages/kbn-sse-utils-server/src/observable_into_event_source_stream.test.ts (198 changes: 198 additions & 0 deletions)
@@ -0,0 +1,198 @@

/*
 * Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one
 * or more contributor license agreements. Licensed under the "Elastic License
 * 2.0", the "GNU Affero General Public License v3.0 only", and the "Server Side
 * Public License v 1"; you may not use this file except in compliance with, at
 * your election, the "Elastic License 2.0", the "GNU Affero General Public
 * License v3.0 only", or the "Server Side Public License, v 1".
 */

import { Logger } from '@kbn/logging';
import { observableIntoEventSourceStream } from './observable_into_event_source_stream';
import { PassThrough } from 'node:stream';
import { Subject } from 'rxjs';
import { ServerSentEvent, ServerSentEventType } from '@kbn/sse-utils/src/events';
import {
  ServerSentEventErrorCode,
  createSSEInternalError,
  createSSERequestError,
} from '@kbn/sse-utils/src/errors';

describe('observableIntoEventSourceStream', () => {
  let logger: jest.Mocked<Logger>;

  let controller: AbortController;

  let stream: PassThrough;
  let source$: Subject<ServerSentEvent>;

  let data: string[];

  beforeEach(() => {
    jest.useFakeTimers();
    logger = {
      debug: jest.fn(),
      error: jest.fn(),
    } as unknown as jest.Mocked<Logger>;

    controller = new AbortController();
    source$ = new Subject();
    data = [];

    stream = observableIntoEventSourceStream(source$, { logger, signal: controller.signal });
    stream.on('data', (chunk) => {
      data.push(chunk.toString());
    });
  });

  afterEach(() => {
    jest.clearAllTimers();
  });

  it('writes events into the stream in SSE format', () => {
    source$.next({ type: ServerSentEventType.data, data: { foo: 'bar' } });
    source$.complete();

    jest.runAllTimers();

    expect(data).toEqual(['event: data\ndata: {"data":{"foo":"bar"}}\n\n']);
  });

  it('handles SSE errors', () => {
    const sseError = createSSEInternalError('Invalid input');

    source$.error(sseError);

    jest.runAllTimers();

    expect(logger.error).toHaveBeenCalledWith(sseError);
    expect(logger.debug).toHaveBeenCalled();
    const debugFn = logger.debug.mock.calls[0][0] as () => string;
    const loggedError = JSON.parse(debugFn());
    expect(loggedError).toEqual({
      type: 'error',
      error: {
        code: ServerSentEventErrorCode.internalError,
        message: 'Invalid input',
        meta: {},
      },
    });

    expect(data).toEqual([
      `event: error\ndata: ${JSON.stringify({
        error: {
          code: ServerSentEventErrorCode.internalError,
          message: 'Invalid input',
          meta: {},
        },
      })}\n\n`,
    ]);
  });

  it('handles SSE errors with metadata', () => {
    const sseError = createSSERequestError('Invalid request', 400);

    source$.error(sseError);

    jest.runAllTimers();

    expect(logger.error).toHaveBeenCalledWith(sseError);
    expect(logger.debug).toHaveBeenCalled();
    const debugFn = logger.debug.mock.calls[0][0] as () => string;
    const loggedError = JSON.parse(debugFn());
    expect(loggedError).toEqual({
      type: 'error',
      error: {
        code: ServerSentEventErrorCode.requestError,
        message: 'Invalid request',
        meta: {
          status: 400,
        },
      },
    });

    expect(data).toEqual([
      `event: error\ndata: ${JSON.stringify({
        error: {
          code: ServerSentEventErrorCode.requestError,
          message: 'Invalid request',
          meta: {
            status: 400,
          },
        },
      })}\n\n`,
    ]);
  });

  it('handles non-SSE errors', () => {
    const error = new Error('Non-SSE Error');

    source$.error(error);

    jest.runAllTimers();

    expect(logger.error).toHaveBeenCalledWith(error);
    expect(data).toEqual([
      `event: error\ndata: ${JSON.stringify({
        error: {
          code: ServerSentEventErrorCode.internalError,
          message: 'Non-SSE Error',
        },
      })}\n\n`,
    ]);
  });

  it('should send keep-alive comments every 10 seconds', () => {
    jest.advanceTimersByTime(10000);
    expect(data).toContain(': keep-alive');

    jest.advanceTimersByTime(10000);
    expect(data.filter((d) => d === ': keep-alive')).toHaveLength(2);
  });

  describe('without fake timers', () => {
    beforeEach(() => {
      jest.useFakeTimers({ doNotFake: ['nextTick'] });
    });

    it('should end the stream when the observable completes', async () => {
      jest.useFakeTimers({ doNotFake: ['nextTick'] });

      const endSpy = jest.fn();
      stream.on('end', endSpy);

      source$.complete();

      await new Promise((resolve) => process.nextTick(resolve));

      expect(endSpy).toHaveBeenCalled();
    });

    it('should end stream when signal is aborted', async () => {
      const endSpy = jest.fn();
      stream.on('end', endSpy);

      // Emit some data
      source$.next({ type: ServerSentEventType.data, data: { initial: 'data' } });

      // Abort the signal
      controller.abort();

      // Emit more data after abort
      source$.next({ type: ServerSentEventType.data, data: { after: 'abort' } });

      await new Promise((resolve) => process.nextTick(resolve));

      expect(endSpy).toHaveBeenCalled();

      // Data after abort should not be received
      expect(data).toEqual([
        `event: data\ndata: ${JSON.stringify({ data: { initial: 'data' } })}\n\n`,
      ]);
    });

    afterEach(() => {
      jest.useFakeTimers();
    });
  });
});
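For reference, a minimal sketch of the framing these tests assert, assuming a simplified helper rather than the actual `observableIntoEventSourceStream` implementation: each event is written as an `event:` line plus a JSON `data:` line terminated by a blank line, a `: keep-alive` comment is emitted every 10 seconds, and the stream ends on completion, error, or abort.

```ts
import { PassThrough } from 'node:stream';
import { Observable } from 'rxjs';

// Illustrative only: a stripped-down version of the behaviour exercised above.
// The real implementation lives in
// packages/kbn-sse-utils-server/src/observable_into_event_source_stream.ts.
function toSseStream<T extends { type: string }>(
  source$: Observable<T>,
  signal: AbortSignal
): PassThrough {
  const stream = new PassThrough();

  // Keep the connection open with SSE comments while no events are flowing
  // (matches the bare ': keep-alive' chunks the tests above expect).
  const keepAlive = setInterval(() => {
    stream.write(': keep-alive');
  }, 10_000);

  const end = () => {
    clearInterval(keepAlive);
    stream.end();
  };

  const subscription = source$.subscribe({
    next: (event) => {
      // `event: <type>` frame followed by the JSON-serialized payload.
      const { type, ...payload } = event;
      stream.write(`event: ${type}\ndata: ${JSON.stringify(payload)}\n\n`);
    },
    error: (error: Error) => {
      // Errors are serialized as their own `error` events (simplified here to
      // just the message; the real format also carries a code and meta).
      stream.write(
        `event: error\ndata: ${JSON.stringify({ error: { message: error.message } })}\n\n`
      );
      end();
    },
    complete: end,
  });

  // Aborting the signal ends the stream and drops any later emissions.
  signal.addEventListener('abort', () => {
    subscription.unsubscribe();
    end();
  });

  return stream;
}
```

Serializing the error code and `meta` into the `data:` payload, as the tests assert, is what allows the client to re-hydrate errors reliably instead of only seeing a dropped connection.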