[LLM tasks] Add product documentation retrieval task (elastic#194379)
## Summary

Close elastic#193473
Close elastic#193474

This PR uses the documentation packages built by the tool introduced in
elastic#193847, making it possible to install them in Kibana and expose
documentation retrieval as an LLM task that AI assistants (or other
consumers) can call.

Users can now install the Elastic documentation from the assistant's
config screen, which exposes a new tool to the assistant,
`retrieve_documentation` (only implemented for the o11y assistant in
this PR; the security assistant will be handled in a follow-up).

For more information, please refer to the self-review.

## General architecture

<img width="1118" alt="Screenshot 2024-10-17 at 09 22 32"
src="https://github.com/user-attachments/assets/3df8c30a-9ccc-49ab-92ce-c204b96d6fc4">

## What this PR does

Adds two plugins:
- `productDocBase`: contains all the logic related to product
documentation installation, status, and search. This is meant to be a
"low-level" component, responsible only for this specific part.
- `llmTasks`: a higher-level plugin that will contain various LLM tasks
to be used by assistants and genAI consumers. The intent is not to have
a single place to put all LLM tasks, but rather a default place from
which new tasks can be introduced. (FWIW, the `nlToEsql` task will
probably be moved to that plugin.)

- Add a `retrieve_documentation` tool registration for the o11y
assistant
- Add a component on the o11y assistant configuration page to install
the product documentation

(Wiring the feature to the o11y assistant was done mostly for testing
purposes; any additions, changes, or enhancements should be made by the
owning team, either in this PR or as a follow-up.)
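
The layering described in this section (a low-level search service, wrapped by a higher-level task that an assistant can call as a tool) can be sketched roughly as follows. This is an illustration only, not the code from this PR: `DocSearchService`, `DocResult`, `createRetrieveDocumentationTask`, `createInMemorySearchService`, the parameter names, and the in-memory substring search are all hypothetical stand-ins for the real `productDocBase` and `llmTasks` contracts.

```typescript
// Hypothetical shapes, for illustration only. The real contracts are defined
// by the productDocBase and llmTasks plugins and may differ.
type ProductName = 'kibana' | 'elasticsearch' | 'observability' | 'security';

interface DocResult {
  title: string;
  url: string;
  content: string;
  productName: ProductName;
}

// "Low-level" side: only knows how to search installed documentation.
interface DocSearchService {
  search(params: { query: string; products?: ProductName[]; max: number }): Promise<DocResult[]>;
}

// "Higher-level" side: wraps the search service in a task that an assistant
// could register as a tool.
const createRetrieveDocumentationTask =
  (searchService: DocSearchService) =>
  async ({
    searchTerm,
    products,
    max = 3,
  }: {
    searchTerm: string;
    products?: ProductName[];
    max?: number;
  }): Promise<{ documents: DocResult[] }> => {
    const documents = await searchService.search({ query: searchTerm, products, max });
    return { documents };
  };

// In-memory stand-in for the search service (assumption: plain substring
// matching replaces whatever search the real plugin performs).
const createInMemorySearchService = (docs: DocResult[]): DocSearchService => ({
  async search({ query, products, max }) {
    const needle = query.toLowerCase();
    return docs
      .filter((doc) => !products || products.includes(doc.productName))
      .filter((doc) => doc.content.toLowerCase().includes(needle))
      .slice(0, max);
  },
});

// Usage: wire the two layers together and call the task.
const retrieveDocumentation = createRetrieveDocumentationTask(
  createInMemorySearchService([
    {
      title: 'Lens',
      url: 'https://example.invalid/lens',
      content: 'Lens is a visualization editor.',
      productName: 'kibana',
    },
  ])
);

retrieveDocumentation({ searchTerm: 'visualization' }).then(({ documents }) => {
  console.log(documents.map((doc) => doc.title));
});
```

Keeping the search service behind an interface means the task layer can be exercised without an installed documentation index, which is the main point of separating the two plugins in this sketch.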

## What is NOT included in this PR:

- Wiring the product documentation feature to the security assistant
(should be done by the owning team as a follow-up):
  - installation
  - usage as a tool

- FTR tests: this is somewhat blocked by the same things we need to
figure out for elastic/kibana-team#1271

## Screenshots

### Installation from o11y assistant configuration page

<img width="1476" alt="Screenshot 2024-10-17 at 09 41 24"
src="https://github.com/user-attachments/assets/31daa585-9fb2-400a-a2d1-5917a262367a">

### Example of output

#### Without product documentation installed

<img width="739" alt="Screenshot 2024-10-10 at 09 59 41"
src="https://github.com/user-attachments/assets/993fb216-6c9a-433f-bf44-f6e383d20d9d">

#### With product documentation installed

<img width="718" alt="Screenshot 2024-10-10 at 09 55 38"
src="https://github.com/user-attachments/assets/805ea4ca-8bc9-4355-a434-0ba81f8228a9">

---------

Co-authored-by: kibanamachine <[email protected]>
Co-authored-by: Alex Szabo <[email protected]>
Co-authored-by: Matthias Wilhelm <[email protected]>
Co-authored-by: Elastic Machine <[email protected]>
(cherry picked from commit 455c781)

# Conflicts:
#	.github/CODEOWNERS
pgayvallet committed Nov 19, 2024
1 parent 43fa8a5 commit dc865f8
Showing 149 changed files with 5,659 additions and 64 deletions.
8 changes: 8 additions & 0 deletions docs/developer/plugin-list.asciidoc
@@ -690,6 +690,10 @@ the infrastructure monitoring use-case within Kibana.
using the CURL scripts in the scripts folder.
|{kib-repo}blob/{branch}/x-pack/plugins/ai_infra/llm_tasks/README.md[llmTasks]
|This plugin contains various LLM tasks.
|{kib-repo}blob/{branch}/x-pack/plugins/observability_solution/logs_data_access/README.md[logsDataAccess]
|Exposes services to access logs data.
@@ -767,6 +771,10 @@ Elastic.
|This plugin helps users learn how to use the Painless scripting language.
|{kib-repo}blob/{branch}/x-pack/plugins/ai_infra/product_doc_base/README.md[productDocBase]
|This plugin contains the product documentation base service.
|{kib-repo}blob/{branch}/x-pack/plugins/observability_solution/profiling/README.md[profiling]
|Universal Profiling provides fleet-wide, whole-system, continuous profiling with zero instrumentation. Get a comprehensive understanding of what lines of code are consuming compute resources throughout your entire fleet by visualizing your data in Kibana using the flamegraph, stacktraces, and top functions views.
9 changes: 9 additions & 0 deletions docs/user/security/audit-logging.asciidoc
@@ -148,6 +148,9 @@ Refer to the corresponding {es} logs for potential write errors.
| `success` | Creating trained model.
| `failure` | Failed to create trained model.

.1+| `product_documentation_create`
| `unknown` | User requested to install the product documentation for use in AI Assistants.

3+a|
====== Type: change

@@ -334,6 +337,9 @@ Refer to the corresponding {es} logs for potential write errors.
| `success` | Updating trained model deployment.
| `failure` | Failed to update trained model deployment.

.1+| `product_documentation_update`
| `unknown` | User requested to update the product documentation for use in AI Assistants.

3+a|
====== Type: deletion

@@ -425,6 +431,9 @@ Refer to the corresponding {es} logs for potential write errors.
| `success` | Deleting trained model.
| `failure` | Failed to delete trained model.

.1+| `product_documentation_delete`
| `unknown` | User requested to delete the product documentation for use in AI Assistants.

3+a|
====== Type: access

3 changes: 3 additions & 0 deletions package.json
@@ -616,6 +616,7 @@
"@kbn/licensing-plugin": "link:x-pack/plugins/licensing",
"@kbn/links-plugin": "link:src/plugins/links",
"@kbn/lists-plugin": "link:x-pack/plugins/lists",
"@kbn/llm-tasks-plugin": "link:x-pack/plugins/ai_infra/llm_tasks",
"@kbn/locator-examples-plugin": "link:examples/locator_examples",
"@kbn/locator-explorer-plugin": "link:examples/locator_explorer",
"@kbn/logging": "link:packages/kbn-logging",
@@ -718,6 +719,8 @@
"@kbn/presentation-panel-plugin": "link:src/plugins/presentation_panel",
"@kbn/presentation-publishing": "link:packages/presentation/presentation_publishing",
"@kbn/presentation-util-plugin": "link:src/plugins/presentation_util",
"@kbn/product-doc-base-plugin": "link:x-pack/plugins/ai_infra/product_doc_base",
"@kbn/product-doc-common": "link:x-pack/packages/ai-infra/product-doc-common",
"@kbn/profiling-data-access-plugin": "link:x-pack/plugins/observability_solution/profiling_data_access",
"@kbn/profiling-plugin": "link:x-pack/plugins/observability_solution/profiling",
"@kbn/profiling-utils": "link:packages/kbn-profiling-utils",
7 changes: 7 additions & 0 deletions packages/kbn-check-mappings-update-cli/current_fields.json
@@ -855,6 +855,13 @@
"policy-settings-protection-updates-note": [
"note"
],
"product-doc-install-status": [
"index_name",
"installation_status",
"last_installation_date",
"product_name",
"product_version"
],
"query": [
"description",
"title",
20 changes: 20 additions & 0 deletions packages/kbn-check-mappings-update-cli/current_mappings.json
@@ -2841,6 +2841,26 @@
}
}
},
"product-doc-install-status": {
"dynamic": false,
"properties": {
"index_name": {
"type": "keyword"
},
"installation_status": {
"type": "keyword"
},
"last_installation_date": {
"type": "date"
},
"product_name": {
"type": "keyword"
},
"product_version": {
"type": "keyword"
}
}
},
"query": {
"dynamic": false,
"properties": {
1 change: 1 addition & 0 deletions packages/kbn-optimizer/limits.yml
@@ -124,6 +124,7 @@ pageLoadAssetSize:
painlessLab: 179748
presentationPanel: 55463
presentationUtil: 58834
productDocBase: 22500
profiling: 36694
remoteClusters: 51327
reporting: 58600
@@ -145,6 +145,7 @@ describe('checking migration metadata changes on all registered SO types', () =>
"osquery-pack-asset": "cd140bc2e4b092e93692b587bf6e38051ef94c75",
"osquery-saved-query": "6095e288750aa3164dfe186c74bc5195c2bf2bd4",
"policy-settings-protection-updates-note": "33924bb246f9e5bcb876109cc83e3c7a28308352",
"product-doc-install-status": "ca6e96840228e4cc2f11bae24a0797f4f7238c8c",
"query": "501bece68f26fe561286a488eabb1a8ab12f1137",
"risk-engine-configuration": "aea0c371a462e6d07c3ceb3aff11891b47feb09d",
"rules-settings": "ba57ef1881b3dcbf48fbfb28902d8f74442190b2",
@@ -115,6 +115,7 @@ const previouslyRegisteredTypes = [
'osquery-usage-metric',
'osquery-manager-usage-metric',
'policy-settings-protection-updates-note',
'product-doc-install-status',
'query',
'rules-settings',
'sample-data-telemetry',
6 changes: 6 additions & 0 deletions tsconfig.base.json
@@ -1146,6 +1146,8 @@
"@kbn/lint-ts-projects-cli/*": ["packages/kbn-lint-ts-projects-cli/*"],
"@kbn/lists-plugin": ["x-pack/plugins/lists"],
"@kbn/lists-plugin/*": ["x-pack/plugins/lists/*"],
"@kbn/llm-tasks-plugin": ["x-pack/plugins/ai_infra/llm_tasks"],
"@kbn/llm-tasks-plugin/*": ["x-pack/plugins/ai_infra/llm_tasks/*"],
"@kbn/locator-examples-plugin": ["examples/locator_examples"],
"@kbn/locator-examples-plugin/*": ["examples/locator_examples/*"],
"@kbn/locator-explorer-plugin": ["examples/locator_explorer"],
@@ -1384,6 +1386,10 @@
"@kbn/presentation-util-plugin/*": ["src/plugins/presentation_util/*"],
"@kbn/product-doc-artifact-builder": ["x-pack/packages/ai-infra/product-doc-artifact-builder"],
"@kbn/product-doc-artifact-builder/*": ["x-pack/packages/ai-infra/product-doc-artifact-builder/*"],
"@kbn/product-doc-base-plugin": ["x-pack/plugins/ai_infra/product_doc_base"],
"@kbn/product-doc-base-plugin/*": ["x-pack/plugins/ai_infra/product_doc_base/*"],
"@kbn/product-doc-common": ["x-pack/packages/ai-infra/product-doc-common"],
"@kbn/product-doc-common/*": ["x-pack/packages/ai-infra/product-doc-common/*"],
"@kbn/profiling-data-access-plugin": ["x-pack/plugins/observability_solution/profiling_data_access"],
"@kbn/profiling-data-access-plugin/*": ["x-pack/plugins/observability_solution/profiling_data_access/*"],
"@kbn/profiling-plugin": ["x-pack/plugins/observability_solution/profiling"],
48 changes: 47 additions & 1 deletion x-pack/packages/ai-infra/product-doc-artifact-builder/README.md
@@ -1,3 +1,49 @@
# @kbn/product-doc-artifact-builder

-Script to build the knowledge base artifacts
+Script to build the knowledge base artifacts.

## How to run

```
node scripts/build_product_doc_artifacts.js --stack-version {version} --product-name {product}
```

### parameters

#### `stack-version`:

the stack version to generate the artifacts for.

#### `product-name`:

(multi-value) the list of products to generate artifacts for.

possible values:
- "kibana"
- "elasticsearch"
- "observability"
- "security"

#### `target-folder`:

The folder to generate the artifacts in.

Defaults to `{REPO_ROOT}/build-kb-artifacts`.

#### `build-folder`:

The folder to use for temporary files.

Defaults to `{REPO_ROOT}/build/temp-kb-artifacts`

#### Cluster infos

- params for the source cluster:
`sourceClusterUrl` / env.KIBANA_SOURCE_CLUSTER_URL
`sourceClusterUsername` / env.KIBANA_SOURCE_CLUSTER_USERNAME
`sourceClusterPassword` / env.KIBANA_SOURCE_CLUSTER_PASSWORD

- params for the embedding cluster:
`embeddingClusterUrl` / env.KIBANA_EMBEDDING_CLUSTER_URL
`embeddingClusterUsername` / env.KIBANA_EMBEDDING_CLUSTER_USERNAME
`embeddingClusterPassword` / env.KIBANA_EMBEDDING_CLUSTER_PASSWORD
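
The option / environment-variable pairs listed above suggest a fallback lookup. The sketch below shows one way to express that, under assumptions the README does not state: that an explicitly passed value takes precedence over the environment variable, and that a missing value is treated as an error. `resolveClusterConfig`, `ClusterConfig`, and `Env` are hypothetical names, and the `url` / `username` / `password` keys are a simplification of the `sourceClusterUrl`-style option names.

```typescript
// Hypothetical helper, not part of @kbn/product-doc-artifact-builder.
interface ClusterConfig {
  url: string;
  username: string;
  password: string;
}

// The environment is passed in explicitly so the function stays pure and testable.
type Env = Record<string, string | undefined>;

const resolveClusterConfig = (
  kind: 'source' | 'embedding',
  args: Partial<ClusterConfig>,
  env: Env
): ClusterConfig => {
  // e.g. KIBANA_SOURCE_CLUSTER_ or KIBANA_EMBEDDING_CLUSTER_
  const prefix = `KIBANA_${kind.toUpperCase()}_CLUSTER_`;

  const pick = (key: keyof ClusterConfig): string => {
    const envName = `${prefix}${key.toUpperCase()}`;
    // Assumption: an explicit argument wins over the environment variable.
    const value = args[key] ?? env[envName];
    if (!value) {
      throw new Error(`Missing ${kind} cluster ${key}: pass it explicitly or set ${envName}`);
    }
    return value;
  };

  return { url: pick('url'), username: pick('username'), password: pick('password') };
};

// Usage: the URL comes from the explicit argument, the credentials from the environment.
const sourceCluster = resolveClusterConfig(
  'source',
  { url: 'https://source.example.invalid:9200' },
  {
    KIBANA_SOURCE_CLUSTER_USERNAME: 'builder',
    KIBANA_SOURCE_CLUSTER_PASSWORD: 'changeme',
  }
);
console.log(sourceCluster.url);
```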
@@ -5,17 +5,13 @@
  * 2.0.
  */

-export interface ArtifactManifest {
-  formatVersion: string;
-  productName: string;
-  productVersion: string;
-}
+import type { ArtifactManifest, ProductName } from '@kbn/product-doc-common';

 export const getArtifactManifest = ({
   productName,
   stackVersion,
 }: {
-  productName: string;
+  productName: ProductName;
   stackVersion: string;
 }): ArtifactManifest => {
   return {
@@ -21,10 +21,7 @@ export const getArtifactMappings = (inferenceEndpoint: string): MappingTypeMappi
 slug: { type: 'keyword' },
 url: { type: 'keyword' },
 version: { type: 'version' },
-ai_subtitle: {
-  type: 'semantic_text',
-  inference_id: inferenceEndpoint,
-},
+ai_subtitle: { type: 'text' },
 ai_summary: {
   type: 'semantic_text',
   inference_id: inferenceEndpoint,
@@ -5,7 +5,34 @@
  * 2.0.
  */

-/**
- * The allowed product names, as found in the source's cluster
- */
-export const sourceProductNames = ['Kibana', 'Elasticsearch', 'Security', 'Observability'];
+import type { ProductName } from '@kbn/product-doc-common';
+
+const productNameToSourceNamesMap: Record<ProductName, string[]> = {
+  kibana: ['Kibana'],
+  elasticsearch: ['Elasticsearch'],
+  security: ['Security'],
+  observability: ['Observability'],
+};
+
+const sourceNameToProductName = Object.entries(productNameToSourceNamesMap).reduce<
+  Record<string, ProductName>
+>((map, [productName, sourceNames]) => {
+  sourceNames.forEach((sourceName) => {
+    map[sourceName] = productName as ProductName;
+  });
+  return map;
+}, {});
+
+export const getSourceNamesFromProductName = (productName: ProductName): string[] => {
+  if (!productNameToSourceNamesMap[productName]) {
+    throw new Error(`Unknown product name: ${productName}`);
+  }
+  return productNameToSourceNamesMap[productName];
+};
+
+export const getProductNameFromSource = (source: string): ProductName => {
+  if (!sourceNameToProductName[source]) {
+    throw new Error(`Unknown source name: ${source}`);
+  }
+  return sourceNameToProductName[source];
+};
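
For readers skimming the diff, here is a small standalone sketch of how the two-way lookup in this file behaves. It redeclares the map locally so that it runs on its own; the `ProductName` union is an assumption based on the four product values listed in the artifact builder README, since the real type is imported from `@kbn/product-doc-common` and is not shown in this diff.

```typescript
// Assumption: ProductName is the union of the four documented product values.
type ProductName = 'kibana' | 'elasticsearch' | 'security' | 'observability';

// Forward map: internal product name -> names used in the source cluster.
const productNameToSourceNamesMap: Record<ProductName, string[]> = {
  kibana: ['Kibana'],
  elasticsearch: ['Elasticsearch'],
  security: ['Security'],
  observability: ['Observability'],
};

// Reverse map, built once by inverting the forward map.
const sourceNameToProductName: Record<string, ProductName> = {};
for (const [productName, sourceNames] of Object.entries(productNameToSourceNamesMap)) {
  for (const sourceName of sourceNames) {
    sourceNameToProductName[sourceName] = productName as ProductName;
  }
}

const getProductNameFromSource = (source: string): ProductName => {
  const productName = sourceNameToProductName[source];
  if (!productName) {
    throw new Error(`Unknown source name: ${source}`);
  }
  return productName;
};

console.log(getProductNameFromSource('Security')); // prints "security"

// Lookups are case-sensitive: 'security' is a product name, not a source name.
try {
  getProductNameFromSource('security');
} catch (e) {
  console.log((e as Error).message); // prints "Unknown source name: security"
}
```

Because each product maps to an array of source names, the same structure would also cover a product that appears under several names in the source cluster, without changing the lookup functions.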
@@ -8,6 +8,7 @@
 import Path from 'path';
 import { Client } from '@elastic/elasticsearch';
 import { ToolingLog } from '@kbn/tooling-log';
+import type { ProductName } from '@kbn/product-doc-common';
 import {
   // checkConnectivity,
   createTargetIndex,
@@ -18,6 +19,7 @@ import {
   createArtifact,
   cleanupFolders,
   deleteIndex,
+  processDocuments,
 } from './tasks';
 import type { TaskConfig } from './types';

@@ -93,7 +95,7 @@ const buildArtifact = async ({
   sourceClient,
   log,
 }: {
-  productName: string;
+  productName: ProductName;
   stackVersion: string;
   buildFolder: string;
   targetFolder: string;
@@ -105,14 +107,16 @@ const buildArtifact = async ({

 const targetIndex = getTargetIndexName({ productName, stackVersion });

-const documents = await extractDocumentation({
+let documents = await extractDocumentation({
   client: sourceClient,
   index: 'search-docs-1',
   log,
   productName,
   stackVersion,
 });

+documents = await processDocuments({ documents, log });
+
 await createTargetIndex({
   client: embeddingClient,
   indexName: targetIndex,
@@ -6,19 +6,19 @@
  */

 import Path from 'path';
-import { REPO_ROOT } from '@kbn/repo-info';
 import yargs from 'yargs';
+import { REPO_ROOT } from '@kbn/repo-info';
+import { DocumentationProduct } from '@kbn/product-doc-common';
 import type { TaskConfig } from './types';
 import { buildArtifacts } from './build_artifacts';
-import { sourceProductNames } from './artifact/product_name';

 function options(y: yargs.Argv) {
   return y
     .option('productName', {
       describe: 'name of products to generate documentation for',
       array: true,
-      choices: sourceProductNames,
-      default: ['Kibana'],
+      choices: Object.values(DocumentationProduct),
+      default: [DocumentationProduct.kibana],
     })
     .option('stackVersion', {
       describe: 'The stack version to generate documentation for',
@@ -8,9 +8,9 @@
 import Path from 'path';
 import AdmZip from 'adm-zip';
 import type { ToolingLog } from '@kbn/tooling-log';
+import { getArtifactName, type ProductName } from '@kbn/product-doc-common';
 import { getArtifactMappings } from '../artifact/mappings';
 import { getArtifactManifest } from '../artifact/manifest';
-import { getArtifactName } from '../artifact/artifact_name';

 export const createArtifact = async ({
   productName,
@@ -21,7 +21,7 @@ export const createArtifact = async ({
 }: {
   buildFolder: string;
   targetFolder: string;
-  productName: string;
+  productName: ProductName;
   stackVersion: string;
   log: ToolingLog;
 }) => {
@@ -10,7 +10,7 @@ import Fs from 'fs/promises';
 import type { Client } from '@elastic/elasticsearch';
 import type { ToolingLog } from '@kbn/tooling-log';

-const fileSizeLimit = 250_000;
+const fileSizeLimit = 500_000;

 export const createChunkFiles = async ({
   index,
@@ -21,10 +21,7 @@ const mappings: MappingTypeMapping = {
 slug: { type: 'keyword' },
 url: { type: 'keyword' },
 version: { type: 'version' },
-ai_subtitle: {
-  type: 'semantic_text',
-  inference_id: 'kibana-elser2',
-},
+ai_subtitle: { type: 'text' },
 ai_summary: {
   type: 'semantic_text',
   inference_id: 'kibana-elser2',