[Security Solution] [Elastic AI Assistant] Data anonymization #159857
Pinging @elastic/security-solution (Team: SecuritySolution)
LGTM for the threat-hunting-investigations team! :)
const rawValueToReplacement: Record<string, string> = invert(currentReplacements);
const existingReplacement: string | undefined = rawValueToReplacement[rawValue];

return existingReplacement != null ? existingReplacement : v4();
Thank-you for the explanation here while pairing -- appreciate the context! 😅
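For readers following the thread: the snippet under review reuses an existing replacement when the same raw value has already been anonymized, and only generates a new UUID otherwise. A minimal standalone sketch of that lookup is below. It assumes `currentReplacements` maps replacement UUID to raw value (as the `invert` call suggests), and it swaps lodash `invert` and uuid `v4` for `Object.entries` and Node's `randomUUID` so it runs without dependencies. The function name `getAnonymizedValue` and its parameter shape are illustrative, not necessarily what the PR uses.

```typescript
import { randomUUID } from 'crypto';

// Hypothetical helper modeled on the reviewed snippet.
// `currentReplacements` is assumed to map replacement (UUID) -> raw value.
export const getAnonymizedValue = ({
  currentReplacements,
  rawValue,
}: {
  currentReplacements: Record<string, string>;
  rawValue: string;
}): string => {
  // Invert the map so it can be looked up by raw value (stand-in for lodash `invert`)
  const rawValueToReplacement: Record<string, string> = {};
  for (const [replacement, raw] of Object.entries(currentReplacements)) {
    rawValueToReplacement[raw] = replacement;
  }

  const existingReplacement: string | undefined = rawValueToReplacement[rawValue];

  // Reuse the existing replacement so one raw value maps to one UUID per conversation
  return existingReplacement != null ? existingReplacement : randomUUID();
};
```

Reusing the existing replacement keeps a given host or user name stable across messages, so the assistant can still correlate repeated mentions of the same anonymized value.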
${promptContextsContent}
content: `${
  isNewChat ? `${selectedSystemPrompt?.content ?? ''}\n\n` : ''
🙏
@@ -15,3 +15,10 @@ export interface Prompt {
  isDefault?: boolean;
  isNewConversationDefault?: boolean;
}

export interface SelectedPromptContext {
nit: Maybe this should live next to `PromptContext` over in `x-pack/packages/kbn-elastic-assistant/impl/assistant/prompt_context/types.ts`?
'file.Ext.original.path',
'file.name',
'file.path',
'host.ip', // not a default allow field, but anonymized by default
Good call, thank you! 👍
    onClick={onAddNoteToTimeline}
  />
</EuiToolTip>
<EuiFlexGroup alignItems="center" gutterSize="none">
💪
@@ -41,7 +59,7 @@ export const getComments = ({
  ) : (
    <EuiAvatar name="machine" size="l" color="subdued" iconType="logoSecurity" />
  ),
- timestamp: i18n.AT(message.timestamp),
+ timestamp: i18n.AT(showAnonymizedValues ? message.timestamp : transformedMessage.timestamp),
This is the timestamp of the message itself, so I don't think it needs to be anonymized?
const DEFAULT_ALLOW_KEY = `${LOCAL_STORAGE_KEY}.defaultAllow`;
const DEFAULT_ALLOW_REPLACEMENT_KEY = `${LOCAL_STORAGE_KEY}.defaultAllowReplacement`;

export const useAnonymizationStore = (): UseAnonymizationStore => {
Once the anonymization state lives in the AssistantContextProvider we won't need this anymore
baseAllow: string[];
baseAllowReplacement: string[];
defaultAllow: string[];
defaultAllowReplacement: string[];
Maybe `s/defaultAllowReplacement/defaultAllowAnonymization/g` as we discussed? 🤷‍♂️ For another day 🙂
Also maybe `base` <-> `default`? This is my fault since I named the `basePromptContexts` and `baseQuickPrompts`, sorry! 😬
@@ -202,11 +225,15 @@ export const AssistantProvider: React.FC<AssistantProviderProps> = ({
  augmentMessageCodeBlocks,
  allQuickPrompts: localStorageQuickPrompts ?? [],
  allSystemPrompts: localStorageSystemPrompts ?? [],
  baseAllow: uniq(baseAllow),
Ty! Will need similar sanitization/validation on the system/quick prompts side as well 👍
defaultAllow={defaultAllow}
defaultAllowReplacement={defaultAllowReplacement}
baseAllow={DEFAULT_ALLOW}
baseAllowReplacement={DEFAULT_ALLOW_REPLACEMENT}
We're getting beefy here but should clean up once we move these to localstorage inside the context 👍
   */
- assistantEnabled: false,
+ assistantEnabled: true,
🚀 🌕
update: the output from CI suggests this change should be deferred to https://github.com/elastic/security-team/issues/6877, so it will still be `false` in this PR
import * as i18n from './translations';

const AnonymizationIconFlexItem = styled(EuiFlexItem)`
  margin-right: ${({ theme }) => theme.eui.euiSizeS};
Note for me to update the Discover/O11y patch as euiTheme didn't work w/ them (probably where I put the integration in the tree...)
const AnonymizationSettingsModalComponent: React.FC<Props> = ({ closeModal }) => (
  <EuiModal onClose={closeModal}>
    <EuiModalHeader />
Checked out, tested locally, thoroughly pair-code reviewed -- LGTM!!! 👍 🚀 🙌
Thank you for coming in hot with the absolutely awesome anonymization and field selection feature-set here @andrew-goldstein! Above and beyond with the control and features needed for our users -- thank you! 🎉
d93fb45
to
9d65caf
Compare
The PR introduces the _Data anonymization_ feature to the _Elastic AI Assistant_:

![data-anonymization](https://github.com/elastic/kibana/assets/4459398/fa5147bb-e306-48e5-9819-2018e1ceaba3)

_Above: Data anonymization in the Elastic AI Assistant_

![toggle_show_anonymized](https://github.com/elastic/kibana/assets/4459398/7b31a939-1960-41bb-9cf1-1431d14ecc1f)

_Above: Viewing the anonymized `host.name`, `user.name`, and `user.domain` fields in a conversation_

Use this feature to:

- Control which fields are sent from a context to the assistant
- Toggle anonymization on or off for specific fields
- Set defaults for the above

### How it works

When data anonymization is enabled for a context (e.g. an alert or an event), only a subset of the fields in the alert or event will be sent by default.

Some fields will also be anonymized by default. When a field is anonymized, UUIDs are sent to the assistant in lieu of actual values. When responses are received from the assistant, the UUIDs are automatically translated back to their original values.
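The round trip described above (drop fields that are not allowed, swap anonymized values for UUIDs, then translate UUIDs in the response back) can be sketched roughly as follows. This is an illustration under assumptions, not the PR's implementation: the function names, the `Record<string, string[]>` shape for context data, and the plain string replacement on the response are all hypothetical. The `allow` and `allowReplacement` names mirror the `defaultAllow` and `defaultAllowReplacement` props seen in the review above.

```typescript
import { randomUUID } from 'crypto';

// UUID -> original value
type Replacements = Record<string, string>;

// Hypothetical: keep only allowed fields, and replace the values of
// anonymized fields with freshly generated UUIDs.
export const anonymizeContext = (
  data: Record<string, string[]>,
  allow: string[],
  allowReplacement: string[]
): { anonymized: Record<string, string[]>; replacements: Replacements } => {
  const replacements: Replacements = {};
  const anonymized: Record<string, string[]> = {};

  for (const [field, values] of Object.entries(data)) {
    if (!allow.includes(field)) continue; // fields that are not allowed are never sent

    anonymized[field] = allowReplacement.includes(field)
      ? values.map((value) => {
          const id = randomUUID();
          replacements[id] = value; // remember how to translate back
          return id;
        })
      : values;
  }

  return { anonymized, replacements };
};

// Hypothetical: translate UUIDs in the assistant's response back to original values.
export const restoreOriginalValues = (text: string, replacements: Replacements): string =>
  Object.entries(replacements).reduce(
    (acc, [id, original]) => acc.split(id).join(original),
    text
  );
```

For simplicity this sketch generates a new UUID for every value; the reviewed code above instead reuses an existing replacement when the same raw value has been seen before.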
- Elastic Security ships with a recommended set of default fields configured for anonymization
- Simply accept the defaults, or edit any message before it's sent
- Customize the defaults at any time

### See what was actually sent

The `Show anonymized` toggle reveals the anonymized data that was sent, per the animated gif below:

![toggle_show_anonymized](https://github.com/elastic/kibana/assets/4459398/7b31a939-1960-41bb-9cf1-1431d14ecc1f)

_Above: The `Show anonymized` toggle reveals the anonymized data_

### Use Bulk actions to quickly customize what's sent

![bluk-actions](https://github.com/elastic/kibana/assets/4459398/55317830-b123-4631-8bb6-bea5dc36483b)

_Above: bulk actions_

Apply the following bulk actions to customize any context sent to the assistant:

- Allow
- Deny
- Anonymize
- Unonymize

### Use Bulk actions to quickly customize defaults

![bulk-actions-default](https://github.com/elastic/kibana/assets/4459398/baa002d8-e3da-4ad7-ad2e-7ec611515bcc)

_Above: Customize defaults with bulk actions_

Apply the following bulk actions to customize defaults:

- Allow by default
- Deny by default
- Anonymize by default
- Unonymize by default

### Row actions

![row-actions](https://github.com/elastic/kibana/assets/4459398/76496c07-1acf-4f71-a00c-fbd3ee7b30cc)

_Above: The row actions overflow menu_

The following row actions are available on every row:

- Allow
- Deny
- Anonymize
- Unonymize
- Allow by default
- Deny by default
- Anonymize by default
- Unonymize by default

### Restore the "factory defaults"

The _Anonymization defaults_ setting, shown in the screenshot below, may be used to restore the Elastic-provided defaults for which fields are allowed and anonymized:

![restore-defaults](https://github.com/elastic/kibana/assets/4459398/91f6762d-72eb-4e91-b2b9-d6001cf9171f)

_Above: restoring the Elastic defaults_

See epic <elastic/security-team#6775> (internal) for additional details.
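One way to think about the bulk and row actions listed above is as updates to two independent field lists: the fields that may be sent, and the fields whose values are anonymized. The sketch below is an assumption about the semantics, not code from this PR: `applyFieldAction`, `FieldAction`, and `AnonymizationState` are hypothetical names, and the mapping of each action to a list is inferred from the action labels. Keeping the lists independent matches the `'host.ip'` review note above (not allowed by default, but anonymized by default).

```typescript
// Hypothetical model of the Allow / Deny / Anonymize / Unonymize actions
type FieldAction = 'allow' | 'deny' | 'anonymize' | 'unonymize';

interface AnonymizationState {
  allow: string[]; // fields that may be sent to the assistant
  allowReplacement: string[]; // fields whose values are replaced with UUIDs
}

// Add fields to a list without introducing duplicates
const addFields = (list: string[], fields: string[]): string[] =>
  Array.from(new Set(list.concat(fields)));

// Remove fields from a list
const removeFields = (list: string[], fields: string[]): string[] =>
  list.filter((field) => !fields.includes(field));

export const applyFieldAction = (
  state: AnonymizationState,
  action: FieldAction,
  fields: string[]
): AnonymizationState => {
  switch (action) {
    case 'allow':
      return { ...state, allow: addFields(state.allow, fields) };
    case 'deny':
      return { ...state, allow: removeFields(state.allow, fields) };
    case 'anonymize':
      return { ...state, allowReplacement: addFields(state.allowReplacement, fields) };
    case 'unonymize':
      return { ...state, allowReplacement: removeFields(state.allowReplacement, fields) };
  }
};
```

Under this model, the "by default" variants would apply the same update to the stored defaults rather than to the current context, and restoring the "factory defaults" would reset both lists to the Elastic-provided values.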
… `false` - PR feedback
bfaf9d7
to
ccf99c8
Compare
💛 Build succeeded, but was flaky
…tion (RAG) for Alerts (#172542)

## [Security Solution] [Elastic AI Assistant] Retrieval Augmented Generation (RAG) for Alerts

This PR implements _Retrieval Augmented Generation_ (RAG) for Alerts in the Security Solution. This feature enables users to ask the assistant questions about the latest and riskiest open alerts in their environment using natural language, for example:

- _How many alerts are currently open?_
- _Which alerts should I look at first?_
- _Did we have any alerts with suspicious activity on Windows machines?_

### More context

Previously, the assistant relied solely on the knowledge of the configured LLM and _singular_ alerts or events passed _by the client_ to the LLM as prompt context.

This new feature:

- Enables _multiple_ alerts to be passed by the _server_ as context to the LLM, via [LangChain tools](#167097)
- Applies the user's [anonymization](#159857) settings to those alerts
  - Only fields allowed by the user will be sent as context to the LLM
  - Users may enable or disable anonymization for specific fields (via settings)
  - Click the conversation's `Show anonymized` toggle to see the anonymized values sent to / received from the LLM:

![show_anonymized](https://github.com/elastic/kibana/assets/4459398/7db85f69-9352-4422-adbf-c97248ccb3dd)

### Settings

This feature is enabled and configured via the `Knowledge Base` > `Alerts` settings in the screenshot below:

![rag_on_alerts_setting](https://github.com/elastic/kibana/assets/4459398/9161b6d4-b7c3-4f37-bcde-f032f5a02966)

- The `Alerts` toggle enables or disables the feature
- The slider has a range of `10` - `100` alerts (default: `20`)

When the setting above is enabled, up to `n` alerts (as determined by the slider) that meet the following criteria will be returned:

- the `kibana.alert.workflow_status` must be `open`
- the alert must have been generated in the last `24 hours`
- the alert must NOT be a `kibana.alert.building_block_type` alert
- the `n` alerts are ordered by `kibana.alert.risk_score`, to prioritize the riskiest alerts

### Feature flag

To use this feature:

1) Add the `assistantRagOnAlerts` feature flag to the `xpack.securitySolution.enableExperimental` setting in `config/kibana.yml` (or `config/kibana.dev.yml` in local development environments), per the example below:

```
xpack.securitySolution.enableExperimental: ['assistantRagOnAlerts']
```

2) Enable the `Alerts` toggle in the Assistant's `Knowledge Base` settings, per the screenshot below:

![alerts_toggle](https://github.com/elastic/kibana/assets/4459398/07f241ea-af4a-43a4-bd19-0dc6337db167)

## How it works

- When the `Alerts` settings toggle is enabled, http `POST` requests to the `/internal/elastic_assistant/actions/connector/{id}/_execute` route include the following new (optional) parameters:
  - `alertsIndexPattern`, the alerts index for the current Kibana Space, e.g. `.alerts-security.alerts-default`
  - `allow`, the user's `Allowed` fields in the `Anonymization` settings, e.g. `["@timestamp", "cloud.availability_zone", "file.name", "user.name", ...]`
  - `allowReplacement`, the user's `Anonymized` fields in the `Anonymization` settings, e.g. `["cloud.availability_zone", "host.name", "user.name", ...]`
  - `replacements`, a `Record<string, string>` of replacements (generated on the server) that starts empty for a new conversation, and accumulates anonymized values until the conversation is cleared, e.g.

    ```json
    "replacements": {
      "e4f935c0-5a80-47b2-ac7f-816610790364": "Host-itk8qh4tjm",
      "cf61f946-d643-4b15-899f-6ffe3fd36097": "rpwmjvuuia",
      "7f80b092-fb1a-48a2-a634-3abc61b32157": "6astve9g6s",
      "f979c0d5-db1b-4506-b425-500821d00813": "Host-odqbow6tmc",
      // ...
    },
    ```

  - `size`, the numeric value set by the slider in the user's `Knowledge Base > Alerts` setting, e.g. `20`
- The `postActionsConnectorExecuteRoute` function in `x-pack/plugins/elastic_assistant/server/routes/post_actions_connector_execute.ts` was updated to accept the new optional parameters, and to return an updated `replacements` with every response. (Every new request that is processed on the server may add additional anonymized values to the `replacements` returned in the response.)
- The `callAgentExecutor` function in `x-pack/plugins/elastic_assistant/server/lib/langchain/execute_custom_llm_chain/index.ts` previously used a hard-coded array of LangChain tools that had just one entry, for the `ESQLKnowledgeBaseTool` tool. That hard-coded array was replaced in this PR with a call to the (new) `getApplicableTools` function:

  ```typescript
  const tools: Tool[] = getApplicableTools({
    allow,
    allowReplacement,
    alertsIndexPattern,
    assistantLangChain,
    chain,
    esClient,
    modelExists,
    onNewReplacements,
    replacements,
    request,
    size,
  });
  ```

- The `getApplicableTools` function in `x-pack/plugins/elastic_assistant/server/lib/langchain/tools/index.ts` examines the parameters in the `KibanaRequest` and only returns a filtered set of LangChain tools. If the request doesn't contain all the parameters required by a tool, it will NOT be returned by `getApplicableTools`. For example, if the required anonymization parameters are not included in the request, the `open-alerts` tool will not be returned.
- The new `alert-counts` LangChain tool returned by the `getAlertCountsTool` function in `x-pack/plugins/elastic_assistant/server/lib/langchain/tools/alert_counts/get_alert_counts_tool.ts` provides the LLM the results of an aggregation on the last `24` hours of alerts (in the current Kibana Space), grouped by `kibana.alert.severity`. See the `getAlertsCountQuery` function in `x-pack/plugins/elastic_assistant/server/lib/langchain/tools/alert_counts/get_alert_counts_query.ts` for details.
- The new `open-alerts` LangChain tool returned by the `getOpenAlertsTool` function in `x-pack/plugins/elastic_assistant/server/lib/langchain/tools/open_alerts/get_open_alerts_tool.ts` provides the LLM up to `size` non-building-block alerts generated in the last `24` hours (in the current Kibana Space) with an `open` workflow status, ordered by `kibana.alert.risk_score` to prioritize the riskiest alerts. See the `getOpenAlertsQuery` function in `x-pack/plugins/elastic_assistant/server/lib/langchain/tools/open_alerts/get_open_alerts_query.ts` for details.
- On the client, a conversation continues to accumulate additional `replacements` (and send them in subsequent requests) until the conversation is cleared
- Anonymization functions that were only invoked by the browser were moved from the (browser) `kbn-elastic-assistant` package in `x-pack/packages/kbn-elastic-assistant/` to a new common package: `x-pack/packages/kbn-elastic-assistant-common`
  - The new `kbn-elastic-assistant-common` package is also consumed by the `elastic_assistant` (server) plugin: `x-pack/plugins/elastic_assistant`
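To make the `open-alerts` criteria above concrete, here is a rough sketch of what an Elasticsearch search request matching them could look like. It is not the contents of `get_open_alerts_query.ts`: the function name `buildOpenAlertsQuery` is hypothetical, and using `@timestamp` for "generated in the last 24 hours" and an `exists` check for building-block alerts are assumptions made for illustration.

```typescript
// Hypothetical sketch of a search request for the riskiest open alerts.
// Field names come from the description above; the time field is an assumption.
export const buildOpenAlertsQuery = ({
  alertsIndexPattern,
  size,
}: {
  alertsIndexPattern: string;
  size: number;
}) => ({
  index: alertsIndexPattern,
  size, // the value of the Knowledge Base > Alerts slider, e.g. 20
  // riskiest alerts first
  sort: [{ 'kibana.alert.risk_score': { order: 'desc' } }],
  query: {
    bool: {
      filter: [
        // only alerts with an open workflow status
        { term: { 'kibana.alert.workflow_status': 'open' } },
        // only alerts from the last 24 hours (assumes @timestamp is the relevant field)
        { range: { '@timestamp': { gte: 'now-24h', lte: 'now' } } },
      ],
      // exclude building block alerts
      must_not: [{ exists: { field: 'kibana.alert.building_block_type' } }],
    },
  },
});
```

A caller would pass the index pattern for the current Kibana Space, for example `buildOpenAlertsQuery({ alertsIndexPattern: '.alerts-security.alerts-default', size: 20 })`, and hand the result to an Elasticsearch client's search call. Per the description above, the returned alerts would then be filtered to the user's `allow` fields and anonymized per `allowReplacement` before reaching the LLM.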