[Security Solution] Pass rule execution statuses and metrics to Alerting Framework #130966

banderror · 2022-04-26T11:10:15Z

Summary

We're consolidating rule execution statuses and metrics between Security Solution and Alerting Framework (see #112193 and internal RFC). When this is done and we have an API for passing statuses and metrics from our rule executors to the Framework, we will integrate with it and stop using our sidecar saved objects for storing this data.

Pass execution statuses and metrics from rule executors to the Framework.
Fetch execution statuses and metrics from rules themselves instead of the sidecar siem-detection-engine-rule-execution-info saved objects.
Remove the siem-detection-engine-rule-execution-info saved objects type from the codebase. Mark it as deleted in Kibana Core.
Make sure to keep backward compatibility in the Detection API endpoints and rule execution events we write into the Event Log.

elasticmachine · 2022-04-26T11:10:17Z

Pinging @elastic/security-solution (Team: SecuritySolution)

elasticmachine · 2022-04-26T11:10:17Z

Pinging @elastic/security-detections-response (Team:Detections and Resp)

spong · 2022-06-28T01:54:43Z

Postponing the implementation of the alerts created/detected metrics until we have an API for returning execution metrics at the end of rule execution. See this earlier PR for an example of adding these metrics: #126210

jpdjere · 2022-12-27T17:57:17Z

@peluja1012 @jethr0null and CC @banderror

As part of the effort to get rid of the sidecar saved object and move the rule’s execution results to the rule object itself, we had to slightly alter the way in which the execution results are stored.

Before, during rule execution we wrote multiple times to the saved object: for example, once when the rule started running, another time for any warnings that would be reported and finally once more for final errors when execution ended.

This allowed us to log intermediate execution statuses which we would temporarily show in the Rules table and the Rule Details page - until the rule would write its final execution status. Examples of such intermediate statuses are:

Warning: no concrete indices matching an index pattern were found
Warning: the rule doesn't have read access to certain indices
Warning: certain indices don't have a timestamp field

Now, all status updates are kept in memory during execution, and once execution is over we write the results to the rule object only once. This means we can't show the intermediate statuses in the UI.

To be super-clear, in most cases the warnings mentioned above will be final execution statuses. The only way they could be intermediate is when some error happens after them, which should happen less often.
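The single-write behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the actual implementation: `StatusBuffer`, `RuleLike` and `commitOnce` are hypothetical names, and the sketch assumes that the last recorded update is the one that gets written to the rule.

```typescript
// All names here are hypothetical; none of them are Kibana or Alerting Framework APIs.
type ExecutionStatus = 'running' | 'succeeded' | 'warning' | 'failed';

interface StatusUpdate {
  status: ExecutionStatus;
  message: string;
}

// Stand-in for the rule object that receives the final result.
interface RuleLike {
  lastStatus?: StatusUpdate;
  writeCount: number;
}

// Collects status updates in memory while the rule executes.
class StatusBuffer {
  private readonly updates: StatusUpdate[] = [];

  record(status: ExecutionStatus, message: string = ''): void {
    this.updates.push({ status, message });
  }

  // Assumption: the most recent update is the final execution status.
  finalUpdate(): StatusUpdate | undefined {
    return this.updates[this.updates.length - 1];
  }

  // Everything before the final update never reaches the rule object.
  intermediateUpdates(): StatusUpdate[] {
    return this.updates.slice(0, -1);
  }
}

// Writes to the rule exactly once, at the end of execution.
function commitOnce(target: RuleLike, source: StatusBuffer): void {
  const final = source.finalUpdate();
  if (final !== undefined) {
    target.lastStatus = final;
    target.writeCount += 1;
  }
}

const rule: RuleLike = { writeCount: 0 };
const buffer = new StatusBuffer();
buffer.record('running');
buffer.record('warning', "certain indices don't have a timestamp field");
buffer.record('failed', 'exception list not found');
commitOnce(rule, buffer);
```

In this sketch the timestamp warning ends up in `intermediateUpdates()` and is not visible on the rule, which is the trade-off being discussed.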

@jethr0null Could you help us understand if that represents a problem from the product perspective? Options we see:

1. If not a problem:
1.1. We will keep the existing implementation on our side and in the Alerting Framework which has been aligned with Xavier and Mike.
2. If a problem:
2.1. Hacky workaround: we could somehow concatenate the intermediate and the final execution statuses and show the concatenated message in the UI. This would require changes in the Alerting Framework. No guarantees that it would be accepted by the RAM team.
2.2. Show intermediate warnings in the Execution events table which is now hidden behind a feature switch (PR). Remove the feature switch and polish the table, if needed.

Below is an example of what the hacky workaround (2.1) could look like. It shows the Rule Details page for a rule that executed and generated an intermediate warning, and then failed with an error (all during the same rule execution). In order to be able to display both the warning and the error message we had to concatenate them and then format them in the UI to display as separate warnings.

And below is the same workaround, with the same warning and error displayed in the tooltip of the Last Response column in the Rules table:

Notice the WARNING: and ERROR: prefixes that we added to each message to clearly distinguish them as separate events that occurred during execution.
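For illustration, the concatenate-then-split idea behind workaround 2.1 might look something like the sketch below. The function names and the newline separator are assumptions made for this example; only the `WARNING:` and `ERROR:` prefixes come from the description above.

```typescript
// Hypothetical helpers; the newline separator is an assumption of this sketch.
type Level = 'WARNING' | 'ERROR';

interface ExecutionMessage {
  level: Level;
  text: string;
}

// Server side: join all messages of one execution into a single string.
function concatenateMessages(messages: ExecutionMessage[]): string {
  return messages.map((m) => `${m.level}: ${m.text}`).join('\n');
}

// UI side: split the single string back into separate messages.
function splitMessages(concatenated: string): ExecutionMessage[] {
  const result: ExecutionMessage[] = [];
  for (const line of concatenated.split('\n')) {
    const match = /^(WARNING|ERROR): (.*)$/.exec(line);
    const last = result[result.length - 1];
    if (match) {
      result.push({ level: match[1] as Level, text: match[2] ?? '' });
    } else if (last !== undefined) {
      // An unprefixed line is treated as a continuation of the previous message.
      last.text += '\n' + line;
    } else if (line.length > 0) {
      // Unprefixed text with nothing before it: shown as an error in this sketch.
      result.push({ level: 'ERROR', text: line });
    }
  }
  return result;
}

const combined = concatenateMessages([
  { level: 'WARNING', text: "certain indices don't have a timestamp field" },
  { level: 'ERROR', text: 'exception list not found' },
]);
const separated = splitMessages(combined);
```

Parsing a free-text message like this is fragile (a message line that itself starts with `ERROR: ` would be split incorrectly), which is part of why the workaround is described as hacky.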

Let us know which option you'd prefer or if you have any other ideas.

peluja1012 · 2023-01-04T03:33:23Z

> To be super-clear, in most cases the warnings mentioned above will be final execution statuses. The only way they could be intermediate is when some error happens after them, which should happen less often.

I'm ok with option 1.1 given the above statement. To be honest, I thought we only wrote the final status in our current implementation.

banderror · 2023-01-04T10:57:54Z

> To be honest, I thought we only wrote the final status in our current implementation.

@peluja1012 No, that's still not the case. Our rule executors (mostly the "security wrapper" function) need to be adjusted to commit a status update only once, at the end of the execution. Currently they do it from multiple places in the code, which sometimes leads to logging intermediate statuses. Besides the example above, a "gap detected" failure can be an intermediate status under some conditions, so the status updates during a single execution may look like this:

running -> failed (gap detected) -> warning (no timestamp field) -> failed (exception list not found)

Instead of logging intermediate warnings and errors as status updates, we should log them as simple message events in the Event Log.
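A rough sketch of that proposal, replaying the sequence above: warnings and errors are appended as plain message events, and a single status change is produced when the execution closes. `ExecutionLog` and its methods are hypothetical names, not the real rule execution log client, and deriving the final status from the most severe message seen is an assumption of this sketch rather than something stated in this thread.

```typescript
// Hypothetical sketch; not the actual rule execution log client.
type FinalStatus = 'succeeded' | 'warning' | 'failed';

interface LogMessageEvent {
  level: 'warning' | 'error';
  message: string;
}

class ExecutionLog {
  readonly messageEvents: LogMessageEvent[] = [];
  readonly statusChanges: FinalStatus[] = [];

  warn(message: string): void {
    this.messageEvents.push({ level: 'warning', message });
  }

  error(message: string): void {
    this.messageEvents.push({ level: 'error', message });
  }

  // Called once, at the end of the execution.
  // Assumption: the final status is the most severe level that was logged.
  close(): FinalStatus {
    let status: FinalStatus = 'succeeded';
    for (const event of this.messageEvents) {
      if (event.level === 'error') {
        status = 'failed';
        break;
      }
      status = 'warning';
    }
    this.statusChanges.push(status);
    return status;
  }
}

const log = new ExecutionLog();
log.error('gap detected');
log.warn('no timestamp field');
log.error('exception list not found');
const finalStatus = log.close();
```

Here the three problems are kept as separate message events, while only one status change is recorded for the whole execution.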

…ead of saved object (#147035)

**Addresses:** #130966
**Based on:** #135127

## Summary

This PR deprecates the Sidecar SO of type `siem-detection-engine-rule-execution-info` in favour of storing Rule Execution Logging data within the rule itself, making use of the work previously done in the Alerting Framework:

- #140882
- #147278

Work done:

- **Pass execution statuses and metrics from rule executors to the Framework:** through the use of `RuleMonitoringService` and `RuleResultService` from within the rule execution log client for executor. `x-pack/plugins/security_solution/server/lib/detection_engine/rule_monitoring/logic/rule_execution_log/client_for_executors/client.ts`
- **Fetch execution statuses and metrics from rules themselves instead of the sidecar `siem-detection-engine-rule-execution-info` saved objects**: through the use of the new function `createRuleExecutionSummary` in `x-pack/plugins/security_solution/server/lib/detection_engine/rule_monitoring/logic/rule_execution_log/create_rule_execution_summary.ts`, which extracts last execution information from the rule itself.
- **Remove the siem-detection-engine-rule-execution-info saved objects type from the codebase. Mark it as deleted in Kibana Core:** added `siem-detection-engine-rule-execution-info` to `packages/core/saved-objects/core-saved-objects-migration-server-internal/src/core/unused_types.ts`; and got rid of the related Saved Object client.
- **Make sure to keep backward compatibility in the Detection API endpoints and rule execution events we write into the Event Log**: API compatibility is maintained. No breaking changes.

### Checklist

- [ ] [Documentation](https://www.elastic.co/guide/en/kibana/master/development-documentation.html) was added for features that require explanation or tutorials
- [x] [Unit or functional tests](https://www.elastic.co/guide/en/kibana/master/development-tests.html) were updated or added to match the most common scenarios

banderror · 2023-01-27T16:58:20Z

Addressed in #147035

banderror added the 8.3 candidate label Apr 27, 2022

banderror added 8.5 candidate and removed 8.3 candidate labels Jun 10, 2022

banderror mentioned this issue Jun 27, 2022

[Security Solution][Detections] Adds 'Alerts Detected' and 'Alerts Created' metrics to Rule Monitoring table #126210

Closed

4 tasks

banderror mentioned this issue Jul 25, 2022

[Security Solution][Detections] Extended rule execution logging to Event Log #126063

Merged

11 tasks

This was referenced Aug 16, 2022

[Security Solution] Add a filter by rule execution status to the Rules table #138903

Closed

[Security Solution] Get rid of "Advanced sorting" switch for the Rules table #138907

Closed

banderror added 8.6 candidate and removed 8.5 candidate labels Oct 5, 2022

banderror changed the title ~~[Security Solution][Detections] Pass rule execution statuses and metrics to Alerting Framework~~ [Security Solution] Pass rule execution statuses and metrics to Alerting Framework Nov 24, 2022

banderror added 8.7 candidate and removed 8.6 candidate labels Nov 24, 2022

banderror mentioned this issue Nov 24, 2022

[Security Solution] Consolidating Rule Management with Alerting Framework #133560

Open

banderror mentioned this issue Dec 19, 2022

[Security Solution] Write and read Rule Execution Logs from rule instead of saved object #147035

Merged

2 tasks

banderror assigned jpdjere Dec 19, 2022

banderror assigned maximpn Dec 29, 2022

ARWNightingale pinned this issue Jan 3, 2023

ARWNightingale unpinned this issue Jan 3, 2023

maximpn mentioned this issue Jan 12, 2023

[ResponseOps] Add intermediate status info to outcomeMsg of Rule lastRun #148142

Closed

1 task

spong mentioned this issue Jan 19, 2023

[Security Solution][Alerts] Alert suppression time window #148868

Merged

banderror closed this as completed Jan 27, 2023

banderror mentioned this issue Jan 27, 2023

[Security Solution] Improve RuleExecutionLog performance #118511

Closed

4 tasks

banderror added the v8.7.0 label Jan 27, 2023

banderror mentioned this issue Jan 27, 2023

[Alerting] Storing custom searchable rule execution data inside the rule #112193

Closed
