[Security Solution] Pass rule execution statuses and metrics to Alerting Framework #130966

Closed
4 tasks done
banderror opened this issue Apr 26, 2022 · 7 comments
Labels
8.7 candidate Feature:Rule Monitoring Security Solution Detection Rule Monitoring area Team:Detection Rule Management Security Detection Rule Management Team Team:Detections and Resp Security Detection Response Team Team: SecuritySolution Security Solutions Team working on SIEM, Endpoint, Timeline, Resolver, etc. v8.7.0

Comments

@banderror
Contributor

banderror commented Apr 26, 2022

Depends on: #135127, #112193

Summary

We're consolidating rule execution statuses and metrics between Security Solution and Alerting Framework (see #112193 and internal RFC). When this is done and we have an API for passing statuses and metrics from our rule executors to the Framework, we will integrate with it and stop using our sidecar saved objects for storing this data.

  • Pass execution statuses and metrics from rule executors to the Framework.
  • Fetch execution statuses and metrics from rules themselves instead of the sidecar siem-detection-engine-rule-execution-info saved objects.
  • Remove the siem-detection-engine-rule-execution-info saved objects type from the codebase. Mark it as deleted in Kibana Core.
  • Make sure to keep backward compatibility in the Detection API endpoints and rule execution events we write into the Event Log.
@banderror banderror added Team:Detections and Resp Security Detection Response Team Team: SecuritySolution Security Solutions Team working on SIEM, Endpoint, Timeline, Resolver, etc. Feature:Rule Monitoring Security Solution Detection Rule Monitoring area Team:Detection Rule Management Security Detection Rule Management Team labels Apr 26, 2022
@elasticmachine
Contributor

Pinging @elastic/security-solution (Team: SecuritySolution)

@elasticmachine
Contributor

Pinging @elastic/security-detections-response (Team:Detections and Resp)

@spong
Member

spong commented Jun 28, 2022

Postponing the implementation of the alerts created/detected metrics until we have an API for returning execution metrics at the end of rule execution. See this earlier PR for an example of adding these metrics: #126210

@jpdjere
Contributor

jpdjere commented Dec 27, 2022

@peluja1012 @jethr0null and CC @banderror

As part of the effort to get rid of the sidecar saved object and move the rule’s execution results to the rule object itself, we had to slightly alter the way in which the execution results are stored.

Previously, we wrote to the saved object several times during a rule execution: once when the rule started running, again for any warnings that were reported, and once more for any final errors when execution ended.

This allowed us to log intermediate execution statuses which we would temporarily show in the Rules table and the Rule Details page - until the rule would write its final execution status. Examples of such intermediate statuses are:

  • Warning: no concrete indices matching an index pattern were found
  • Warning: the rule doesn't have read access to certain indices
  • Warning: certain indices don't have a timestamp field

Now, all status updates are held in memory during execution, and once execution is over we write the results to the rule object only once. This means we can't show the intermediate statuses in the UI.

To be super-clear, in most cases the warnings mentioned above will be final execution statuses. The only way they could be intermediate is when some error happens after them, which should happen less often.
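To make the trade-off concrete, here is a minimal sketch of the "buffer in memory, write once" approach. Every name in it (`InMemoryStatusBuffer`, `RuleRecord`, `flush`) is hypothetical and chosen for illustration; it is not the Kibana implementation, and it assumes that the last recorded status is the one that gets persisted.

```typescript
// Hypothetical names throughout; a sketch, not the actual Kibana code.
type ExecutionStatus = 'running' | 'succeeded' | 'warning' | 'failed';

interface StatusChange {
  status: ExecutionStatus;
  message?: string;
}

// Stand-in for the rule object that stores its own last execution result.
interface RuleRecord {
  lastExecution?: StatusChange;
  writeCount: number;
}

class InMemoryStatusBuffer {
  private readonly changes: StatusChange[] = [];

  // Can be called from anywhere in the executor; nothing is persisted here.
  record(change: StatusChange): void {
    this.changes.push(change);
  }

  // Called once when execution ends. Assumption: the last recorded change wins.
  flush(rule: RuleRecord): void {
    const last = this.changes[this.changes.length - 1];
    if (last === undefined) return;
    rule.lastExecution = last;
    rule.writeCount += 1;
  }
}

const rule: RuleRecord = { writeCount: 0 };
const buffer = new InMemoryStatusBuffer();
buffer.record({ status: 'running' });
buffer.record({ status: 'warning', message: 'no matching indices found' });
buffer.record({ status: 'failed', message: 'exception list not found' });
buffer.flush(rule);
```

After `flush`, the rule object carries only the final `failed` status and has been written exactly once; the intermediate `warning` never reaches storage, which is why the UI can no longer show it.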

@jethr0null Could you help us understand if that represents a problem from the product perspective? Options we see:

  1. If not a problem:
    1.1. We will keep the existing implementation on our side and in the Alerting Framework which has been aligned with Xavier and Mike.
  2. If a problem:
    2.1. Hacky workaround: we could somehow concatenate the intermediate and the final execution statuses and show the concatenated message in the UI. This would require changes in the Alerting Framework. No guarantees that it would be accepted by the RAM team.
    2.2. Show intermediate warnings in the Execution events table which is now hidden behind a feature switch (PR). Remove the feature switch and polish the table, if needed.

Below is an example of what the hacky workaround (2.1) could look like. It shows the Rule Details page for a rule that executed and generated an intermediate warning, and then failed with an error (all during the same rule execution). In order to be able to display both the warning and the error message we had to concatenate them and then format them in the UI to display as separate warnings.

image

And below is the same workaround, with the same warning and error displayed in the tooltip of the Last Response column in the Rules table:

image

Notice the WARNING: and ERROR: prefixes that we added to each message to clearly distinguish them as separate events that occurred during execution.
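For illustration, the concatenate-then-split idea behind workaround 2.1 could look roughly like the sketch below. The `WARNING:` and `ERROR:` prefixes come from the screenshots described above; the newline separator, the function names, and the fallback for unprefixed lines are assumptions made for this example.

```typescript
// Hypothetical helper names; the newline separator is an assumption.
type Severity = 'WARNING' | 'ERROR';

interface ExecutionMessage {
  severity: Severity;
  text: string;
}

const SEPARATOR = '\n';

// Server side: fold all messages of one execution into a single string.
function concatMessages(messages: ExecutionMessage[]): string {
  return messages.map((m) => `${m.severity}: ${m.text}`).join(SEPARATOR);
}

// UI side: split the stored string back into separate messages.
function splitMessages(combined: string): ExecutionMessage[] {
  return combined
    .split(SEPARATOR)
    .filter((line) => line.length > 0)
    .map((line): ExecutionMessage => {
      const match = /^(WARNING|ERROR): (.*)$/.exec(line);
      // Assumption: a line without a known prefix is shown as an error.
      return match
        ? { severity: match[1] as Severity, text: match[2] ?? '' }
        : { severity: 'ERROR', text: line };
    });
}

const stored = concatMessages([
  { severity: 'WARNING', text: 'certain indices do not have a timestamp field' },
  { severity: 'ERROR', text: 'exception list not found' },
]);
const displayed = splitMessages(stored);
```

One limitation of this sketch: a message that itself contains a newline would be split incorrectly. That kind of fragility is part of why the approach is called hacky.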

Let us know which option you'd prefer or if you have any other ideas.

@ARWNightingale ARWNightingale pinned this issue Jan 3, 2023
@ARWNightingale ARWNightingale unpinned this issue Jan 3, 2023
@peluja1012
Contributor

> To be super-clear, in most cases the warnings mentioned above will be final execution statuses. The only way they could be intermediate is when some error happens after them, which should happen less often.

I'm ok with option 1.1 given the above statement. To be honest, I thought we only wrote the final status in our current implementation.

@banderror
Contributor Author

> To be honest, I thought we only wrote the final status in our current implementation.

@peluja1012 No, that's not the case yet. Our rule executors (mostly the "security wrapper" function) need to be adjusted to commit a status update only once, at the end of the execution. Currently they do it from multiple places in the code, which sometimes leads to logging intermediate statuses. Besides the examples above, a "gap detected" failure can be an intermediate status under some conditions, so the status updates during a single execution may look like this:

running -> failed (gap detected) -> warning (no timestamp field) -> failed (exception list not found)

Instead of logging intermediate warnings and errors as status updates, we should log them as simple message events in the Event Log.
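A rough sketch of that separation, using the sequence above as input, might look like this. The `ExecutionLogger` class and the event shapes are hypothetical; the only ideas taken from this thread are that intermediate problems become message events and that a status is committed once per execution.

```typescript
// Hypothetical event shapes and class; not the actual Event Log schema.
type FinalStatus = 'succeeded' | 'warning' | 'failed';

type LogEvent =
  | { kind: 'message'; level: 'warning' | 'error'; text: string }
  | { kind: 'status-change'; status: FinalStatus; text: string };

class ExecutionLogger {
  readonly events: LogEvent[] = [];
  private committed = false;

  // Intermediate problems are plain message events, not status updates.
  warn(text: string): void {
    this.events.push({ kind: 'message', level: 'warning', text });
  }

  error(text: string): void {
    this.events.push({ kind: 'message', level: 'error', text });
  }

  // The status is committed exactly once, at the end of the execution.
  commitStatus(status: FinalStatus, text: string): void {
    if (this.committed) {
      throw new Error('status already committed for this execution');
    }
    this.committed = true;
    this.events.push({ kind: 'status-change', status, text });
  }
}

// The sequence from the comment above, replayed with this split.
const logger = new ExecutionLogger();
logger.error('gap detected');
logger.warn('no timestamp field');
logger.commitStatus('failed', 'exception list not found');

const statusChanges = logger.events.filter((e) => e.kind === 'status-change');
```

With this split, the execution produces two message events and a single status change, so nothing is lost from the log even though only the final status is stored as the rule's status.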

maximpn pushed a commit that referenced this issue Jan 27, 2023
…ead of saved object (#147035)

**Addresses:** #130966
**Based on:** #135127

## Summary

This PR deprecates the Sidecar SO of type `siem-detection-engine-rule-execution-info` in favour of storing Rule Execution Logging data within the rule itself, making use of the work previously done in the Alerting Framework:
- #140882
- #147278

Work done:
- **Pass execution statuses and metrics from rule executors to the Framework:** through the use of `RuleMonitoringService` and `RuleResultService` from within the rule execution log client for executor. `x-pack/plugins/security_solution/server/lib/detection_engine/rule_monitoring/logic/rule_execution_log/client_for_executors/client.ts`
- **Fetch execution statuses and metrics from rules themselves instead of the sidecar `siem-detection-engine-rule-execution-info` saved objects**: through the use of the new function `createRuleExecutionSummary` in `x-pack/plugins/security_solution/server/lib/detection_engine/rule_monitoring/logic/rule_execution_log/create_rule_execution_summary.ts`, which extracts last execution information from the rule itself.
- **Remove the siem-detection-engine-rule-execution-info saved objects type from the codebase. Mark it as deleted in Kibana Core:** added `siem-detection-engine-rule-execution-info` to `packages/core/saved-objects/core-saved-objects-migration-server-internal/src/core/unused_types.ts`; and got rid of the related Saved Object client.
- **Make sure to keep backward compatibility in the Detection API endpoints and rule execution events we write into the Event Log**: API compatibility is maintained. No breaking changes.


### Checklist

- [ ] [Documentation](https://www.elastic.co/guide/en/kibana/master/development-documentation.html) was added for features that require explanation or tutorials
- [x] [Unit or functional tests](https://www.elastic.co/guide/en/kibana/master/development-tests.html) were updated or added to match the most common scenarios
@banderror
Contributor Author

Addressed in #147035

kqualters-elastic pushed a commit to kqualters-elastic/kibana that referenced this issue Feb 6, 2023
…ead of saved object (elastic#147035)
