[SIEM][Detection Engine] Meta issue for alerting needs #50222
Comments
Pinging @elastic/siem (Team:SIEM) |
Some notes about the tags we're using on SIEM: In this PR, I am utilizing the

One thing I want to point out in that PR is that I filter out any internal tags before returning them to the user, so they only see the tags they added: `tags.filter(tag => !tag.startsWith(INTERNAL_IDENTIFIER));`

I could almost benefit from two different sets of tags on alerts: one for the user to enter data against (the UI), and a second for internal structures only. For now, though, using an internal identifier and filtering seems to work out well. The only caveat is that searches including my identifier will come back as positive hits, but that should be rare, as I begin the internal tags with

The other thing of note in that PR is that there doesn't seem to be any aggregate functionality on attributes of saved objects, so I have to do a slow lookup of all unique tags for the UI by paging through them all and building the unique set. The UI is going to use this to present the user with the set of all tags they entered, without duplicates, and allow them to select one and then filter on only that one. |
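For illustration, the filtering and the paged unique-tag lookup described in this comment could be sketched roughly as below. This is a minimal sketch, not the code from the PR: the `INTERNAL_IDENTIFIER` value, the `AlertLike` and `FindPage` shapes, and the synchronous `find` function are all assumptions made so the example is self-contained (the real find call would be asynchronous).

```typescript
// Hypothetical prefix: the real INTERNAL_IDENTIFIER value is not shown in the comment.
const INTERNAL_IDENTIFIER = "__internal";

// Assumed minimal shapes, standing in for the real alert objects and find results.
interface AlertLike {
  tags: string[];
}
interface FindPage {
  total: number;
  data: AlertLike[];
}
type FindFn = (page: number, perPage: number) => FindPage;

// Hide internal tags so users only see the tags they added themselves.
function userVisibleTags(tags: string[]): string[] {
  return tags.filter((tag) => !tag.startsWith(INTERNAL_IDENTIFIER));
}

// Page through every alert and build the de-duplicated set of user tags.
// This is the slow "table scan" that an aggregation API would replace.
function collectUniqueUserTags(find: FindFn, perPage = 100): string[] {
  const unique = new Set<string>();
  let page = 1;
  while (true) {
    const result = find(page, perPage);
    for (const alert of result.data) {
      for (const tag of userVisibleTags(alert.tags)) {
        unique.add(tag);
      }
    }
    if (result.data.length === 0 || page * perPage >= result.total) {
      break;
    }
    page += 1;
  }
  return Array.from(unique).sort();
}

// In-memory stand-in for a paged find API (an assumption, for demonstration only).
function makeInMemoryTagFind(alerts: AlertLike[]): FindFn {
  return (page, perPage) => ({
    total: alerts.length,
    data: alerts.slice((page - 1) * perPage, page * perPage),
  });
}
```

The cost grows with the total number of alerts rather than the number of distinct tags, which is why an aggregation over saved object attributes would be preferable.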
Changed the ordering so that deleting API keys is the highest priority, as it is a blocker for us. We just need some way to delete those API keys, since they are generated on enable/disable. |
New issue and re-arrangement today. I added this one to the list above: Pass down space id and other parameters Use case: Technical solution: Current workaround: |
Adding #62532 as a reference, as this issue highlights the use case for the ability to refresh API tokens before the executor runs to ensure the most recent roles/permissions are available. |
"Reviewed by Frank Hassanabad on 7/29/2020, still valid as of this date" Only notes is that some of the tech debt or workarounds might make it difficult to implement just this one part which is:
So I will move that more towards the bottom as far as requirements go. |
@FrankHassanabad the new Elasticsearch client is available for alert types to use. It can be accessed under |
I gave these a pass to identify what can be unblocked here. I have a couple of notes worth considering:
Regarding this requirement: this API is available on Task Manager, so we actually have a workaround for this. It tells TM to run the alert now, as if it had been scheduled to run now.
We delivered the ability to use specific IDs in 7.12, but the requirement is that these IDs be uuids. The Security team don't want ESOs to have hard coded IDs, as that makes it easier to identify the encrypted document and could be considered a security hole. You could in theory, obviously, use hard coded UUIDs, but that defeats the purpose of the security baked into ESOs. |
Pinging @elastic/security-detections-response (Team:Detections and Resp) |
lol...bye bye ticket! :-) |
This is a meta ticket for ad-hoc requirements and feature requests from the detection engine underneath SIEM back to the alerting/actions plugins.
Below are the top asks from our side, along with use cases, possible solutions, and workarounds. They are ordered by our best guess at the priority in which we would like to have them.
Update 3/30/21, Frank Hassanabad: Updated to move items that have been fixed lower in priority and push up the higher-priority items. For any questions about this ticket, please consult @spong or @peluja1012, as they are more in tune with the recent needs of the detection engine.
Make rules sortable, filterable, and aggregatable
Issue:
#50213
Use Case:
As a rule user, I need to sort, filter, and sometimes aggregate on rule types. For example, I need to sort my rule types on severity, or I need to filter them by severity.
Technical solution:
The current alerting/actions framework does not allow mapping down to the level of alerting/actions parameters, so we cannot use the saved objects API with KQL mixed with "order by". Allowing us to map the alerting/actions parameters would solve this. Alternatively, a plain API (even a slower one, like a table scan) that abstracts this away and lets us work natively against the actions/alerting objects would mean we don't have to write our own hand-rolled solutions.
Current workaround:
Nothing written, but we should be able to use the KQL order by and tags, once that is checked in, to get part of the way there. However, for alerting/actions params storage we cannot easily sort or filter without a table-scan technique, where we do a "find" on a page of results and go through each and every one.
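The table-scan technique mentioned in this workaround might look roughly like the sketch below. It is only an illustration under stated assumptions: the `RuleLike` shape, the severity values and their ranking, and the synchronous paged `find` function are hypothetical stand-ins, not the alerting client's actual API.

```typescript
// Assumed severity values and rule shape; the real params schema may differ.
type Severity = "low" | "medium" | "high" | "critical";
interface RuleLike {
  id: string;
  params: { severity: Severity };
}
interface RulePage {
  total: number;
  data: RuleLike[];
}
type FindRules = (page: number, perPage: number) => RulePage;

const SEVERITY_RANK: Record<Severity, number> = {
  low: 0,
  medium: 1,
  high: 2,
  critical: 3,
};

// "Find" one page at a time and accumulate every rule (the table scan).
function scanAllRules(find: FindRules, perPage = 100): RuleLike[] {
  const all: RuleLike[] = [];
  for (let page = 1; ; page++) {
    const { total, data } = find(page, perPage);
    all.push(...data);
    if (data.length === 0 || all.length >= total) {
      break;
    }
  }
  return all;
}

// Filtering and sorting happen in memory because params are not mapped.
function filterBySeverity(rules: RuleLike[], severity: Severity): RuleLike[] {
  return rules.filter((rule) => rule.params.severity === severity);
}

function sortBySeverityDesc(rules: RuleLike[]): RuleLike[] {
  return rules
    .slice()
    .sort((a, b) => SEVERITY_RANK[b.params.severity] - SEVERITY_RANK[a.params.severity]);
}

// In-memory stand-in for a paged find API (an assumption, for demonstration only).
function makeInMemoryRuleFind(rules: RuleLike[]): FindRules {
  return (page, perPage) => ({
    total: rules.length,
    data: rules.slice((page - 1) * perPage, page * perPage),
  });
}
```

Every sort or filter request has to read all rules first, so the cost scales with the total rule count. That is the overhead a mapped, queryable params field would avoid.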
Make alerts capable of being run ad-hoc/immediately
Issue:
#50215
Use case:
As a rule user, I need to be able to run a rule immediately from time to time. A rule could have failed multiple times during a day due to timeout or networking issues, or it could have had a bug or mistake in it, in which case I need to modify and then re-run it. As a rule user, I sometimes need to create a rule that runs every 5 minutes against data from 5 minutes ago, but I need to run that rule immediately.
Technical solution:
API from the alerting team which provides the capability.
Current workaround:
Nothing written, but we can ad-hoc create a hidden duplicate alert temporarily which runs and then deletes itself at the end of the run. This would be an additional parameter to the rule and incurs technical debt to be removed later.
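As a rough illustration of that workaround, the hidden-duplicate approach could be sketched as follows. Everything here is hypothetical: the `RuleStore` class, the `hidden` flag, and the synchronous executor are in-memory stand-ins chosen to keep the example runnable, not part of the alerting framework.

```typescript
// Assumed minimal rule shape; `hidden` is the extra parameter the workaround would add.
interface AdHocRule {
  id: string;
  name: string;
  hidden: boolean;
}
type Executor = (rule: AdHocRule) => void;

// In-memory stand-in for rule storage (hypothetical).
class RuleStore {
  private rules: Record<string, AdHocRule> = {};
  private nextId = 1;

  create(name: string, hidden: boolean): AdHocRule {
    const rule: AdHocRule = { id: `rule-${this.nextId++}`, name, hidden };
    this.rules[rule.id] = rule;
    return rule;
  }

  get(id: string): AdHocRule | undefined {
    return this.rules[id];
  }

  delete(id: string): void {
    delete this.rules[id];
  }

  // Only non-hidden rules would be shown to the user.
  visible(): AdHocRule[] {
    return Object.keys(this.rules)
      .map((key) => this.rules[key])
      .filter((rule) => !rule.hidden);
  }

  count(): number {
    return Object.keys(this.rules).length;
  }
}

// Create a hidden duplicate, run it once, and always delete it afterwards.
function runImmediately(store: RuleStore, ruleId: string, execute: Executor): void {
  const original = store.get(ruleId);
  if (!original) {
    throw new Error(`Unknown rule: ${ruleId}`);
  }
  const copy = store.create(original.name, true);
  try {
    execute(copy);
  } finally {
    store.delete(copy.id); // clean up even if the run throws
  }
}
```

The `finally` block matters in this sketch: if a failed run left the hidden copy behind, it would become exactly the kind of orphaned data this workaround risks.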
Bulk create, read, update, delete for alert client
Issue:
#53144
Use Case:
As a rule user, I need to perform bulk actions such as enabling a lot of rules at once, or disabling or deleting them.
Technical solution:
Add a bulk action capability to the API
Current workaround:
Call them one at a time in a `forEach` loop.
Post our own rule id
Issue:
#50210
Use case:
As a rule user, I want to package and identify my own rules without having duplicates existing in the system. As a rule user, I want to be able to share these rules with multiple Kibana systems under different divisions and different companies as well as update them among the different companies/divisions. As a rule user, I will want to release updates to these rules across companies. As a rule user, I want to be able to export and import rules without having duplicates showing up within the system. These imports and exports can be across divisions and companies as well.
Technical solution:
We need the ability to POST our own `_id` fields for our `rule_id`. This is important for packaging rules up and distributing them and not having any duplicate rule_id's. Currently we are using rule_id as a parameter to the rules, which is a slow mechanism. We can use tags to speed this up, but errors or bugs can and will lead to duplicate rule_id's.
Current workaround:
We post our own rule parameter in the alerting parameters and we can sometimes get duplicates showing up due to socket timeouts or uncaught promises/unhandled promise rejections as well as bugs. We will have to write code to interpret and find duplicates and remove those rule duplicates or allow the user to remove them when they inevitably happen.
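The duplicate-detection code this workaround calls for could be sketched along these lines. The `StoredRule` shape is an assumption, and keeping the most recently updated copy is only one possible policy chosen for illustration; the ticket does not say which duplicate should survive.

```typescript
// Assumed shape: `id` is the saved object id, `params.ruleId` is our own rule_id parameter.
interface StoredRule {
  id: string;
  updatedAt: string; // ISO 8601 timestamp, so string comparison orders by time
  params: { ruleId: string };
}

// Group rules by rule_id and keep only the groups with more than one member.
function findDuplicateRules(rules: StoredRule[]): Record<string, StoredRule[]> {
  const byRuleId: Record<string, StoredRule[]> = {};
  for (const rule of rules) {
    const key = rule.params.ruleId;
    if (!byRuleId[key]) {
      byRuleId[key] = [];
    }
    byRuleId[key].push(rule);
  }
  const duplicates: Record<string, StoredRule[]> = {};
  for (const key of Object.keys(byRuleId)) {
    if (byRuleId[key].length > 1) {
      duplicates[key] = byRuleId[key];
    }
  }
  return duplicates;
}

// Hypothetical policy: keep the most recently updated copy, remove the rest.
function idsToRemove(duplicates: Record<string, StoredRule[]>): string[] {
  const ids: string[] = [];
  for (const key of Object.keys(duplicates)) {
    const newestFirst = duplicates[key]
      .slice()
      .sort((a, b) => (a.updatedAt < b.updatedAt ? 1 : a.updatedAt > b.updatedAt ? -1 : 0));
    for (const stale of newestFirst.slice(1)) {
      ids.push(stale.id);
    }
  }
  return ids;
}
```

With a caller-supplied `_id`, a second create for the same rule could be rejected as a conflict at write time, and this cleanup pass would not be needed.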
Migration hooks
Issue:
#50216
Use case:
As a rule user, I will upgrade my system and expect not to have to do manual maintenance on rules or see rules suddenly stop operating.
Technical solution:
We need a way to "hook into" migration code which can run on the alerting/actions side and/or our SIEM side to provide ways to fix data bugs and/or add features such as migrations as we progress the system.
Current workaround:
Update 3/30/21: We can manually write migrations directly in the alerting project and do pull requests there.
We have to manually clean up mistakes from older systems. Luckily, this is our first cut at these systems, so we do not have legacy data at the moment, but after our first release we will inevitably have user-generated data which will need to be upgraded.
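To make the idea of a migration hook concrete, here is a minimal sketch of a registry of per-version migration functions applied in order. The integer versions, the `RuleDoc` shape, and both example migrations (a default `maxSignals` and an `index` param wrapped into an array) are hypothetical and only illustrate the mechanism; they are not the alerting project's actual migration code.

```typescript
// Assumed document shape with a simple integer version (hypothetical).
interface RuleDoc {
  version: number;
  params: Record<string, unknown>;
}
type Migration = (doc: RuleDoc) => RuleDoc;

// Registry keyed by the version each migration upgrades TO.
// Both migrations are made-up examples of a data fix and a shape change.
const migrations: Record<number, Migration> = {
  2: (doc) => ({
    ...doc,
    params: { ...doc.params, maxSignals: doc.params.maxSignals ?? 100 },
  }),
  3: (doc) => {
    const index = doc.params.index;
    return {
      ...doc,
      params: { ...doc.params, index: typeof index === "string" ? [index] : index },
    };
  },
};

// Apply every registered migration between the doc's version and the target.
// Versions without a registered migration only bump the version number.
function migrate(doc: RuleDoc, targetVersion: number): RuleDoc {
  let current = doc;
  for (let v = doc.version + 1; v <= targetVersion; v++) {
    const step = migrations[v];
    current = { ...(step ? step(current) : current), version: v };
  }
  return current;
}
```

Each step returns a new object rather than mutating its input, so a failed upgrade in this sketch leaves the original document untouched.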
Control compression and timeout (Update 3/3/2021: We need async calls)
Issue:
#50212
#50217
Updated Use case (3/3/2021):
We might need to enable large volumes of rule runs, but we cap things at 100 max signals at the moment. If we increase this, we will need to push data more efficiently. We also need longer rule run times, which means we need either something async or longer timeouts from task manager and when calling into Elasticsearch.
Use case:
As a rule user, I sometimes need to trigger large volumes of signals over a longer period of time or even a short period of time but with a very active rule. When triggering that rule, I would like the engine to run as quickly as possible and right now the rate is very slow due to non-compression of the data. Sometimes I encounter timeouts and would like to control the timeouts as well for wild card or complex rules which might be run less frequently.
Technical solution:
We need to enable compression/gzip with the `callCluster` API (from Kibana -> Elasticsearch). This should be comparable to `elasticsearch-js` compression. This is needed when large amounts of JSON are transmitted due to large amounts of signals. It looks like we are using the legacy client, but we would like an upgrade to the newer system. We also need to get to a per-connection timeout model so we can configure different rules with different timeouts.
Note: This one might be out of scope for the alerting team, and we might need to negotiate it with the platform team. However, we do not want to have to manage the API keys and other complexities, as we really enjoy the `callCluster` API :-)
Current workaround:
We might be able to send in an HTTP header flag to turn on compression, and we might be able to configure the timeouts system-wide.
Enable alerting/actions plugin by default ✅
Issue:
#50209
Use case:
As a rule user, I will need rules always enabled and running.
Technical solution:
Turn on the alerting/actions flag by default and add them as requirements to the SIEM plugin. Note that alerting/actions requires TLS by default. The SIEM application will only be able to run signals in TLS mode because of the requirements of API keys for alerting/actions framework. Reference issue discussing this: #34339
Current workaround:
None. We turn it on ad hoc in the codebase per developer and don't check that change in, so as not to break people who have not enabled alerting/actions in their `kibana.dev.yml`.
Delete old API keys when rules are deleted ✅
Issue:
#45144
Use case:
As a rule user, I can spend lots of time creating rules in a playground environment and/or spend a lot of days/weeks creating real world rules which could be deleted later. This is particularly the case for security where different techniques and tactics are part of the evolving landscape. As I delete older rules, I would like to also remove older API keys as that leaves a large surface area per compliance models at companies.
Technical solution:
When you delete a rule, the underlying API key is guaranteed to eventually be deleted within some timeframe.
Current workaround:
None, as we can't distinguish which API key belongs to which rule. If users ask us which keys they can delete over time, we will not have a good answer for how to distinguish them, so I do not think a workaround is possible.
Pass down space id and other parameters ✅
Issue:
#50522
Use case:
As a rule user, I want to copy rules from space to space and have the new space automatically provision the new space index and begin pushing signals to the new space-suffixed data index. Right now, it will continue to point to the old index until someone from support gives me a script to update the `outputIndex`, since it is hard coded at creation time of the rule.
Technical solution:
Pass down the space id and any other useful parameters to the executor so the executor can detect a space change and try to either push errors or compensate by auto creating the index.
Current workaround:
We hard code the space id on rule creation in the outputIndex field and if we move rules around we have to update the rules manually.
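As a sketch of the proposed solution, an executor that receives the space id could derive the expected index and detect that a rule was moved, instead of trusting the hard-coded `outputIndex`. The index prefix, the `ExecutorOptions` shape, and the function names below are assumptions for illustration, not the framework's actual executor contract.

```typescript
// Hypothetical prefix; the real signals index naming is not given in this ticket.
const SIGNALS_INDEX_PREFIX = ".signals";

// Assumed subset of what the executor would receive if the space id were passed down.
interface ExecutorOptions {
  spaceId: string;
  params: { outputIndex: string };
}

// Build the space-suffixed index name for a given space.
function indexForSpace(spaceId: string): string {
  return `${SIGNALS_INDEX_PREFIX}-${spaceId}`;
}

// Compare the index stored on the rule with the one expected for the current space.
// A mismatch means the rule was copied from another space and its stored index is stale.
function resolveOutputIndex(options: ExecutorOptions): { index: string; spaceChanged: boolean } {
  const expected = indexForSpace(options.spaceId);
  return {
    index: expected,
    spaceChanged: options.params.outputIndex !== expected,
  };
}
```

When `spaceChanged` is true, the executor could either report an error or create the new index before writing, which are the two options the technical solution describes.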