[Security Solution][Detections] Indicator Match Enrichment #89899

rylnd · 2021-02-01T17:36:05Z

Summary

This work adds enrichment to alerts generated by indicator match detection rules.

Features

All indicator match alerts will be enriched with threat.indicator.matched, containing metadata about how the event(s) and indicator(s) matched.
If the indicator index conforms to the prospective ECS threat intel fields, those fields will automatically be added to the resulting detection alert.

Followup to this PR:

Add configuration to allow signals to be enriched with a non-ECS fieldset (e.g. the new filebeat modules use threatintel.indicator while the ECS fields are being finalized)
Outstanding issues with nested field support:
- timeline: [Security Solution][Timeline] Dragging a nested field to timeline does not generate the correct query #89784
- JSON view: [Security Solution] Event/Alert Details JSON View displays nested objects inaccurately #90355

Reviewer Notes:

In general: create an indicator index, create a rule, generate some events, and observe the resultant, enriched signals. Specifically:

Generate an indicator index. We can use es_archiver to load the mappings/data used by the integration tests in this PR:

node scripts/es_archiver.js load threat_intel --dir x-pack/test/functional/es_archives/filebeat --es-url http://elastic:changeme@localhost:9200 --kibana-url http://elastic:changeme@localhost:5601

Create some indicator data. The simplest approach here is to create one or more indicators with a known value (e.g. your computer's hostname) that we can leverage later. In dev tools:

POST filebeat-8.0.0-2021.01.26-000001/_doc/
{
  "@timestamp": "2021-02-01T00:41:06.527Z",
  "threat": {
    "indicator": {
      "domain": "my-laptop-hostname.local",
      "provider": "local-testing",
      "other_fields": "with values you want to see in the signal"
    }
  }
}

Generate source events with values matching your indicator fields above. As an example, I run auditbeat, knowing that it will generate events with our laptop's hostname in host.name. However, you can also manually insert some documents into an index of your own creation; just ensure that they contain values that will cause the rule to generate a signal.
Create an indicator match rule against your indicator index and event data:
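For reviewers who prefer to create the rule through the API rather than the UI, here is a rough sketch of what the rule body could look like. The field names follow my understanding of the detection engine rules API for `threat_match` rules; the index patterns, queries, severity, and risk score are assumptions chosen for this walkthrough, so verify them against the API docs for your version.

```typescript
// One mapping entry: compare an event field against an indicator field.
interface ThreatMappingEntry {
  field: string; // field on the source event
  type: 'mapping';
  value: string; // field on the indicator document
}

// Simplified view of a threat_match rule body (assumed subset of the real schema).
interface ThreatMatchRuleBody {
  name: string;
  description: string;
  type: 'threat_match';
  risk_score: number;
  severity: 'low' | 'medium' | 'high' | 'critical';
  index: string[]; // where the source events live
  query: string; // which source events to consider
  threat_index: string[]; // where the indicators live
  threat_query: string; // which indicators to consider
  threat_mapping: Array<{ entries: ThreatMappingEntry[] }>;
}

// Matches auditbeat's host.name against the indicator's domain created above.
const reviewTestRule: ThreatMatchRuleBody = {
  name: 'Review Test',
  description: 'Indicator match rule for reviewing enrichment',
  type: 'threat_match',
  risk_score: 21,
  severity: 'low',
  index: ['auditbeat-*'],
  query: '*:*',
  threat_index: ['filebeat-*'],
  threat_query: '*:*',
  threat_mapping: [
    {
      entries: [{ field: 'host.name', type: 'mapping', value: 'threat.indicator.domain' }],
    },
  ],
};
```

Serialized with `JSON.stringify(reviewTestRule)`, this body would be sent to `POST /api/detection_engine/rules` with a `kbn-xsrf` header. The rule name matches the one used in the signals query further down.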

Peep those enriched signals:
Note: we're temporarily using dev tools here due to some ongoing field display issues. The _source filter is only there for demo cleanliness; feel free to remove it.

GET .siem-signals-default/_search
{
  "query": {
    "term": {
      "signal.rule.name": {
        "value": "Review Test"
      }
    }
  },
  "_source": "threat.indicator"
}

Checklist

Any text added follows EUI's writing guidelines, uses sentence case text and includes i18n support
Documentation was added for features that require explanation or tutorials
Unit or functional tests were updated or added to match the most common scenarios

For maintainers

This was checked for breaking API changes and was labeled appropriately

When this moves to individual rule-specific data transformations this will be a little more explicit/configurable; for now to keep changes minimal, we're using dependency injection to pass a function, which will default to the identity function (e.g. a no-op).
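A minimal sketch of that injection pattern, to make the shape concrete. All names and types here are hypothetical simplifications (the real pipeline is asynchronous and its types carry many more fields); only the idea of an optional enrichment function defaulting to identity comes from this PR.

```typescript
// Hypothetical, simplified shapes; the real Kibana types carry far more fields.
interface SignalHit {
  _id: string;
  _source: Record<string, unknown>;
}

interface SignalSearchResponse {
  hits: { total: number; hits: SignalHit[] };
}

// An enrichment receives the search response and returns a (possibly modified) one.
type SignalsEnrichment = (response: SignalSearchResponse) => SignalSearchResponse;

// Default enrichment: the identity function, so existing rule types are unaffected.
const identityEnrichment: SignalsEnrichment = (response) => response;

// Stand-in for the signal creation step; `createSignalsFromSearch` is a made-up name.
function createSignalsFromSearch(
  response: SignalSearchResponse,
  enrichment: SignalsEnrichment = identityEnrichment
): SignalHit[] {
  const enriched = enrichment(response);
  // The real pipeline would bulk-index these hits; here we simply return them.
  return enriched.hits.hits;
}
```

Callers that pass no enrichment get the search hits back untouched, while the indicator match rule can pass its own function without the shared code knowing anything about threat intel.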

This is what allows us to enrich the threat match signals using only the signal search response.

This gives us the information we need to enrich our signals after they've been queried without having to perform a complicated reverse query.
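To illustrate the idea: Elasticsearch lets a query clause carry a `_name`, and each hit reports the names of the clauses it satisfied under `matched_queries`. If the name encodes which indicator and which fields were involved, the enrichment can be derived from the search response alone. The sketch below assumes a colon-delimited encoding with URI-encoded parts; the type, function names, and delimiter are illustrative and not necessarily what this PR implements.

```typescript
// Hypothetical description of one indicator match; field names are illustrative.
interface ThreatMatchNamedQuery {
  id: string; // _id of the indicator document
  index: string; // index holding the indicator document
  field: string; // event field that was compared
  value: string; // indicator field it was compared against
}

// Assumed delimiter. Each part is URI-encoded, so a part can never contain it.
const SEPARATOR = ':';

function encodeNamedQuery(query: ThreatMatchNamedQuery): string {
  return [query.id, query.index, query.field, query.value]
    .map(encodeURIComponent)
    .join(SEPARATOR);
}

function decodeNamedQuery(encoded: string): ThreatMatchNamedQuery {
  const parts = encoded.split(SEPARATOR).map(decodeURIComponent);
  if (parts.length !== 4 || parts.some((part) => part === '')) {
    throw new Error(`Invalid threat match query name: ${encoded}`);
  }
  const [id, index, field, value] = parts;
  return { id, index, field, value };
}

// Builds a match clause whose `_name` will show up in a hit's `matched_queries`.
function buildNamedMatchClause(query: ThreatMatchNamedQuery, indicatorValue: string) {
  return {
    match: {
      [query.field]: { query: indicatorValue, _name: encodeNamedQuery(query) },
    },
  };
}
```

Decoding each entry of `matched_queries` then yields the indicator's `_id` and index plus both sides of the field mapping, with no reverse query needed.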

Adds assertions to the existing test, and fleshes out another test for a multi-match signal.

* single indicator matching multiple events
* multiple indicators matching a single event
* multiple indicators, multiple events
* placeholder for deduplication logic

This also adds some descriptions to our threat intel documents, to give a little context around how they're meant to function within the tests, particularly as relates to the auditbeat/hosts data on which it is meant to function.

This handles the situation where the indicator match search has returned the same signal multiple times due to the source event matching different indicators in different query batches. In this case, we want to generate a single signal with all matched indicators.
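As a sketch of that merge, assuming each hit already carries its matched indicators under `_source.threat.indicator`: hits are grouped by `_id`, and the indicators of later duplicates are appended to the first occurrence. The type and function names are hypothetical, and this version uses a `Map` purely to keep first-seen order obvious.

```typescript
// Hypothetical, simplified hit shape: only the enrichment is modeled.
interface EnrichedHit {
  _id: string;
  _source: { threat: { indicator: Array<Record<string, unknown>> } };
}

// Collapses hits sharing an _id into one hit holding every matched indicator.
function mergeDuplicateSignals(hits: EnrichedHit[]): EnrichedHit[] {
  const byId = new Map<string, EnrichedHit>();
  for (const hit of hits) {
    const existing = byId.get(hit._id);
    if (existing === undefined) {
      // Copy the indicator array so merging never mutates the input hits.
      byId.set(hit._id, {
        ...hit,
        _source: {
          ...hit._source,
          threat: { ...hit._source.threat, indicator: [...hit._source.threat.indicator] },
        },
      });
    } else {
      existing._source.threat.indicator.push(...hit._source.threat.indicator);
    }
  }
  return Array.from(byId.values());
}
```

The merged list can be shorter than the input, so any count reported alongside the hits should be derived from the merged list rather than the raw one.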

We were previously adding the indicator's field to matched.field, instead of the corresponding event field that matched the indicator. In the normal case, the expectation is that the indicator field is self-evident, and thus we want to know the other side of the match on the event itself. Updates tests accordingly.
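A small sketch of the corrected behavior, under stated assumptions: given one mapping between an event field and an indicator field, `matched.field` records the event side, while the matched value itself is read from the indicator document. The helper names, the `atomic` key, and the simplified types are illustrative, not the PR's actual code.

```typescript
// One side-by-side mapping, as configured on the rule (hypothetical shape).
interface FieldMapping {
  eventField: string; // e.g. 'host.name'
  indicatorField: string; // e.g. 'threat.indicator.domain'
}

// Reads a dotted path such as 'threat.indicator.domain' from a nested document.
function getPath(doc: Record<string, unknown>, path: string): unknown {
  return path.split('.').reduce<unknown>(
    (acc, key) =>
      typeof acc === 'object' && acc !== null
        ? (acc as Record<string, unknown>)[key]
        : undefined,
    doc
  );
}

// Builds the `matched` metadata for one indicator/event match.
function buildMatched(mapping: FieldMapping, indicatorDoc: Record<string, unknown>) {
  return {
    // The value that matched, taken from the indicator document.
    atomic: getPath(indicatorDoc, mapping.indicatorField),
    // The *event* field that matched, not the indicator's own field.
    field: mapping.eventField,
  };
}
```

With the indicator from the reviewer notes and a `host.name` to `threat.indicator.domain` mapping, `field` comes out as `host.name`, which tells the analyst where on the event to look.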

This could occur if the indicator index is updated while a rule is being run.

This just verifies that the enrichment function gets invoked with search results.

rylnd · 2021-02-05T04:03:37Z

x-pack/plugins/security_solution/server/lib/detection_engine/routes/index/ecs_mapping.json

@@ -2457,6 +2457,144 @@
            "ignore_above": 1024,
            "type": "keyword"
          },
+          "indicator": {


These need to be updated with the latest from elastic/ecs#1127 before this gets merged/released.

Update: as of today, these are up to date with the RFC.

I made both of these before we were clear on the direction we were taking here.

FrankHassanabad · 2021-02-09T23:23:46Z

x-pack/test/functional/es_archives/filebeat/threat_intel/data.json

+      }
+    }
+  }
+}


As talked about over video conference, let's see if we can trim this and the mappings down to their essence and only have what we absolutely need in these for testing.

This will make maintainers' lives easier and make things easier to understand. If someone needs to test other portions or parts of threat intelligence we should encourage them to create and use different index mappings and data sets specific to those tests and avoid mixing extra baggage between tests.

It seems to be the better way for this to all work.

Agreed; I will trim down this file, and also create a smaller auditbeat index containing just the few events that we care about here (plus 1-2 extras to ensure no false positives). In a followup PR 😜

FrankHassanabad · 2021-02-09T23:26:01Z

x-pack/test/functional/es_archives/filebeat/threat_intel/mappings.json

+        },
+        "number_of_replicas": "0",
+        "number_of_shards": "1",
+        "refresh_interval": "5s"


This is interesting though. I think the refresh_interval should stay at 5s for most correctness tests. I might clean up my other tests with a 5s refresh interval to reduce flakiness and increase the amount of time for things to be showing up within the tests.

This wasn't a conscious choice, but was instead cribbed from another existing archive. I think we want as small an interval as possible, no? I see values of 5s, 1ms, and -1 in the existing x-pack FTR archives.

...ck/plugins/security_solution/server/lib/detection_engine/signals/search_after_bulk_create.ts

...ck/test/detection_engine_api_integration/security_and_spaces/tests/create_threat_matching.ts

...ecurity_solution/server/lib/detection_engine/signals/threat_mapping/create_threat_signals.ts

Ensure that we throw an error if the indicator field is either a primitive or an array of primitives.
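For illustration, a guard of this kind might look like the following. It accepts a single object or an array of objects and rejects primitives and arrays containing primitives. The function name and error message are made up, and rejecting `null`/`undefined` as well is an assumption of this sketch rather than something the PR states.

```typescript
type IndicatorObject = Record<string, unknown>;

// True only for plain (non-null, non-array) objects.
function isIndicatorObject(value: unknown): value is IndicatorObject {
  return typeof value === 'object' && value !== null && !Array.isArray(value);
}

// Normalizes the value found at the indicator path into a list of objects,
// throwing when it is a primitive or an array containing primitives.
function toIndicatorList(value: unknown): IndicatorObject[] {
  const candidates: unknown[] = Array.isArray(value) ? value : [value];
  if (!candidates.every(isIndicatorObject)) {
    throw new Error('Expected indicator field to be an object or an array of objects');
  }
  return candidates as IndicatorObject[];
}
```

Failing loudly here surfaces a misconfigured indicator index instead of silently writing malformed enrichments.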

These values are already defaulted in the parent, and the types are correct in that these cannot be undefined.

existingSignalHit could not be undefined on line 30 here, but typescript could not infer this from the !acc.has() call.

We were using a map previously in order to use .has() for a predicate, but code has since been refactored to make that unnecessary.

These are being typed implicitly and verified against SignalSourceHit[] on the assignment below, but this makes the types explicit and surfaces a type error here instead of the subsequent assignment.

rylnd · 2021-02-10T22:39:02Z

@elasticmachine merge upstream

These references were moved into buildThreatEnrichment

I copied the entirety of the `threat` mappings in order to get the `threat.indicator` ones, but it looks like these were added at some point too. I'd rather these not be added incidentally. If we need them, we should do so explicitly.

kibanamachine · 2021-02-12T00:43:01Z

💚 Build Succeeded

continuous-integration/kibana-ci/pull-request
Commit: bf35b55

Metrics [docs]

✅ unchanged

History

💚 Build #105871 succeeded 1b973fc
💔 Build #105830 failed ea0bbdd
💔 Build #105823 failed 0d8223f
💚 Build #104940 succeeded a96c773
💚 Build #104855 succeeded d526dfc

To update your PR or re-run it, just comment with:
@elasticmachine merge upstream

* Adds basic integration test for threat enrichment

* Update signals mappings with indicator fields

* Simplify some ternaries with Math.min

* Remove outdated comments

* Add notes from walkthrough with devin

* Add an enrichment hook to the current signal creation pipeline

  When this moves to individual rule-specific data transformations this will be a little more explicit/configurable; for now to keep changes minimal, we're using dependency injection to pass a function, which will default to the identity function (e.g. a no-op).

* Add utility functions for encoding/decoding our threat query

  This is what allows us to enrich the threat match signals using only the signal search response.

* Add a name to each threat match filter clause

  This gives us the information we need to enrich our signals after they've been queried without having to perform a complicated reverse query.

* Adds functions for signal enrichment of threat indicators

* Wire up threat enrichment to threat match rules

* Fleshes out threat match integration tests

  Adds assertions to the existing test, and fleshes out another test for a multi-match signal.

* Add more test cases to indicator match integration tests

  * single indicator matching multiple events
  * multiple indicators matching a single event
  * multiple indicators, multiple events
  * placeholder for deduplication logic

  This also adds some descriptions to our threat intel documents, to give a little context around how they're meant to function within the tests, particularly as relates to the auditbeat/hosts data on which it is meant to function.

* Implement signal deduplification

  This handles the situation where the indicator match search has returned the same signal multiple times due to the source event matching different indicators in different query batches. In this case, we want to generate a single signal with all matched indicators.

* Move default indicator path to constant

* Testing some edge cases with signal enrichment

* Cover and test edge cases with threat enrichment generation

* Fix logical error in TI enrichment

  We were previously adding the indicator's field to matched.field, instead of the corresponding event field that matched the indicator. In the normal case, the expectation is that the indicator field is self-evident, and thus we want to know the other side of the match on the event itself. Updates tests accordingly.

* Document behavior when an indicator matched but is absent on enrichment

  This could occur if the indicator index is updated while a rule is being run.

* Add followup note

* Add basic unit test for our enrichment function

  This just verifies that the enrichment function gets invoked with search results.

* Update license headers for new files

* Remove unused threatintel archive

  I made both of these before we were clear on the direction we were taking here.

* Bump signals version to allows some updates in patch releases

* Fix typings of threat list item

  We were conflating the type of the underlying document with the type of the search response for that document. This is now addressed with two types: ThreatListDoc and ThreatListItem, respectively. ThreatListDoc isn't the most distinguishing name but it avoids a lot of unnecessary renaming for the existing concept of ThreatListItem.

* Update test mock to be aware of (but not care about) named queries

* Remove/update outdated comments

  This code was modified to perform two searches instead of one; at that time, a lot of this code was duplicated and modified slightly, and these misleading comments were a result. I removed the ones that were no longer relevant, but left a TODO for one that could be a bug.

* Remove outdated comment

  Documents will always have _id.

* Update enriched signals' total to account for deduplication

  If a given signal matched on multiple indicators in different loops of our indicator query, it may appear multiple times. Our enrichment performs the merging of those duplicated results, but did not previously update the response's total field to account for this. I don't believe that anything downstream is actually using this field and that we are instead operating on the length of hits and the response from the bulk create request, but this keeps things consistent in case that changes.

* Remove development comments

* Add JSDoc for our special template version constant

* Remove outdated comments

* Add an additional test permutation for error cases

  Ensure that we throw an error if the indicator field is either a primitive or an array of primitives.

* Remove unnecessary coalescing

  These values are already defaulted in the parent, and the types are correct in that these cannot be undefined.

* Move logic to build threat enrichment function into helper

* Refactor code to allow typescript to infer our type narrowing

  existingSignalHit could not be undefined on line 30 here, but typescript could not infer this from the !acc.has() call.

* Use a POJO over a Map

  We were using a map previously in order to use .has() for a predicate, but code has since been refactored to make that unnecessary.

* Explicitly type our enriched signals

  These are being typed implicitly and verified against SignalSourceHit[] on the assignment below, but this makes the types explicit and surfaces a type error here instead of the subsequent assignment.

* Add an explanatory note about these test results

* Remove unused imports

  These references were moved into buildThreatEnrichment

* Remove threat mappings accidentally brought in with indicator work

  I copied the entirety of the `threat` mappings in order to get the `threat.indicator` ones, but it looks like these were added at some point too. I'd rather these not be added incidentally. If we need them, we should do so explicitly.

Co-authored-by: Kibana Machine <[email protected]>

rylnd added release_note:enhancement v8.0.0 v7.12.0 Team:Detections and Resp Security Detection Response Team labels Feb 1, 2021

rylnd self-assigned this Feb 1, 2021

rylnd added 21 commits February 3, 2021 23:12

Adds basic integration test for threat enrichment

2d402b8

Update signals mappings with indicator fields

876b49d

Simplify some ternaries with Math.min

9f9922d

Remove outdated comments

9620fa8

Add notes from walkthrough with devin

2211110

Add utility functions for encoding/decoding our threat query

0b1dd8b

This is what allows us to enrich the threat match signals using only the signal search response.

Add a name to each threat match filter clause

732a525

This gives us the information we need to enrich our signals after they've been queried without having to perform a complicated reverse query.

Adds functions for signal enrichment of threat indicators

8fdf571

Wire up threat enrichment to threat match rules

9712250

Fleshes out threat match integration tests

7936cb0

Adds assertions to the existing test, and fleshes out another test for a multi-match signal.

Move default indicator path to constant

0d256c6

Testing some edge cases with signal enrichment

2da0633

Cover and test edge cases with threat enrichment generation

e120609

Document behavior when an indicator matched but is absent on enrichment

aaec432

This could occur if the indicator index is updated while a rule is being run.

Add followup note

3e31d6a

Add basic unit test for our enrichment function

537fa3c

This just verifies that the enrichment function gets invoked with search results.

Update license headers for new files

12a0c87

rylnd force-pushed the threat_enrichment_simple branch from 3f84f0e to 12a0c87 Compare February 4, 2021 05:14

rylnd commented Feb 5, 2021

rylnd added 2 commits February 4, 2021 22:04

Remove unused threatintel archive

5675a78

I made both of these before we were clear on the direction we were taking here.

Bump signals version to allows some updates in patch releases

9481f81

FrankHassanabad reviewed Feb 9, 2021

...ck/plugins/security_solution/server/lib/detection_engine/signals/search_after_bulk_create.ts Outdated