-
Notifications
You must be signed in to change notification settings - Fork 8.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Security Solution][Detections] Fix flaky indicator enrichment tests #94241
Conversation
Due to the fact that we use named queries to determine matches, and the fact that the order in which named queries are returned is undefined, we cannot guarantee a consistent ordering of enrichments if a given event matches multiple named queries. Because the ordering is not in itself important to enrichment, in order to assert the multi-match functionality we must make the assertions order independent.
Pinging @elastic/security-solution (Team: SecuritySolution) |
Pinging @elastic/security-detections-response (Team:Detections and Resp) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for making our tests better!
LGTM, I just left 2 small comments which shouldn't block this PR though. 👍
// Asserts that each expected value is included in the subject, independent of | ||
// ordering. Uses _.isEqual for value comparison. | ||
const assertContains = (subject: unknown[], expected: unknown[]) => | ||
expected.map((expectedValue) => | ||
expect(subject.some((value) => isEqual(value, expectedValue))).to.eql( | ||
true, | ||
`expected ${format(subject)} to contain ${format(expectedValue)}` | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Super nit: expected.map
could be replaced by expected.forEach
assertContains(threat.indicator, [ | ||
{ | ||
indicator: [ | ||
{ | ||
description: 'this should match auditbeat/hosts on both port and ip', | ||
first_seen: '2021-01-26T11:06:03.000Z', | ||
ip: '45.115.45.3', | ||
matched: { | ||
atomic: '45.115.45.3', | ||
id: '978785', | ||
index: 'filebeat-8.0.0-2021.01.26-000001', | ||
field: 'source.ip', | ||
type: 'url', | ||
}, | ||
port: 57324, | ||
provider: 'geenensp', | ||
type: 'url', | ||
}, | ||
{ | ||
description: 'this should match auditbeat/hosts on ip', | ||
first_seen: '2021-01-26T11:06:03.000Z', | ||
ip: '45.115.45.3', | ||
matched: { | ||
atomic: '45.115.45.3', | ||
id: '978787', | ||
index: 'filebeat-8.0.0-2021.01.26-000001', | ||
field: 'source.ip', | ||
type: 'ip', | ||
}, | ||
provider: 'other_provider', | ||
type: 'ip', | ||
}, | ||
], | ||
description: 'this should match auditbeat/hosts on both port and ip', | ||
first_seen: '2021-01-26T11:06:03.000Z', | ||
ip: '45.115.45.3', | ||
matched: { | ||
atomic: '45.115.45.3', | ||
id: '978785', | ||
index: 'filebeat-8.0.0-2021.01.26-000001', | ||
field: 'source.ip', | ||
type: 'url', | ||
}, | ||
port: 57324, | ||
provider: 'geenensp', | ||
type: 'url', | ||
}, | ||
{ | ||
description: 'this should match auditbeat/hosts on ip', | ||
first_seen: '2021-01-26T11:06:03.000Z', | ||
ip: '45.115.45.3', | ||
matched: { | ||
atomic: '45.115.45.3', | ||
id: '978787', | ||
index: 'filebeat-8.0.0-2021.01.26-000001', | ||
field: 'source.ip', | ||
type: 'ip', | ||
}, | ||
provider: 'other_provider', | ||
type: 'ip', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I got it right, these tests load some events from auditbeat/hosts
ES archive, create a rule, generate signals and then check that signals have certain fields (set by the enrichment process in the rule?) like matched.*
.
I think my question is: how stable or brittle these tests are in terms of input data (events being tested)? How likely it is if someone updates this ES archive with new events, and these tests will break with false negatives although the enrichment functionality stays correctly implemented?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good question! My understanding is that those archives are stable, and that for e.g. auditbeat-7.13 one would create a new archive/tests rather than update the existing one.
The tests are definitely brittle in terms of these data changing, but I believe that most if not all of our event-based integration tests share this property. Most of the other "create rule, assert on alerts" tests like this are reliant on the same archives, it just happened that these tests were brittle due to the nondeterministic ordering of named queries.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My understanding is that those archives are stable, and that for e.g. auditbeat-7.13 one would create a new archive/tests rather than update the existing one.
Oh sorry, I was assuming they could be updated from time to time. :) Good to know that it's not the case. Thanks for your explanation @rylnd 👍
* Since we're only looping for side effects, prefer forEach to map for more idiomatic FP.
💚 Build SucceededMetrics [docs]
History
To update your PR or re-run it, just comment with: cc @rylnd |
…lastic#94241) * Make indicator enrichment tests order-independent Due to the fact that we use named queries to determine matches, and the fact that the order in which named queries are returned is undefined, we cannot guarantee a consistent ordering of enrichments if a given event matches multiple named queries. Because the ordering is not in itself important to enrichment, in order to assert the multi-match functionality we must make the assertions order independent. * PR feedback * Since we're only looping for side effects, prefer forEach to map for more idiomatic FP.
…lastic#94241) * Make indicator enrichment tests order-independent Due to the fact that we use named queries to determine matches, and the fact that the order in which named queries are returned is undefined, we cannot guarantee a consistent ordering of enrichments if a given event matches multiple named queries. Because the ordering is not in itself important to enrichment, in order to assert the multi-match functionality we must make the assertions order independent. * PR feedback * Since we're only looping for side effects, prefer forEach to map for more idiomatic FP.
…94241) (#94374) * Make indicator enrichment tests order-independent Due to the fact that we use named queries to determine matches, and the fact that the order in which named queries are returned is undefined, we cannot guarantee a consistent ordering of enrichments if a given event matches multiple named queries. Because the ordering is not in itself important to enrichment, in order to assert the multi-match functionality we must make the assertions order independent. * PR feedback * Since we're only looping for side effects, prefer forEach to map for more idiomatic FP.
…94241) (#94375) * Make indicator enrichment tests order-independent Due to the fact that we use named queries to determine matches, and the fact that the order in which named queries are returned is undefined, we cannot guarantee a consistent ordering of enrichments if a given event matches multiple named queries. Because the ordering is not in itself important to enrichment, in order to assert the multi-match functionality we must make the assertions order independent. * PR feedback * Since we're only looping for side effects, prefer forEach to map for more idiomatic FP.
…-action * 'master' of github.com:elastic/kibana: (43 commits) [Console] Update copy when showing warnings in response headers (#94270) [TSVB] Type public code. Step 1 (#93231) [ML] Functional tests - stabilize slider value selection (#94313) skip another suite blocking es promotion (#94367) [Security Solution] Eliminates a redundant external link icon (#94194) skip another suite blocking es promotion (#94367) [App Search] Role mappings migration part 1 (#94346) [Security Solution][Detections] Fix flaky indicator enrichment tests (#94241) [Workplace Search] Deduplicate icons (#94359) [ML] Add latest transform to intro text (#94039) skip test failing es promotion (#94367) [Maps] convert elasticsearch_utils to TS (#93984) [Security_Solution][Telemetry] - Update endpoint usage to use agentService (#93829) [Security Solution][Exceptions] Fixes OS adding method for exception enrichment (#94343) [ILM] Add support for frozen phase (#93068) [App Search] Fixed 2 relevance tuning bugs (#94312) remove `try` auth mode (#94287) Removing resolver functional tests (#94331) migrate warning mixin to core (#94273) [App Search] Add routes for Role Mappings (#94221) ... # Conflicts: # x-pack/plugins/index_lifecycle_management/public/application/sections/edit_policy/components/phases/cold_phase/cold_phase.tsx # x-pack/plugins/index_lifecycle_management/public/application/sections/edit_policy/components/phases/phase/phase.tsx # x-pack/plugins/index_lifecycle_management/public/application/sections/edit_policy/components/phases/shared_fields/searchable_snapshot_field/searchable_snapshot_field.tsx # x-pack/plugins/index_lifecycle_management/public/application/sections/edit_policy/edit_policy.tsx # x-pack/plugins/index_lifecycle_management/public/application/sections/edit_policy/form/configuration_issues_context.tsx
Summary
Addresses #93152.
Due to the fact that we use named queries to determine matches, and the fact that the order in which named queries are returned is undefined, (as observed in the output of the test runs here), we cannot guarantee a consistent ordering of enrichments if a given event matches multiple named queries.
Because the ordering is not in itself important to enrichment, in order to assert the multi-match functionality we must make the assertions order independent.
This robustness is verified via this build, where enrichment tests were run 20 times without failure. Prior to this change, ~50% of those runs would fail.
Checklist
For maintainers