Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Security Solution][Detections] Truncate lastFailureMessage for siem-detection-engine-rule-status documents #112257

Merged

Conversation

banderror
Copy link
Contributor

@banderror banderror commented Sep 15, 2021

Ticket: #109815

Summary

Background: siem-detection-engine-rule-status documents stores the lastFailureMessage a string which is indexed as type: "text" but some failure messages are so large that these documents are up to 26MB. These large documents cause migrations to fail because a batch of 1000 documents easily exceed Elasticsearch's http.max_content_length which defaults to 100mb.

This PR truncates lastFailureMessage and lastSuccessMessage in the following cases:

  1. When we write new or update existing status SOs:
    • The lists of errors/warnings are deduped -> truncated to max 20 items -> joined to a string
    • The resulting strings are truncated to max 10240 characters
  2. When we migrate siem-detection-engine-rule-status SOs to 7.15.2:
    • The two message fields are truncated to max 10240 characters

Checklist

Delete any items that are not applicable to this PR.

@banderror banderror added bug Fixes for quality problems that affect the customer experience impact:high Addressing this issue will have a high level of impact on the quality/strength of our product. Team:Detections and Resp Security Detection Response Team Team: SecuritySolution Security Solutions Team working on SIEM, Endpoint, Timeline, Resolver, etc. Team:Detection Rule Management Security Detection Rule Management Team labels Sep 15, 2021
@banderror banderror self-assigned this Sep 15, 2021
@banderror banderror requested review from xcrzx and spong September 15, 2021 12:47
@banderror banderror marked this pull request as ready for review September 15, 2021 12:52
@banderror banderror requested a review from a team as a code owner September 15, 2021 12:52
@elasticmachine
Copy link
Contributor

Pinging @elastic/security-detections-response (Team:Detections and Resp)

@elasticmachine
Copy link
Contributor

Pinging @elastic/security-solution (Team: SecuritySolution)

@spong spong added v7.15.1 and removed v7.14.2 labels Sep 15, 2021
@banderror banderror force-pushed the truncate-siem-detection-engine-rule-status branch from cd02610 to 5b25da2 Compare October 6, 2021 14:36
@banderror banderror marked this pull request as draft October 6, 2021 14:36
@banderror banderror added v7.15.2 and removed v7.15.1 labels Oct 14, 2021
@banderror banderror force-pushed the truncate-siem-detection-engine-rule-status branch from 5b25da2 to 8946be2 Compare October 14, 2021 06:17
@banderror banderror force-pushed the truncate-siem-detection-engine-rule-status branch from 8946be2 to 236009c Compare October 14, 2021 10:55
@banderror banderror marked this pull request as ready for review October 14, 2021 11:10
@banderror banderror added the auto-backport Deprecated - use backport:version if exact versions are needed label Oct 14, 2021
@banderror banderror requested review from xcrzx and rudolf October 14, 2021 11:52
@banderror
Copy link
Contributor Author

@rudolf @spong @xcrzx this fix is ready for the 2nd review! 🙏

@kibanamachine
Copy link
Contributor

💚 Build Succeeded

Metrics [docs]

✅ unchanged

History

  • 💔 Build #160783 failed 8946be27914d37e4a7062702050d212ebb1a942c
  • 💚 Build #158500 succeeded 5b25da2c384a37a5adcdeb7a13bb6b9dd177a74d
  • 💚 Build #153517 succeeded cd02610ceb83c1eecdcb4d5d762ea35a55573e6b

To update your PR or re-run it, just comment with:
@elasticmachine merge upstream

cc @banderror

@@ -47,11 +48,28 @@ export const ruleStatusSavedObjectMappings: SavedObjectsType['mappings'] = {
},
};

const truncateMessageFields: SavedObjectMigrationFn<Record<string, unknown>> = (doc) => {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using Record<string, unknown> doesn't really add any type safety. Could you use IRuleStatusSOAttributes https://github.com/elastic/kibana/blob/master/x-pack/plugins/security_solution/server/lib/detection_engine/rules/types.ts#L113-L125

Don't have to block on this, but I think this is a good pattern to adopt for migrations.

Copy link
Contributor Author

@banderror banderror Oct 14, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general IRuleStatusSOAttributes would correspond to the current (target for this migration) interface, not to the previous interface (prior to migration). Although in this particular case no changes to IRuleStatusSOAttributes have been made, it doesn't mean we wouldn't change it in the future. Any substantial change to the TS interface in a future version would keep this function statically typed but it would not be safe anymore - we'd need to adjust the code of the migration function to conform to the changed interface, thus making the function prone to bugs. Maintaining history of these interfaces (like IRuleStatusSOAttributes_v15_1 etc) would help keeping it fully statically typed and correct, but we don't do this and effort/value ratio of this seems to be too high. Specifying target SO attributes (MigratedAttributes) as IRuleStatusSOAttributes is actually prone to the same issues.

I think unless we start maintaining a history of TS interfaces for the SO attributes, introducing type safety here actually means introducing risk of making migrations incorrect in the future.

Also, we'll likely get rid of this SO type very soon.

Copy link
Contributor

@xcrzx xcrzx left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the changes, Georgii👍
We could address outstanding questions separately, this PR LGTM!

@banderror banderror merged commit b304c1c into elastic:master Oct 14, 2021
@banderror banderror deleted the truncate-siem-detection-engine-rule-status branch October 14, 2021 15:40
kibanamachine pushed a commit to kibanamachine/kibana that referenced this pull request Oct 14, 2021
…detection-engine-rule-status documents (elastic#112257)

**Ticket:** elastic#109815

## Summary

**Background:** `siem-detection-engine-rule-status` documents stores the `lastFailureMessage` a string which is indexed as `type: "text"` but some failure messages are so large that these documents are up to 26MB. These large documents cause migrations to fail because a batch of 1000 documents easily exceed Elasticsearch's `http.max_content_length` which defaults to 100mb.

This PR truncates `lastFailureMessage` and `lastSuccessMessage` in the following cases:

1. When we write new or update existing status SOs:
    - The lists of errors/warnings are deduped -> truncated to max `20` items -> joined to a string
    - The resulting strings are truncated to max `10240` characters
2. When we migrate `siem-detection-engine-rule-status` SOs to 7.15.2:
    - The two message fields are truncated to max `10240` characters

### Checklist

Delete any items that are not applicable to this PR.

- [ ] [Unit or functional tests](https://www.elastic.co/guide/en/kibana/master/development-tests.html) were updated or added to match the most common scenarios
@kibanamachine
Copy link
Contributor

💔 Backport failed

Status Branch Result
7.x
7.15 Commit could not be cherrypicked due to conflicts

Successful backport PRs will be merged automatically after passing CI.

To backport manually run:
node scripts/backport --pr 112257

@banderror
Copy link
Contributor Author

banderror commented Oct 14, 2021

I 👀 the failed backport, will manually make it.

UPD: #115166

kibanamachine added a commit that referenced this pull request Oct 14, 2021
…detection-engine-rule-status documents (#112257) (#115038)

**Ticket:** #109815

## Summary

**Background:** `siem-detection-engine-rule-status` documents stores the `lastFailureMessage` a string which is indexed as `type: "text"` but some failure messages are so large that these documents are up to 26MB. These large documents cause migrations to fail because a batch of 1000 documents easily exceed Elasticsearch's `http.max_content_length` which defaults to 100mb.

This PR truncates `lastFailureMessage` and `lastSuccessMessage` in the following cases:

1. When we write new or update existing status SOs:
    - The lists of errors/warnings are deduped -> truncated to max `20` items -> joined to a string
    - The resulting strings are truncated to max `10240` characters
2. When we migrate `siem-detection-engine-rule-status` SOs to 7.15.2:
    - The two message fields are truncated to max `10240` characters

### Checklist

Delete any items that are not applicable to this PR.

- [ ] [Unit or functional tests](https://www.elastic.co/guide/en/kibana/master/development-tests.html) were updated or added to match the most common scenarios

Co-authored-by: Georgii Gorbachev <[email protected]>
banderror added a commit to banderror/kibana that referenced this pull request Oct 15, 2021
…r siem-detection-engine-rule-status documents (elastic#112257)

**Ticket:** elastic#109815

**Background:** `siem-detection-engine-rule-status` documents stores the `lastFailureMessage` a string which is indexed as `type: "text"` but some failure messages are so large that these documents are up to 26MB. These large documents cause migrations to fail because a batch of 1000 documents easily exceed Elasticsearch's `http.max_content_length` which defaults to 100mb.

This PR truncates `lastFailureMessage` and `lastSuccessMessage` in the following cases:

1. When we write new or update existing status SOs:
    - The lists of errors/warnings are deduped -> truncated to max `20` items -> joined to a string
    - The resulting strings are truncated to max `10240` characters
2. When we migrate `siem-detection-engine-rule-status` SOs to 7.15.2:
    - The two message fields are truncated to max `10240` characters

Delete any items that are not applicable to this PR.

- [ ] [Unit or functional tests](https://www.elastic.co/guide/en/kibana/master/development-tests.html) were updated or added to match the most common scenarios
banderror added a commit that referenced this pull request Oct 15, 2021
…r siem-detection-engine-rule-status documents (#112257) (#115166)

**Ticket:** #109815

**Background:** `siem-detection-engine-rule-status` documents stores the `lastFailureMessage` a string which is indexed as `type: "text"` but some failure messages are so large that these documents are up to 26MB. These large documents cause migrations to fail because a batch of 1000 documents easily exceed Elasticsearch's `http.max_content_length` which defaults to 100mb.

This PR truncates `lastFailureMessage` and `lastSuccessMessage` in the following cases:

1. When we write new or update existing status SOs:
    - The lists of errors/warnings are deduped -> truncated to max `20` items -> joined to a string
    - The resulting strings are truncated to max `10240` characters
2. When we migrate `siem-detection-engine-rule-status` SOs to 7.15.2:
    - The two message fields are truncated to max `10240` characters

Delete any items that are not applicable to this PR.

- [ ] [Unit or functional tests](https://www.elastic.co/guide/en/kibana/master/development-tests.html) were updated or added to match the most common scenarios
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
auto-backport Deprecated - use backport:version if exact versions are needed bug Fixes for quality problems that affect the customer experience impact:high Addressing this issue will have a high level of impact on the quality/strength of our product. release_note:fix Team:Detection Rule Management Security Detection Rule Management Team Team:Detections and Resp Security Detection Response Team Team: SecuritySolution Security Solutions Team working on SIEM, Endpoint, Timeline, Resolver, etc. v7.15.2 v7.16.0 v8.0.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Truncate lastFailureMessage for siem-detection-engine-rule-status documents
6 participants