Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Perform true partial updates when updating the alerting rule after execution #192397

Closed
2 of 3 tasks
mikecote opened this issue Sep 9, 2024 · 1 comment · Fixed by #193341
Closed
2 of 3 tasks

Perform true partial updates when updating the alerting rule after execution #192397

mikecote opened this issue Sep 9, 2024 · 1 comment · Fixed by #193341
Assignees
Labels
Feature:Alerting Team:ResponseOps Label for the ResponseOps team (formerly the Cases and Alerting teams)

Comments

@mikecote
Copy link
Contributor

mikecote commented Sep 9, 2024

When an alerting rule finishes running, it updates select fields within the rule saved-object to reflect the new timestamp it last run, the success/failure outcome, etc. The saved-object update method recently got changed to support downward-compatible updates (#152807) by performing a get + Kibana side update before calling index. These extra requests when running many rules add extra latency and I/O to Elasticsearch that we do not need to perform when updating rules at the end of their execution.

Definition of Done

  • Rules update themselves after a run by using the Elasticsearch client directly instead of the saved-objects client
  • Only the necessary fields are passed to the update function to perform a partial update
  • Preferably, other code using the partiallyUpdateRule function also leverage benefits from this work
@mikecote mikecote added Feature:Alerting Team:ResponseOps Label for the ResponseOps team (formerly the Cases and Alerting teams) labels Sep 9, 2024
@elasticmachine
Copy link
Contributor

Pinging @elastic/response-ops (Team:ResponseOps)

ymao1 added a commit that referenced this issue Sep 30, 2024
…le run instead of SO client. (#193341)

Resolves #192397

## Summary

Updates alerting task runner end of run updates to use the ES client
update function for a true partial update instead of the saved objects
client update function that performs a GET then an update.

## To verify
Create a rule in multiple spaces and ensure they run correctly and their
execution status and monitoring history are updated at the end of each
run. Because we're performing a partial update on attributes that are
not in the AAD, the rule should continue running without any encryption
errors.

## Risk Matrix

| Risk | Probability | Severity | Mitigation/Notes |

|---------------------------|-------------|----------|-------------------------|
| Updating saved object directly using ES client will break BWC | Medium
| High | Response Ops follows an intermediate release strategy for any
changes to the rule saved object where schema changes are introduced in
an intermediate release before any changes to the saved object are
actually made in a followup release. This ensures that any rollbacks
that may be required in a release will roll back to a version that is
already aware of the new schema. The team is socialized to this strategy
as we are requiring users of the alerting framework to also follow this
strategy. This should address any backward compatibility issues that
might arise by circumventing the saved objects client update function. |
| Updating saved object directly using ES client will break AAD | Medium
| High | An explicit allowlist of non-AAD fields that are allowed to be
partially updated has been introduced and any fields not in this
allowlist will not be included in the partial update. Any updates to the
rule saved object that might break AAD would show up with > 1 execution
of a rule and we have a plethora of functional tests that rely on
multiple executions of a rule that would flag if there were issues
running due to AAD issues. |

---------

Co-authored-by: kibanamachine <[email protected]>
Co-authored-by: Elastic Machine <[email protected]>
kibanamachine pushed a commit to kibanamachine/kibana that referenced this issue Sep 30, 2024
…le run instead of SO client. (elastic#193341)

Resolves elastic#192397

## Summary

Updates alerting task runner end of run updates to use the ES client
update function for a true partial update instead of the saved objects
client update function that performs a GET then an update.

## To verify
Create a rule in multiple spaces and ensure they run correctly and their
execution status and monitoring history are updated at the end of each
run. Because we're performing a partial update on attributes that are
not in the AAD, the rule should continue running without any encryption
errors.

## Risk Matrix

| Risk | Probability | Severity | Mitigation/Notes |

|---------------------------|-------------|----------|-------------------------|
| Updating saved object directly using ES client will break BWC | Medium
| High | Response Ops follows an intermediate release strategy for any
changes to the rule saved object where schema changes are introduced in
an intermediate release before any changes to the saved object are
actually made in a followup release. This ensures that any rollbacks
that may be required in a release will roll back to a version that is
already aware of the new schema. The team is socialized to this strategy
as we are requiring users of the alerting framework to also follow this
strategy. This should address any backward compatibility issues that
might arise by circumventing the saved objects client update function. |
| Updating saved object directly using ES client will break AAD | Medium
| High | An explicit allowlist of non-AAD fields that are allowed to be
partially updated has been introduced and any fields not in this
allowlist will not be included in the partial update. Any updates to the
rule saved object that might break AAD would show up with > 1 execution
of a rule and we have a plethora of functional tests that rely on
multiple executions of a rule that would flag if there were issues
running due to AAD issues. |

---------

Co-authored-by: kibanamachine <[email protected]>
Co-authored-by: Elastic Machine <[email protected]>
(cherry picked from commit 05926c2)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Feature:Alerting Team:ResponseOps Label for the ResponseOps team (formerly the Cases and Alerting teams)
Projects
None yet
3 participants