-
Notifications
You must be signed in to change notification settings - Fork 8.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Alerting] Active alerts do not recover after re-enabling a rule #111671
[Alerting] Active alerts do not recover after re-enabling a rule #111671
Conversation
@elasticmachine merge upstream |
Pinging @elastic/kibana-alerting-services (Team:Alerting Services) |
@pmuellr do you think the additional event log flag could be introduced as a separate issue or it is possible to be added in the current scope? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great! I verified this works as expected by
- creating a rule that will have active alerts, let it run
- disable the rule
- verify event log docs for
recovered-instance
are written for each active instance - edit the rule so that the condition will no longer be met
- enable the rule and see
Recovered
in the Rule Details view
I also tried
- creating a rule that will have active alerts, let it run
- disable the rule
- verify event log docs for
recovered-instance
are written for each active instance - enable the rule (without editing) and I see
Recovered
in the Rule Details view, but they quickly change back to Active.
I guess in the second scenario if the rule takes a while to execute, you would see Recovered
in the UI for the execution duration, which may be a little confusing? But I think that is ok.
Can we add functional tests for this?
x-pack/plugins/alerting/server/lib/create_alert_event_log_record_object.ts
Outdated
Show resolved
Hide resolved
x-pack/plugins/alerting/server/lib/create_alert_event_log_record_object.ts
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! Works as expected
x-pack/plugins/alerting/server/lib/create_alert_event_log_record_object.test.ts
Outdated
Show resolved
Hide resolved
I assume a new property in the event log mappings? This would be something we add to the One thing we could do short-term to make these mostly searchable in the event log, would be to update the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM; love the new createAlertEventLogRecordObject()
so we can have a single place (or at least fewer places) to update when we add new fields ...
@elasticmachine merge upstream |
@elasticmachine merge upstream |
@elasticmachine merge upstream |
💚 Build Succeeded
Metrics [docs]
History
To update your PR or re-run it, just comment with: cc @YulNaumenko |
…-migrate-away-from-injected-css-js * 'master' of github.com:elastic/kibana: (237 commits) [Uptime] Added uptime query inspector panel (elastic#115170) [Osquery] Add packs (elastic#107345) [App Search] Allow for query parameter to indicate ingestion mechanism for new engines (elastic#115188) [Alerting] Active alerts do not recover after re-enabling a rule (elastic#111671) skip flaky tests. elastic#115308, elastic#115313 [Breaking] Remove deprecated `enabled` settings from plugins. (elastic#113495) skip flaky suite. elastic#107057 skip flaky tests. elastic#89052, elastic#113418, elastic#115304 skip flaky test. elastic#113892 Bump node to 16.11.1 (elastic#110684) [Security Solution] Restores Alerts table local storage persistence and the Remove Column action (elastic#114742) skip flaky suite. elastic#115130 one line remove assert (elastic#115127) Fixes migration bug where I was deleting attributes (elastic#115098) [Security Solutions] Fixes the newer notification system throttle resets and enabling immediate execution on first detection of a signal (elastic#114214) [build] Dockerfile update (elastic#115237) Fixes Cypress flake cypress test (elastic#115270) Disable APM e2e tests log an invalid type for SO (elastic#115175) [Fleet] Don't auto upgrade policies for AUTO_UPDATE packages (elastic#115199) ... # Conflicts: # src/plugins/dashboard/public/application/dashboard_app.tsx # src/plugins/dashboard/public/types.ts # x-pack/plugins/reporting/server/lib/layouts/print_layout.ts
…-link-to-kibana-app * 'master' of github.com:elastic/kibana: (287 commits) [Security Solution][Endpoint] Change `trustedAppByPolicyEnabled` flag to `true` by default (elastic#115264) [APM] generator: support error events and application metrics (elastic#115311) [kibanaUtils] Don't import full `semver` client side (elastic#114986) [RAC] Link inventory alerts to the right inventory view (elastic#113553) [Uptime] Added uptime query inspector panel (elastic#115170) [Osquery] Add packs (elastic#107345) [App Search] Allow for query parameter to indicate ingestion mechanism for new engines (elastic#115188) [Alerting] Active alerts do not recover after re-enabling a rule (elastic#111671) skip flaky tests. elastic#115308, elastic#115313 [Breaking] Remove deprecated `enabled` settings from plugins. (elastic#113495) skip flaky suite. elastic#107057 skip flaky tests. elastic#89052, elastic#113418, elastic#115304 skip flaky test. elastic#113892 Bump node to 16.11.1 (elastic#110684) [Security Solution] Restores Alerts table local storage persistence and the Remove Column action (elastic#114742) skip flaky suite. elastic#115130 one line remove assert (elastic#115127) Fixes migration bug where I was deleting attributes (elastic#115098) [Security Solutions] Fixes the newer notification system throttle resets and enabling immediate execution on first detection of a signal (elastic#114214) [build] Dockerfile update (elastic#115237) ... # Conflicts: # x-pack/plugins/reporting/public/management/__snapshots__/report_listing.test.tsx.snap
Summary
Partially resolves #110080
Based on the multiple discussions about, how alerting framework should
resolve
the alert instances in the case when the rule was disabled, the decision was made:to move forward with the recovery option for the short term, with a solution that will not trigger resolve-on-recovery for disabled alerts that are set up to resolve incidents on recovery
.Checklist