Maintenance window is not working on index threshold rule #163968

Closed
bhavyarm opened this issue Aug 15, 2023 · 8 comments
Labels: bug (fixes for quality problems that affect the customer experience), Team:ResponseOps (label for the ResponseOps team, formerly the Cases and Alerting teams)


@bhavyarm
Contributor

bhavyarm commented Aug 15, 2023

Kibana version: 8.9.1 BC1

Elasticsearch version: 8.9.1 BC1

Browser version: chrome latest

Browser OS version: OS X

Original install method (e.g. download page, yum, from source, etc.): on cloud

Describe the bug: The maintenance window is not suppressing rule notifications for an index threshold rule.

Here is my threshold rule:

Screenshot 2023-08-15 at 3 08 19 PM Screenshot 2023-08-15 at 3 08 25 PM Screenshot 2023-08-15 at 3 08 31 PM

Rule history. Note the active actions between 2:59 pm EST and 3:05 pm EST:
Screenshot 2023-08-15 at 3 09 31 PM

And here is my maintenance window:
maintenance_window

Maintenance window column on the rule page. You can see the value between 2:59 pm EST and 3:05 pm EST:
Screenshot 2023-08-15 at 3 23 57 PM
Screenshot 2023-08-15 at 3 24 18 PM

And here are my Slack alerts for the suppressed time in the maintenance window:

3:01
alert 'stack_index_01' is active for group 'all documents':
  • Value: 4
  • Conditions Met: count is greater than 2 over 1h
  • Timestamp: 2023-08-15T19:00:58.029Z

3:02
alert 'stack_index_01' is active for group 'all documents':
  • Value: 4
  • Conditions Met: count is greater than 2 over 1h
  • Timestamp: 2023-08-15T19:02:01.065Z

3:03
alert 'stack_index_01' is active for group 'all documents':
  • Value: 4
  • Conditions Met: count is greater than 2 over 1h
  • Timestamp: 2023-08-15T19:03:01.230Z

3:04
alert 'stack_index_01' is active for group 'all documents':
  • Value: 4
  • Conditions Met: count is greater than 2 over 1h
  • Timestamp: 2023-08-15T19:04:04.222Z

3:05
alert 'stack_index_01' is active for group 'all documents':
  • Value: 4
@elasticmachine
Contributor

Pinging @elastic/response-ops (Team:ResponseOps)

@JiaweiWu
Contributor

Just had a call with @bhavyarm and @doakalexi to debug this. We determined that this isn't a bug, since the maintenance window documentation (https://www.elastic.co/guide/en/kibana/8.9/maintenance-windows.html) states:

When an alert occurs before a maintenance window and recovers during or after the maintenance window, notifications are sent as usual.

The alert that was used to test was initially created outside of the maintenance window. According to the docs, it should continue to send notifications. Once we tested the case where the alert was created during the maintenance window, we were able to verify that it did not send any notifications and that it had the maintenance ID associated with the alert.
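
For anyone skimming this thread, the behaviour described above can be sketched roughly as follows. This is only an illustration of the documented rule, not Kibana's actual implementation: the type names, function names, and the choice to treat the window end as exclusive are all assumptions made for the example.

```typescript
// Hypothetical types for illustration only; these are NOT Kibana's internal API.
interface MaintenanceWindowSketch {
  id: string;
  start: Date;
  end: Date;
}

interface AlertSketch {
  id: string;
  startedAt: Date; // when the alert was first created (became active)
}

// Returns the IDs of the windows that were active at the moment the alert was created.
// Assumption: the window start is inclusive and the window end is exclusive.
function windowsActiveAtAlertStart(
  alert: AlertSketch,
  windows: MaintenanceWindowSketch[]
): string[] {
  const t = alert.startedAt.getTime();
  return windows
    .filter((w) => w.start.getTime() <= t && t < w.end.getTime())
    .map((w) => w.id);
}

// Notifications are suppressed only for alerts created inside a window.
// An alert created before the window keeps notifying, even while the window is running.
function shouldSuppressNotifications(
  alert: AlertSketch,
  windows: MaintenanceWindowSketch[]
): boolean {
  return windowsActiveAtAlertStart(alert, windows).length > 0;
}
```

With a window running from 18:59 to 19:05 UTC, an alert whose `startedAt` is 18:30 UTC would keep sending notifications throughout the window, while an alert whose `startedAt` is 19:01 UTC would be suppressed and carry the window's ID. The times here are illustrative; the thread does not record when the test alert was actually created.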

@bhavyarm brought up a good point that this might become a point of confusion for users, perhaps we should directly link the documentation somewhere in the maintenance window page? @shanisagiv1

@bhavyarm
Contributor Author

@bhavyarm brought up a good point that this might become a point of confusion for users, perhaps we should directly link the documentation somewhere in the maintenance window page? @shanisagiv1

Yes please. The maintenance window page gives no details beyond "suppress notifications". In-place documentation will help the user. Thanks!

@shanisagiv1

Thanks for debugging and verifying that there is no issue.

Do you think the current note in the UI isn't sufficient? We definitely can improve it or link to docs.

@JiaweiWu can you add a "Learn more" link to this UI note that points to the public guide?
cc @XavierM

Screenshot 2023-08-16 at 10 21 57

@maryam-saeidi
Member

maryam-saeidi commented Aug 16, 2023

@shanisagiv1 When I was reviewing the maintenance window banner PR, I had a similar expectation: that the maintenance window would apply to any alert active during that period. Out of curiosity, was the decision to affect only new alerts made because of technical limitations, or for other reasons?

Also, in the current implementation, it does not matter if the alert's rule has a connector and any notification is actually impacted during that time window. Wouldn't it be better to add the MW to an alert only when a notification is impacted by that MW? --> Answered here

@shanisagiv1

@maryam-saeidi
Are you referring to alerts that were created, recovered, and then changed to active again during the MW?
For those situations, we wanted to take the less "aggressive" approach, if that makes sense. When an MW is active, IT teams are doing proactive work in systems that might generate false alarms. This was the main intention. When an existing alert becomes active again, it might happen for different reasons, and if we marked it as MW, the IR team might miss a real active alert.

To your second question: replied here. Hope it makes sense.

@maryam-saeidi
Member

@shanisagiv1 Yes, that answers my question, thank you :)

@JiaweiWu
Contributor

Closing in favour of #164481
