I'm running Alertmanager 0.21. I have a repeat interval set for different routes and matches. The `repeat_interval` is set to 365d. However, when a few alerts are in the FIRING state, their notifications are sent again at seemingly random times. It seems Alertmanager is not taking my `repeat_interval` into account. The alertname is always the same, and the FIRING time does not change when the notification is repeated.
What did you expect to see?
Alerts not repeated before 365 days
What did you see instead? Under which circumstances?
Alerts are repeated at random times, roughly once every few days.
Alright, so I figured it out. For anyone struggling with a `repeat_interval` longer than a few days: this breaks because of Alertmanager's `--data.retention` parameter. The default `data.retention` is 120h, or 5 days. In my case, Alertmanager "forgets" the `repeat_interval`: once the retention period has passed, the record that a notification was already sent is deleted, so an alert that is still FIRING gets notified again. The root problem is that `data.retention` is poorly documented. I know that there is a warning, but it says nothing about what to expect, and it is easy to miss. I highly recommend adding information about `data.retention` wherever `repeat_interval` is discussed in the Prometheus/Alertmanager documentation. It will save people a lot of trouble.
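For reference, a minimal sketch of the workaround described above: start Alertmanager with a retention at least as long as the longest `repeat_interval`. The config file name is a placeholder, and the value assumes the flag takes a Go-style duration (as the `120h` default suggests), so 365 days is written in hours with one extra day of margin.

```shell
# Sketch only: alertmanager.yml is a placeholder path.
# 365d = 8760h; 8784h (366 days) leaves a small margin over repeat_interval.
alertmanager \
  --config.file=alertmanager.yml \
  --data.retention=8784h
```

Note that a longer retention means Alertmanager keeps its notification log and silences on disk for that long, so this trades some storage for fewer repeated notifications.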
Environment
- System information: production
- Alertmanager version: 0.21.0
- Prometheus version: 2.25.0
The last line of the log should not occur. The server has been down for ~10 days.