Send PrometheusNotIngestingSamples for openshift-user-workload-monitoring to null receiver #142
Love this. +1 from me.
Codecov Report
@@            Coverage Diff             @@
##           master     #142      +/-   ##
==========================================
+ Coverage   64.16%   64.23%   +0.07%
==========================================
  Files           8        8
  Lines         466      467       +1
==========================================
+ Hits          299      300       +1
  Misses        153      153
  Partials       14       14
@fahlmant how do we remember to revert this when 4.7 comes around?
@jharrington22 We can make a card and put it in the backlog. We'll have to wait until 4.6 is all gone, so it will be a while.
Let's add a card and I'll +1 this.
@@ -148,6 +148,9 @@ func createPagerdutyRoute() *alertmanager.Route {
// https://issues.redhat.com/browse/OSD-6327
{Receiver: receiverNull, Match: map[string]string{"alertname": "CannotRetrieveUpdates"}},

//https://issues.redhat.com/browse/OSD-6559
{Receiver: receiverNull, Match: map[string]string{"alertname": "PrometheusNotIngestingSamples", "namespace": "openshift-user-workload-monitoring"}},
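For readers unfamiliar with how such a route behaves, here is a minimal sketch of the matching rule: a route applies only when every label in its Match map is present on the alert with the same value, so the new entry silences the alert in the UWM namespace only. The `route` type, `pickReceiver`, and the receiver names "null" and "pagerduty" are assumptions made for illustration; they stand in for the repository's `alertmanager.Route` and its receiver constants, which may differ.

```go
package main

import "fmt"

// route is a hypothetical stand-in for the two fields used in the diff.
type route struct {
	Receiver string
	Match    map[string]string
}

// matches reports whether every label in r.Match is present on the
// alert with exactly the same value.
func (r route) matches(labels map[string]string) bool {
	for k, want := range r.Match {
		if got, ok := labels[k]; !ok || got != want {
			return false
		}
	}
	return true
}

// pickReceiver returns the receiver of the first matching route,
// or fallback when no route matches.
func pickReceiver(routes []route, labels map[string]string, fallback string) string {
	for _, r := range routes {
		if r.matches(labels) {
			return r.Receiver
		}
	}
	return fallback
}

func main() {
	routes := []route{
		{Receiver: "null", Match: map[string]string{
			"alertname": "PrometheusNotIngestingSamples",
			"namespace": "openshift-user-workload-monitoring",
		}},
	}

	uwm := map[string]string{
		"alertname": "PrometheusNotIngestingSamples",
		"namespace": "openshift-user-workload-monitoring",
	}
	platform := map[string]string{
		"alertname": "PrometheusNotIngestingSamples",
		"namespace": "openshift-monitoring",
	}

	fmt.Println(pickReceiver(routes, uwm, "pagerduty"))      // null
	fmt.Println(pickReceiver(routes, platform, "pagerduty")) // pagerduty
}
```

Because the namespace label is part of the match, the same alert firing from another namespace would still reach the fallback receiver in this sketch.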
Why do we get any alerts for this namespace? I expect we need to make sure UWM is up, but is there a specific set of alerts we should allow and discard the rest?
We're responsible for ensuring the resources in this namespace (prom pods, thanos pods) are running and working properly. Maybe we can make a list of Alerts we care about and scope it to that. For the short term, this will generate noise for every deployment by default, so we should silence this, then reevaluate other alerts from this NS IMO.
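The allow-list idea mentioned here could be sketched roughly as follows: one route per alert we care about, followed by a catch-all that sends everything else from the namespace to the null receiver. This assumes routes are evaluated in order with the first match winning. The `route` type, the helper names, the receiver names, and the alert name `ExampleUWMAlert` are all hypothetical and not taken from this repository.

```go
package main

import "fmt"

// route is a hypothetical stand-in for the Receiver and Match fields.
type route struct {
	Receiver string
	Match    map[string]string
}

// allowListRoutes builds one paging route per allowed alert name in the
// namespace, then a final catch-all that drops every other alert from it.
func allowListRoutes(namespace string, allowed []string, pageReceiver, nullReceiver string) []route {
	routes := make([]route, 0, len(allowed)+1)
	for _, name := range allowed {
		routes = append(routes, route{
			Receiver: pageReceiver,
			Match:    map[string]string{"alertname": name, "namespace": namespace},
		})
	}
	// Catch-all: matches any alert carrying this namespace label.
	routes = append(routes, route{
		Receiver: nullReceiver,
		Match:    map[string]string{"namespace": namespace},
	})
	return routes
}

// firstMatch returns the receiver of the first route whose Match labels
// all equal the alert's labels, or fallback when none match.
func firstMatch(routes []route, labels map[string]string, fallback string) string {
	for _, r := range routes {
		ok := true
		for k, want := range r.Match {
			if got, present := labels[k]; !present || got != want {
				ok = false
				break
			}
		}
		if ok {
			return r.Receiver
		}
	}
	return fallback
}

func main() {
	ns := "openshift-user-workload-monitoring"
	routes := allowListRoutes(ns, []string{"ExampleUWMAlert"}, "pagerduty", "null")

	allowed := map[string]string{"alertname": "ExampleUWMAlert", "namespace": ns}
	noisy := map[string]string{"alertname": "PrometheusNotIngestingSamples", "namespace": ns}

	fmt.Println(firstMatch(routes, allowed, "pagerduty")) // pagerduty
	fmt.Println(firstMatch(routes, noisy, "pagerduty"))   // null
}
```

The ordering matters in this sketch: if the catch-all came first, the allowed alerts would be dropped as well.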
Ack. Let's card up something to address this more long term but understood there's a tactical need.
/approve
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: fahlmant, jewzaam, jharrington22
In 4.6, this alert fires by default because UWM has no scrape targets. This is fixed in 4.7, but the monitoring team recommended silencing the alerts.
https://issues.redhat.com/browse/OSD-6559