Alerts for failing OLM operators #1104

Merged

anik120
Contributor

@anik120 anik120 commented Oct 31, 2019

Description of the change:

  • Add a Prometheus rule that fires an alert when the csv_abnormal metric
    is emitted with phase=failed.
  • The alert message contains the name and version of the operator.
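
For illustration, a rule of this kind might look roughly like the sketch below. It is not the manifest from this PR. The resource name, group name, alert name, `for` duration, severity, and message wording are all assumptions, as are the label names `name` and `version` and the exact casing of the `phase` value. Only the csv_abnormal metric and the phase=failed condition come from the description above.

```yaml
# Hypothetical sketch of a PrometheusRule; not the file added by this PR.
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: olm-alert-rules                # assumed resource name
spec:
  groups:
    - name: olm.csv_abnormal.rules     # assumed group name
      rules:
        - alert: CsvAbnormalFailed     # assumed alert name
          # Match only series that report the failed phase.
          # The label value casing ("Failed") is an assumption.
          expr: csv_abnormal{phase="Failed"}
          # Assumed hold time, so that only a persistent failure fires.
          for: 2m
          labels:
            severity: warning          # assumed severity
          annotations:
            # Assumes the metric carries "name" and "version" labels.
            message: "Operator {{ $labels.name }} version {{ $labels.version }} is in a failed state."
```

If a rule like this is shipped inside a Helm chart template, the `{{ $labels... }}` expressions would need escaping so that Helm does not try to render them itself.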

Motivation for the change:

As an OpenShift administrator, I want to receive a notification when an operator is consistently reaching a failed state so that I can immediately address known issues.

Enhancement proposal

Reviewer Checklist

  • Implementation matches the proposed design, or proposal is updated to match implementation
  • Sufficient unit test coverage
  • Sufficient end-to-end test coverage
  • Docs updated or added to /docs
  • Commit messages sensible and descriptive

@openshift-ci-robot openshift-ci-robot added the size/M Denotes a PR that changes 30-99 lines, ignoring generated files. label Oct 31, 2019
@anik120 anik120 force-pushed the add-promethues-rule branch 4 times, most recently from b84eed6 to 64525b5 Compare October 31, 2019 20:25
Member

@ecordell ecordell left a comment

/lgtm

@openshift-ci-robot openshift-ci-robot added lgtm Indicates that a PR is ready to be merged. approved Indicates a PR has been approved by an approver from all required OWNERS files. labels Oct 31, 2019
Member

@awgreene awgreene left a comment

Great work on this @anik120! One issue that needs to be addressed, but looks good otherwise!

Review comment on deploy/chart/templates/0000_90_olm_01-prometheus-rule.yaml (outdated, resolved)
@ecordell
Member

ecordell commented Nov 1, 2019

/retest

@anik120
Contributor Author

anik120 commented Nov 1, 2019

/test e2e-aws-olm

@anik120 anik120 force-pushed the add-promethues-rule branch from 64525b5 to f358251 Compare November 1, 2019 13:26
@openshift-ci-robot openshift-ci-robot removed the lgtm Indicates that a PR is ready to be merged. label Nov 1, 2019
@anik120 anik120 force-pushed the add-promethues-rule branch 3 times, most recently from 3ca37d4 to bf784ab Compare November 1, 2019 14:22
@anik120 anik120 force-pushed the add-promethues-rule branch from bf784ab to 1a68799 Compare November 1, 2019 14:36
@anik120
Contributor Author

anik120 commented Nov 1, 2019

/retest

1 similar comment
@awgreene
Member

awgreene commented Nov 1, 2019

/retest

@anik120 anik120 force-pushed the add-promethues-rule branch from 1a68799 to 0d3a9c2 Compare November 1, 2019 16:30
@ecordell
Member

ecordell commented Nov 1, 2019

/lgtm

@openshift-ci-robot openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Nov 1, 2019
@awgreene
Member

awgreene commented Nov 1, 2019

/lgtm
/retest

@openshift-ci-robot
Collaborator

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: anik120, awgreene, ecordell

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@anik120
Contributor Author

anik120 commented Nov 1, 2019

/test e2e-aws-console-olm

1 similar comment
@anik120
Contributor Author

anik120 commented Nov 1, 2019

/test e2e-aws-console-olm

@openshift-merge-robot openshift-merge-robot merged commit a884018 into operator-framework:master Nov 1, 2019
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. lgtm Indicates that a PR is ready to be merged. size/M Denotes a PR that changes 30-99 lines, ignoring generated files.
5 participants