Add alert rules to argo-controller based on the KF093 spec #195
I think we are missing tests for this change. I believe that in issue canonical/bundle-kubeflow#1026 we talked about using https://github.com/canonical/charmed-kubeflow-chisme/blob/b64f6ccca228d08c6533cbe83382306cd46670dc/src/charmed_kubeflow_chisme/testing/cos_integration.py#L413 for testing.
Can you also please provide minimal steps for a reviewer to test it locally? What should I deploy, and what should I check?
charms/argo-controller/src/prometheus_alert_rules/KubeflowArgoControllerServices.rules
That's the recommended way. I created this and other PRs like it with a script, so I did not check whether each charm implements the tests or not. That means that for some charms (like this one) we only update the alert rule, and for others we add it. Do you think we should include those tests in all of them?
Something like deploying grafana-agent and checking that the alert rule is part of the relation, or deploying the whole COS and checking it in the Grafana UI? WDYT?
LGTM :)
* Add alert rules to argo-controller based on the KF093 spec
* Delete charms/argo-controller/src/prometheus_alert_rules/unit_unavailable.rule

Co-authored-by: Robert Gildein <[email protected]>
These alert rules provide an overview of the state of every service.
By filtering on the KubeflowServiceDown or KubeflowServiceIsNotStable alert, the user
can easily see the status of all Kubeflow services.
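As a quick sanity check independent of any deployment, something like the sketch below could confirm that a rules file declares both alert names. It is only an illustration: `SAMPLE_RULES` is a hypothetical stand-in for the contents of KubeflowArgoControllerServices.rules (the expressions, durations and severities are placeholders, not the ones from the KF093 spec), and `alert_names` / `missing_alerts` are helper names made up here, not part of charmed-kubeflow-chisme.

```python
import re

# Hypothetical stand-in for the real rules file; expr/for/severity values
# are placeholders and are NOT taken from the KF093 spec.
SAMPLE_RULES = """\
groups:
- name: KubeflowArgoControllerServices
  rules:
  - alert: KubeflowServiceDown
    expr: up{} < 1
    for: 5m
    labels:
      severity: critical
  - alert: KubeflowServiceIsNotStable
    expr: avg_over_time(up{}[10m]) < 0.5
    for: 0m
    labels:
      severity: warning
"""

# Alert names mentioned in the PR description.
EXPECTED_ALERTS = {"KubeflowServiceDown", "KubeflowServiceIsNotStable"}


def alert_names(rules_text):
    """Return the set of names declared on `alert: <name>` lines."""
    pattern = r"^[ \t]*-?[ \t]*alert:[ \t]*(\S+)[ \t]*$"
    return set(re.findall(pattern, rules_text, flags=re.MULTILINE))


def missing_alerts(rules_text, expected=EXPECTED_ALERTS):
    """Return the expected alert names that the rules text does not declare."""
    return set(expected) - alert_names(rules_text)


if __name__ == "__main__":
    # In a real check, read the .rules file from the charm instead of SAMPLE_RULES.
    missing = missing_alerts(SAMPLE_RULES)
    print("missing alerts:", sorted(missing))
```

This only checks that the names are present in the text; it does not validate the PromQL expressions or confirm that the rules reach Prometheus over the relation.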
These changes can be tested by running the following commands:
part-of: #1026