Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

OSD-6646: Wait for ready job before configuring PD #147

Merged
merged 1 commit into from
Mar 18, 2021

Conversation

2uasimojo
Copy link
Member

@2uasimojo 2uasimojo commented Mar 17, 2021

This commit coordinates with the osd-cluster-ready Job to avoid throwing alerts while the cluster is provisioning. We configure DMS, but wait to configure PagerDuty until that Job has completed (success or failure). We wait a maximum of two hours (configurable via the MAX_CLUSTER_AGE_MINUTES env var) for the Job to complete, at which point we configure PagerDuty anyway.

OSD-6646

This commit coordinates with the osd-cluster-ready Job to avoid throwing
alerts while the cluster is provisioning. We configure DMS, but wait to
configure PagerDuty until that Job has completed (success or failure).
We wait a maximum of two hours (configurable via the
`MAX_CLUSTER_AGE_MINUTES` env var) for the Job to complete, at which
point we configure PagerDuty anyway.

OSD-6646
@codecov
Copy link

codecov bot commented Mar 17, 2021

Codecov Report

Merging #147 (3546968) into master (f952dd2) will decrease coverage by 9.41%.
The diff coverage is 11.11%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #147      +/-   ##
==========================================
- Coverage   64.54%   55.12%   -9.42%     
==========================================
  Files           8        9       +1     
  Lines         471      566      +95     
==========================================
+ Hits          304      312       +8     
- Misses        153      240      +87     
  Partials       14       14              
Impacted Files Coverage Δ
pkg/readiness/cluster_ready.go 0.00% <0.00%> (ø)
pkg/controller/secret/secret_controller.go 88.03% <68.75%> (-0.89%) ⬇️

2uasimojo added a commit to 2uasimojo/osd-cluster-ready that referenced this pull request Mar 17, 2021
This commit removes all "max cluster age" and silencing logic, which will
[henceforth be done in configure-alertmanager-operator](openshift/configure-alertmanager-operator#147).

OSD-6646
@jharrington22
Copy link
Contributor

/lgtm

Major difference here with what we are currently doing in osd-cluster-ready is that we're now querying prom using Go libs which is great and looks ok to me.

@2uasimojo testing in stage shows this is ok? Have you been able to test the MAX_CLUSTER_AGE_MINUTES threshold?

@jharrington22
Copy link
Contributor

/lgtm

@openshift-ci-robot openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Mar 18, 2021
@jewzaam
Copy link
Member

jewzaam commented Mar 18, 2021

/approve

@openshift-ci-robot
Copy link

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: 2uasimojo, jewzaam, jharrington22

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci-robot openshift-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Mar 18, 2021
@openshift-merge-robot openshift-merge-robot merged commit 55f6f6a into openshift:master Mar 18, 2021
@2uasimojo
Copy link
Member Author

@2uasimojo testing in stage shows this is ok? Have you been able to test the MAX_CLUSTER_AGE_MINUTES threshold?

Yes and yes. All the code paths are 👍

@2uasimojo 2uasimojo deleted the OSD-6646-2 branch March 18, 2021 15:57
2uasimojo added a commit to 2uasimojo/managed-cluster-config that referenced this pull request Mar 18, 2021
Since
openshift/configure-alertmanager-operator#147,
configure-alertmanager-operator needs to bind to the
`cluster-monitoring-view` ClusterRole in order to talk to prometheus.
However, due to OLM limitations, that ClusterRoleBinding can't be
deployed with c-am-o's CSV. Therefore we deploy it from here.

OSD-6646
2uasimojo added a commit to 2uasimojo/managed-cluster-config that referenced this pull request Mar 18, 2021
Since
openshift/configure-alertmanager-operator#147,
configure-alertmanager-operator needs to bind to the
`cluster-monitoring-view` ClusterRole in order to talk to prometheus.
However, due to OLM limitations, that ClusterRoleBinding can't be
deployed with c-am-o's CSV. Therefore we deploy it from here.

OSD-6646
2uasimojo added a commit to 2uasimojo/configure-alertmanager-operator that referenced this pull request Mar 18, 2021
Since openshift#147, c-am-o needs to bind to the cluster-monitoring-view
ClusterRole in order to be able to talk to prometheus. However, due to
OLM limitations, we can't ship ClusterRoleBindings within the CSV or
bundle. So instead ship it along with the OLMisms.

OSD-6646
2uasimojo added a commit to 2uasimojo/configure-alertmanager-operator that referenced this pull request Mar 18, 2021
Since openshift#147, c-am-o needs to bind to the cluster-monitoring-view
ClusterRole in order to be able to talk to prometheus. However, due to
OLM limitations, we can't ship ClusterRoleBindings within the CSV or
bundle. So instead ship it along with the OLMisms.

OSD-6646
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. lgtm Indicates that a PR is ready to be merged.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants