ROX-15234: Prometheus metric for pause-reconcile instances #1065

Merged: 3 commits, Jun 2, 2023

vladbologa (Contributor) commented Jun 1, 2023

Description

This PR adds a Prometheus metric that exposes the pause-reconcile instances managed by fleetshard-sync.

The metric is implemented as a Gauge, where value 1 means an instance is pause-reconcile'd, and 0 means it's not. Values are labeled by the instance name.

See also stackrox/rhacs-observability-resources#102

Checklist (Definition of Done)

  • Unit and integration tests added
  • Added test description under Test manual
  • Documentation added if necessary (i.e. changes to dev setup, test execution, ...)
  • CI and all relevant tests are passing
  • Add the ticket number to the PR title if available, i.e. ROX-12345: ...
  • Discussed security and business related topics privately. Will move any security and business related topics that arise to private communication channel.
  • Add secret to app-interface Vault or Secrets Manager if necessary
  • RDS changes were e2e tested manually
  • Check AWS limits are reasonable for changes provisioning new resources

Test manual

Manual testing:

make deploy/dev
kubectl -n acsms port-forward fleetshard-sync-5479d64d89-v4mph 8080:8080

Then opened http://localhost:8080/metrics in a browser and checked that the new metrics are there and have correct values, before and after adding the pause-reconcile annotation, e.g.:

./scripts/create-central.sh
# edit the central and add the annotation stackrox.io/pause-reconcile: "true"
kubectl -n rhacs-chs64i3dabr0026a6fag edit centrals.platform.stackrox.io test-central-1
# To run tests locally run:
make db/teardown db/setup db/migrate
make ocm/setup OCM_OFFLINE_TOKEN=<ocm-offline-token> OCM_ENV=development
make verify lint binary test test/integration
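
The browser check of the /metrics page described above could also be scripted. The following is a rough sketch, assuming the metric is named `pause_reconcile_instances` (a hypothetical name) and that samples carry no trailing timestamp. It filters a metrics page body for samples of that metric whose value is 1.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// metricName is an assumed name; replace it with the name the PR exposes.
const metricName = "pause_reconcile_instances"

// pausedSamples returns the sample lines of metricName whose value is 1.
// It assumes the text exposition format without timestamps, so the last
// whitespace-separated field of a sample line is its value.
func pausedSamples(body string) []string {
	var out []string
	sc := bufio.NewScanner(strings.NewReader(body))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		// Skip blank lines and HELP/TYPE comments.
		if line == "" || strings.HasPrefix(line, "#") {
			continue
		}
		if !strings.HasPrefix(line, metricName+"{") {
			continue
		}
		fields := strings.Fields(line)
		if len(fields) >= 2 && fields[len(fields)-1] == "1" {
			out = append(out, line)
		}
	}
	return out
}

func main() {
	// Sample body for illustration; in practice it would be the response
	// from the port-forwarded endpoint.
	body := "# TYPE pause_reconcile_instances gauge\n" +
		"pause_reconcile_instances{instance=\"test-central-1\"} 1\n" +
		"pause_reconcile_instances{instance=\"test-central-2\"} 0\n"
	for _, s := range pausedSamples(body) {
		fmt.Println(s)
	}
}
```

In a real check, the body would come from an HTTP GET against http://localhost:8080/metrics while the port-forward is running.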

@vladbologa vladbologa temporarily deployed to development June 1, 2023 09:30 — with GitHub Actions Inactive
@openshift-ci openshift-ci bot added the approved label Jun 1, 2023
@vladbologa vladbologa temporarily deployed to development June 1, 2023 09:38 — with GitHub Actions Inactive
@vladbologa vladbologa temporarily deployed to development June 1, 2023 10:53 — with GitHub Actions Inactive
@vladbologa vladbologa requested a review from a team June 1, 2023 13:14
@@ -36,10 +36,11 @@ const (
FreeStatus int32 = iota
BlockedStatus

PauseReconcileAnnotation = "stackrox.io/pause-reconcile"
vladbologa (Contributor, Author) commented on this line:

Could be exported here instead, so that we don't duplicate the annotation. Will do in a follow-up, if it's ok.
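
As an illustration of how a single shared constant could drive the gauge value, here is a small sketch. The helper `pauseGaugeValue` is hypothetical, and treating only the exact string "true" as paused is an assumption based on the annotation value used in the manual test.

```go
package main

import "fmt"

// PauseReconcileAnnotation mirrors the constant shown in the diff.
const PauseReconcileAnnotation = "stackrox.io/pause-reconcile"

// pauseGaugeValue is a hypothetical helper that maps an object's
// annotations to the gauge value: 1 if paused, 0 otherwise.
// Assumption: only the exact value "true" counts as paused.
func pauseGaugeValue(annotations map[string]string) float64 {
	// Reading from a nil map is safe in Go and yields the zero value.
	if annotations[PauseReconcileAnnotation] == "true" {
		return 1
	}
	return 0
}

func main() {
	paused := map[string]string{PauseReconcileAnnotation: "true"}
	fmt.Println(pauseGaugeValue(paused), pauseGaugeValue(nil))
}
```

Referencing one exported constant from both places would avoid the duplication mentioned in the comment.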

SimonBaeumer (Member) left a comment:

Can you wait for @stehessel's review?

openshift-ci bot (Contributor) commented Jun 1, 2023

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: kurlov, SimonBaeumer, stehessel, vladbologa

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:
  • OWNERS [SimonBaeumer,kurlov,vladbologa]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@vladbologa vladbologa merged commit d8d3859 into main Jun 2, 2023
@vladbologa vladbologa deleted the vbologa/pause_reconcile_metric branch June 2, 2023 08:22
4 participants