ROX-15136: filter aws metrics by cluster name #857

Merged
merged 1 commit into from
Apr 11, 2023

Conversation

stehessel
Contributor

@stehessel stehessel commented Feb 27, 2023

Description

This is a follow-up to #846, which added cluster name tags to the Central RDS instances. Now we want the CloudWatch exporter to filter the exported metrics by these cluster names, so that we only export metrics for RDS instances that belong to a specific ACS data plane region.

I decided to switch the exporter from https://github.com/prometheus/cloudwatch_exporter to https://github.com/nerdswords/yet-another-cloudwatch-exporter, because it has better filter and region support.
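
For illustration, a tag-filtered discovery job in the new exporter could look roughly like the sketch below. This is not the configuration from this PR: the tag key, the tag value, the regions, and the chosen metric are all placeholders, and the field names should be checked against the documentation of the exporter version in use.

```yaml
# Minimal sketch of a yet-another-cloudwatch-exporter config (assumptions marked).
apiVersion: v1alpha1
discovery:
  jobs:
    - type: AWS/RDS
      # Several regions can be listed in one job (hypothetical values).
      regions:
        - us-east-1
        - eu-west-1
      # Only resources whose tag matches are discovered.
      # Tag key and value are hypothetical, not the ones added in #846.
      searchTags:
        - key: ExampleClusterNameTag
          value: ^example-cluster$
      metrics:
        # Example metric only; the PR's actual metric list is not shown here.
        - name: CPUUtilization
          statistics:
            - Average
          period: 300
          length: 300
```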

The metrics themselves do not change; only the labels change slightly. Most notably, the dimension label changes from dbinstance_identifier to dimension_DBInstanceIdentifier, so we will have to update the alerts and dashboards accordingly.
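
As a rough illustration of the kind of update an alert needs, the rule below selects on the new label name. The alert name, metric name, threshold, and instance pattern are assumptions made for this sketch and are not taken from the existing alerts.

```yaml
# Hypothetical Prometheus alerting rule using the new dimension label.
groups:
  - name: rds-example
    rules:
      - alert: ExampleRDSHighCPU
        # Previously the selector would have used dbinstance_identifier.
        # Metric name and threshold are assumed for illustration.
        expr: aws_rds_cpuutilization_average{dimension_DBInstanceIdentifier=~"example-.*"} > 90
        for: 15m
        labels:
          severity: warning
        annotations:
          summary: "High CPU on RDS instance {{ $labels.dimension_DBInstanceIdentifier }}"
```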

Checklist (Definition of Done)

  • Unit and integration tests added
  • Added test description under Test manual
  • Documentation added if necessary (i.e. changes to dev setup, test execution, ...)
  • CI and all relevant tests are passing
  • Add the ticket number to the PR title if available, i.e. ROX-12345: ...
  • Discussed security and business related topics privately. Will move any security and business related topics that arise to private communication channel.

Test manual

Tested the exporter on a dev cluster.

@stehessel stehessel temporarily deployed to development February 27, 2023 16:19 — with GitHub Actions Inactive
@stehessel stehessel requested a review from a team February 27, 2023 16:20
@stehessel
Contributor Author

/retest

@@ -5,4 +5,4 @@ aws:

   clusterName: ""
   environment: ""
-  image: "quay.io/prometheus/cloudwatch-exporter:v0.15.1"
+  image: "ghcr.io/nerdswords/yet-another-cloudwatch-exporter:v0.48.0-alpha"
Contributor

Why this change?

Contributor Author

because it has better filter and region support

To make it more concrete: it can pull metrics from multiple regions, and it supports auto-discovery of resources via tags (and it's written in Golang rather than Java 😅).

@stehessel stehessel force-pushed the ROX-15136/filter-by-cluster-name branch from 0a1d490 to 2228297 Compare March 2, 2023 15:23
@stehessel stehessel force-pushed the ROX-15136/filter-by-cluster-name branch from 2228297 to 8ae0ee8 Compare March 2, 2023 15:25
@stehessel stehessel temporarily deployed to development March 2, 2023 15:25 — with GitHub Actions Inactive
@stehessel
Contributor Author

Rebased onto main to fix the e2e test.

@stehessel
Contributor Author

/retest

ports:
  - name: monitoring
    containerPort: 9106
Contributor

Is that port forwarded? Does any service require adjustment here?

Contributor Author

The pod monitor references it by its port name, monitoring, so no other change should be necessary.
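
For context, a PodMonitor that scrapes by port name rather than by number looks roughly like this. The resource name and pod labels are hypothetical and are not copied from this repository's manifests.

```yaml
# Sketch of a Prometheus Operator PodMonitor selecting the container port by name.
apiVersion: monitoring.coreos.com/v1
kind: PodMonitor
metadata:
  name: example-cloudwatch-exporter   # hypothetical name
spec:
  selector:
    matchLabels:
      app: example-cloudwatch-exporter   # hypothetical pod label
  podMetricsEndpoints:
    # Matches the container port named "monitoring", whatever its number is.
    - port: monitoring
      path: /metrics
```

Because the endpoint is resolved by name, changing containerPort in the pod spec does not require a matching change in the monitor.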

@openshift-ci
Contributor

openshift-ci bot commented Mar 29, 2023

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: rhybrillou, stehessel
Once this PR has been reviewed and has the lgtm label, please assign simonbaeumer for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

5 participants