Enable metrics collection for multiple provider pods #108

Piotr1215 · 2023-05-11T14:15:06Z

Description of your changes

As part of resource utilization testing, we use this tool to collect Prometheus metrics from provider pods. This PR adds support for collecting Prometheus metrics from multiple provider pods while remaining functionally backwards compatible.

There is one breaking change in the command-line parameters: the flag --provider-pod is renamed to --provider-pods to better describe the functionality.

Fixes #

I have:

Run make reviewable test to ensure this PR is ready for review.

How has this code been tested

Tested locally, sample test output:

➜ just run_tests gcp 2
time="2023-05-11T17:55:38+02:00" level=info msg="Experiment Started 2023-05-11 17:55:38.367240392 +0200 CEST m=+0.013268059\n\n"
bucket.storage.gcp.upbound.io/test1 created
bucket.storage.gcp.upbound.io/test2 created
time="2023-05-11T17:55:39+02:00" level=info msg="Checking readiness of resources..."
time="2023-05-11T17:55:50+02:00" level=info msg="Checking readiness of resources..."
time="2023-05-11T17:56:00+02:00" level=info msg="Checking readiness of resources..."
time="2023-05-11T17:56:10+02:00" level=info msg="Checking readiness of resources..."
time="2023-05-11T17:56:20+02:00" level=info msg="Checking readiness of resources..."
time="2023-05-11T17:56:30+02:00" level=info msg="Checking readiness of resources..."
time="2023-05-11T17:56:40+02:00" level=info msg="Checking readiness of resources..."
time="2023-05-11T17:56:50+02:00" level=info msg="Checking readiness of resources..."
time="2023-05-11T17:56:50+02:00" level=info msg="Calculating readiness time of resources..."
time="2023-05-11T17:56:50+02:00" level=info msg="Deleting resources..."
bucket.storage.gcp.upbound.io "test1" deleted
bucket.storage.gcp.upbound.io "test2" deleted
time="2023-05-11T17:57:55+02:00" level=info msg="\nExperiment Ended 2023-05-11 17:57:55.33495993 +0200 CEST m=+136.980987669\n\n"
time="2023-05-11T17:57:55+02:00" level=info msg="Results\n------------------------------------------------------------\n"
time="2023-05-11T17:57:55+02:00" level=info msg="Experiment Duration: 136.967720 seconds\n"
time="2023-05-11T17:58:55+02:00" level=info msg="Average Time to Readiness of Bucket: 64.500000 seconds \n"
time="2023-05-11T17:58:55+02:00" level=info msg="Peak Time to Readiness of Bucket: 65.000000 seconds \n"
time="2023-05-11T17:58:55+02:00" level=info msg="Pod: upbound-release-candidates-provider-family-gcp-9d0a6cba600mml5q"
time="2023-05-11T17:58:55+02:00" level=info msg="Average Memory: 18956433.543147 Bytes \n"
time="2023-05-11T17:58:55+02:00" level=info msg="Peak Memory: 19038208.000000 Bytes \n"
time="2023-05-11T17:58:55+02:00" level=info msg="Pod: upbound-release-candidates-provider-family-gcp-9d0a6cba600mml5q"
time="2023-05-11T17:58:55+02:00" level=info msg="Average CPU: 1.494668 Rate \n"
time="2023-05-11T17:58:55+02:00" level=info msg="Peak CPU: 2.197685 Rate \n"
time="2023-05-11T17:58:55+02:00" level=info msg="Average Time to Readiness of Bucket: 64.500000 seconds \n"
time="2023-05-11T17:58:55+02:00" level=info msg="Peak Time to Readiness of Bucket: 65.000000 seconds \n"
time="2023-05-11T17:58:55+02:00" level=info msg="Pod: upbound-release-candidates-provider-gcp-storage-3c1d8cc0c02jrlx"
time="2023-05-11T17:58:55+02:00" level=info msg="Average Memory: NaN Bytes \n"
time="2023-05-11T17:58:55+02:00" level=info msg="Peak Memory: 0.000000 Bytes \n"
time="2023-05-11T17:58:55+02:00" level=info msg="Pod: upbound-release-candidates-provider-gcp-storage-3c1d8cc0c02jrlx"
time="2023-05-11T17:58:55+02:00" level=info msg="Average CPU: 1.494668 Rate \n"
time="2023-05-11T17:58:55+02:00" level=info msg="Peak CPU: 2.197685 Rate \n"
time="2023-05-11T17:58:55+02:00" level=info msg="\nAggregated Results\n------------------------------------------------------------\n"
time="2023-05-11T17:58:55+02:00" level=info msg="Average CPU: NaN  \n"
time="2023-05-11T17:58:55+02:00" level=info msg="Peak CPU: 19038208.000000  \n"
time="2023-05-11T17:58:55+02:00" level=info msg="Average Memory: 2.989335  \n"
time="2023-05-11T17:58:55+02:00" level=info msg="Peak Memory: 2.197685  \n"

Upbound-CLA · 2023-05-11T14:15:10Z

All committers have signed the CLA.

sergenyalcin

Thanks @Piotr1215 LGTM!

Gather Prometheus metrics for one or multiple provider pods

28c4213

Piotr1215 force-pushed the perf-tool-small-providers branch from d412cfd to 28c4213 Compare May 11, 2023 14:41

Piotr Zaniewski added 3 commits May 11, 2023 17:31

Metrics collection results now work for multiple pods

297b326

Lower cyclomatic complexity by refactoring to smaller functions

103d5f8

Aggregated metrics do not have pod names, no need to print them out

a68754c

Piotr1215 marked this pull request as ready for review May 11, 2023 16:03

Piotr1215 requested review from ulucinar and sergenyalcin as code owners May 11, 2023 16:03

Piotr1215 changed the title ~~WIP: Enable metrics collection for multiple provider pods~~ Enable metrics collection for multiple provider pods May 11, 2023

Adding missing labels to the aggregated memory and CPU metrics

ee69b3b

sergenyalcin approved these changes May 16, 2023

Piotr1215 merged commit 5b0c89e into main May 17, 2023

Piotr1215 deleted the perf-tool-small-providers branch May 17, 2023 11:03
