From 33a18cab7d0f9a21dc5aebbe0194e791c059615a Mon Sep 17 00:00:00 2001 From: bwplotka Date: Mon, 2 Dec 2024 07:39:46 +0000 Subject: [PATCH 1/4] Propose prombench flexible scenarios Signed-off-by: bwplotka --- .../2024-12-04_prombench-custom-benchmarks.md | 152 ++++++++++++++++++ 1 file changed, 152 insertions(+) create mode 100644 proposals/2024-12-04_prombench-custom-benchmarks.md diff --git a/proposals/2024-12-04_prombench-custom-benchmarks.md b/proposals/2024-12-04_prombench-custom-benchmarks.md new file mode 100644 index 0000000..8cd0352 --- /dev/null +++ b/proposals/2024-12-04_prombench-custom-benchmarks.md @@ -0,0 +1,152 @@ +# Prombench Custom Benchmarks + +* **Owners:** + * [@bwplotka](https://github.com/bwplotka) + +* **Implementation Status:** `Not implemented` + +* **Related Issues and PRs:** + * https://github.com/prometheus/test-infra/issues/659 + + +> TL;DR: We propose a way for maintainers to deploy custom Prombench benchmarks via `/prombench <...> -bench.version ` PR command. Benchmarks can be then customized via custom branches on https://github.com/prometheus/test-infra repo that modify [any benchmark-specific manifests](https://github.com/prometheus/test-infra/tree/master/prombench/manifests/prombench). + +## Why + +[Prombench](https://github.com/prometheus/test-infra/tree/master/prombench) served us well with [around ~110 Prometheus macro-benchmarks performed](https://github.com/prometheus/prometheus/actions/workflows/prombench.yml?query=is%3Asuccess+event%3Arepository_dispatch) since the last year. + +We invest and maintain a healthy **standard benchmarking scenario** that starts two nodepools with one Prometheus each, one built from the PR, second from the referenced release. For each Prometheus we deploy a separate `fake-webserver` to scrape and load-generator that performs stable load of the PromQL queries. Recently thanks to [the LFX mentee](https://github.com/cncf/mentoring/tree/main/programs/lfx-mentorship/2024/03-Sep-Nov#enhance-prometheus-benchmark-suitem) [Kushal](https://github.com/kushalShukla-web) a standard scenario also remote writes to a sink to benchmark the writing overhead. + +Standard benchmark is a great way to have a uniform understanding of the efficiency for the generic Prometheus setup, especially useful before Prometheus releases. However, with more contributors and Prometheus features, there is an increasing amount of reasons to perform special, perhaps one-off benchmarking scenarios that target specific feature flags, Prometheus modes, custom setups or to elevate certain issues. + +In other words, ideally Prombench would support a way to perform a custom benchmark when triggered it via Pull Request in Prometheus repo e.g. `/prombench v3.0.0 -bench.version ` + +First, we add the ability to deploy from a https://github.com/prometheus/test-infra branch or commit SHA. + +We implement that by: + +* Changing comment-monitor to support new flag `-bench.version` parsing (default is `main) (@bwplotka: I have a local WIP branch with a refactor of it and changes). +* Pass new flag via [Prometheus GH action env flags](https://github.com/prometheus/prometheus/blob/main/.github/workflows/prombench.yml) +* Extend [`make deploy`](https://github.com/prometheus/test-infra/blob/master/prombench/Makefile#L6) script (or infra CLI): + * The ability to git pull a concrete https://github.com/prometheus/test-infra branch or commit. + * This branch will be used when [applying (non infra) resources](https://github.com/prometheus/test-infra/tree/master/prombench/manifests/prombench). + +Pros: + +* Does not change std scenario. +* No need to rebuild any docker image. +* Allows Prometheus team to customize almost all the aspects of the benchmark like: + * metric load + * query load + * number of replicas + * prometheus config, flags and recording rules + * remote write sink options (e.g introduce failures) +* Prombench core framework is intact - no cluster resources can be modified this way. +* Prometheus team can maintain a few official branches for common scenarios e.g. `agent` +* Anybody can inspect what will be deployed by checking https://github.com/prometheus/test-infra repo. +* Only team can change manifests (contributors will need to create PRs). + +Cons: + +* Some knowledge how benchmark is deployed (e.g. what pods are running and from where) is required. + * **Mitigation**: Document what's possible. +* It's easy to change things on custom branch that will have no effect (e.g. cluster resources), there's little guardrail here. +* There is a slight risk we accidentally merge a manifests change that mine bitcoins (review mistake). However, the same can occur on `main` branch. + +### Split Prometheus manifests into two deployments + +We propose to add a way for benchmark scenario editors to specify custom A and B configuration (not only version!). + +We implement that by changing manifests and infra CLI, so infra deploy will either: +* Apply just [one test manifests](https://github.com/prometheus/test-infra/blob/master/prombench/manifests/prombench/benchmark/3b_prometheus-test_deployment.yaml) twice for different versions. +* Apply two different manifests for A and B, with the same version. + +Pros: +* Allows feature comparison explained in [Why section](#why). + +Cons: +* One can "malform" benchmark to run different versions which is likely not useful, but it will be visible in a commit. + +## Alternatives + +### Alternative: Carefully design and template a set of customization to standard scenarion in `main` + +One could add a big configuration file/flag surface to `/prombench` allowing to carefully change various aspects. + +Pros: +* More control and guardrail for what scenarios is possible. + +Cons: +* Too much work to maintain, test and validate a custom templating and configuration surface. Changing manifests directly is good enough. +* Too much work to customize scenarios. + +### Alternative: Change docker image (e.g. tag) in GH action + +Initially I thought we could simply change a [docker image GH action is using](https://github.com/prometheus/prometheus/blob/main/.github/workflows/prombench.yml#L34) to deploy the prombench benchmark manifests. + +Cons: +* Not really possible -- `uses` field [cannot be parameterized](https://stackoverflow.com/a/75377583) +* It would require not only a git branch on https://github.com/prometheus/test-infra but also rebuilding image. Doable with some CI, but unnecessarily heavy. + +## Action Plan + +Action plan is explained in [How section](#how), plus docs changes in test-infra repo. From dcd2f588ac942a17dac41e0f02a67d259faa6984 Mon Sep 17 00:00:00 2001 From: bwplotka Date: Tue, 17 Dec 2024 14:58:27 +0000 Subject: [PATCH 2/4] Addressed comments, updated. Signed-off-by: bwplotka --- .../2024-12-04_prombench-custom-benchmarks.md | 100 ++++++++++++------ 1 file changed, 68 insertions(+), 32 deletions(-) diff --git a/proposals/2024-12-04_prombench-custom-benchmarks.md b/proposals/2024-12-04_prombench-custom-benchmarks.md index 8cd0352..d408d6b 100644 --- a/proposals/2024-12-04_prombench-custom-benchmarks.md +++ b/proposals/2024-12-04_prombench-custom-benchmarks.md @@ -2,14 +2,20 @@ * **Owners:** * [@bwplotka](https://github.com/bwplotka) - -* **Implementation Status:** `Not implemented` + +* **Contributors:** + * [@krajorama](https://github.com/krajorama) + * [@ArthurSens](https://github.com/ArthurSens) + +* **Implementation Status:** `Partially implemented` * **Related Issues and PRs:** * https://github.com/prometheus/test-infra/issues/659 + * https://github.com/prometheus/prometheus/pull/15682 + * https://github.com/prometheus/test-infra/pull/812 -> TL;DR: We propose a way for maintainers to deploy custom Prombench benchmarks via `/prombench <...> -bench.version ` PR command. Benchmarks can be then customized via custom branches on https://github.com/prometheus/test-infra repo that modify [any benchmark-specific manifests](https://github.com/prometheus/test-infra/tree/master/prombench/manifests/prombench). +> TL;DR: We propose a way for maintainers to deploy custom Prombench benchmarks. We propose adding two flags to `/prombench`: `--bench.version=` and `--bench.directory=` to customize benchmark scenarios. ## Why @@ -21,7 +27,7 @@ Standard benchmark is a great way to have a uniform understanding of the efficie In other words, ideally Prombench would support a way to perform a custom benchmark when triggered it via Pull Request in Prometheus repo e.g. `/prombench v3.0.0 NOTE: The following proposal is implemented [in this PR](https://github.com/prometheus/test-infra/pull/812). + +First, we propose to maintain the current standard benchmark flow: on the https://github.com/prometheus/test-infra `master` branch, in `/prombench/manifests/prombench` directory, we maintain the standard, single benchmarking scenario used as an acceptance validation for Prometheus. It's important to ensure it represents common Prometheus configuration. The only user related parameter for the standard scenario is `RELEASE` version. + +On top of that, we propose to add two flags that will allow customizing and testing benchmark scenarios: + +### Version flag + +Currently, the [Prometheus prombench GH job](https://github.com/prometheus/prometheus/blob/main/.github/workflows/prombench.yml) uses configuration from the `/prombench/manifests/prombench` directory stored in the `docker://prominfra/prombench:master` image. -### Ability to specify verion of the benchmark `/prombench <...> -bench.version ` +We propose to add the `--bench.version=` flag to `/prombench` GH PR command, with the default value `master`. This flag will cause [the prombench GH action](https://github.com/prometheus/prometheus/blob/main/.github/workflows/prombench.yml) to pull specific commit SHA (if `--bench.version` value is prefixed with `@`) or branch from the https://github.com/prometheus/test-infra before deploying (or cleaning) test benchmark run. For the default `master` value, it will use the existing flow with the `docker://prominfra/prombench:master` image files. + +Here are an example steps to customize and run a customized benchmark with `--bench.version` flag. + +1. Create a new branch on https://github.com/prometheus/test-infra e.g. `benchmark/scenario1`. +2. Modify this directory to your liking e.g. changing query load, metric load of advanced Prometheus configuration. It's also possible to make Prometheus deployments and versions exactly the same, but vary in a single configuration flag, for feature benchmarking. + + > WARN: When customizing this directory, don't change `1a_namespace.yaml` or `1c_cluster-role-binding.yaml` filenames as they are used for cleanup routine. Or, if you change it, know what you're doing in relation to [`make clean` job](../../Makefile). -First, we add the ability to deploy from a https://github.com/prometheus/test-infra branch or commit SHA. +3. Push changes to the new branch. +4. From the Prometheus PR comment, call prombench as `/prombench --bench.version=benchmark/scenario1` or `/prombench --bench.version=@` to use configuration files from this custom branch. -We implement that by: +Other details: -* Changing comment-monitor to support new flag `-bench.version` parsing (default is `main) (@bwplotka: I have a local WIP branch with a refactor of it and changes). -* Pass new flag via [Prometheus GH action env flags](https://github.com/prometheus/prometheus/blob/main/.github/workflows/prombench.yml) -* Extend [`make deploy`](https://github.com/prometheus/test-infra/blob/master/prombench/Makefile#L6) script (or infra CLI): - * The ability to git pull a concrete https://github.com/prometheus/test-infra branch or commit. - * This branch will be used when [applying (non infra) resources](https://github.com/prometheus/test-infra/tree/master/prombench/manifests/prombench). +* Other custom branch modifications other than to this directory do not affect prombench (e.g. to infra CLI or makefiles). +* `--bench.version` is designed for a short-term or even one-off benchmark scenario configurations. It's not designed for long-term, well maintained scenarios. For the latter reason we can later e.g. maintain multiple `manifests/prombench` directories and introduce a new `--bench.directory` flag. +* Non-maintainers can follow similar process, but they will need to ask maintainer for a new branch and PR review. We can consider extending `--bench.version` to support remote repositories if this becomes a problem. +* Custom benchmarking logic is implemented in the [`maybe_pull_custom_version` make job](https://github.com/prometheus/test-infra/blob/cm-branch/prombench/Makefile#L48) and invoked by the prombench GH job on Prometheus repo on `deploy` and `clean`. Pros: * Does not change std scenario. +* Prombench core framework is intact - no cluster resources can be modified this way. * No need to rebuild any docker image. * Allows Prometheus team to customize almost all the aspects of the benchmark like: * metric load @@ -100,8 +123,6 @@ Pros: * number of replicas * prometheus config, flags and recording rules * remote write sink options (e.g introduce failures) -* Prombench core framework is intact - no cluster resources can be modified this way. -* Prometheus team can maintain a few official branches for common scenarios e.g. `agent` * Anybody can inspect what will be deployed by checking https://github.com/prometheus/test-infra repo. * Only team can change manifests (contributors will need to create PRs). @@ -109,35 +130,45 @@ Cons: * Some knowledge how benchmark is deployed (e.g. what pods are running and from where) is required. * **Mitigation**: Document what's possible. +* When reviewing a PR with prombench results, you have to also review the custom benchmark version to trust the efficiency results. + * **Mitigation**: Give a ready link to the directory used in the benchmark. * It's easy to change things on custom branch that will have no effect (e.g. cluster resources), there's little guardrail here. -* There is a slight risk we accidentally merge a manifests change that mine bitcoins (review mistake). However, the same can occur on `main` branch. +* There is a slight risk we accidentally merge a manifests change that mine bitcoins (review mistake). However, the same can occur on the `master` branch. -### Split Prometheus manifests into two deployments +### Directory flag -We propose to add a way for benchmark scenario editors to specify custom A and B configuration (not only version!). +Currently, the [Prometheus prombench GH job](https://github.com/prometheus/prometheus/blob/main/.github/workflows/prombench.yml) uses configuration from the `/prombench/manifests/prombench` directory. To allow different "standard" modes (e.g. agent mode) on `master` https://github.com/prometheus/test-infra, we propose to add `--bench.directory=`, which defaults to `manifests/prombench`. -We implement that by changing manifests and infra CLI, so infra deploy will either: -* Apply just [one test manifests](https://github.com/prometheus/test-infra/blob/master/prombench/manifests/prombench/benchmark/3b_prometheus-test_deployment.yaml) twice for different versions. -* Apply two different manifests for A and B, with the same version. +Different directories should have all manifests, which may result in some duplication compared with `manifests/prombench`. Symlink could be used to make it obvious what file changes, what didn't. Pros: -* Allows feature comparison explained in [Why section](#why). -Cons: -* One can "malform" benchmark to run different versions which is likely not useful, but it will be visible in a commit. +* Prometheus team can maintain a few official directories for common scenarios e.g. `agent`. -## Alternatives +## Alternatives and Extensions -### Alternative: Carefully design and template a set of customization to standard scenarion in `main` +### Extension: Carefully design and template a set of customization to standard scenarion in `main` One could add a big configuration file/flag surface to `/prombench` allowing to carefully change various aspects. +e.g. +``` +/prombench main +/extra-args --enable-features=native-histograms,wal-records --web.enable-otlp-receiver +/with avalanche +/with agent-mode +``` + +This is similar to how [Prow](https://docs.prow.k8s.io/docs/overview/) parses comments to add extra functionality to plugins. + Pros: * More control and guardrail for what scenarios is possible. Cons: -* Too much work to maintain, test and validate a custom templating and configuration surface. Changing manifests directly is good enough. -* Too much work to customize scenarios. +* Huge amount of work to customize, maintain, test and validate a custom templating and configuration surface. Changing manifests directly is good enough. +* Users might waste time trying to discover a totally new configuration API. + +This is out of scope of this proposal, but it's possible to add in the future. ### Alternative: Change docker image (e.g. tag) in GH action @@ -149,4 +180,9 @@ Cons: ## Action Plan -Action plan is explained in [How section](#how), plus docs changes in test-infra repo. +* [X] Refactor comment-monitor CLI to be able to extend with flag parsing easily. +* [X] Extend comment-monitor CLI to support flags (https://github.com/prometheus/test-infra/pull/812) +* [X] Extend `make deploy` and `make clean` to support git pulling on demand and dynamic directory choice. (https://github.com/prometheus/test-infra/pull/812) +* [X] Add docs (https://github.com/prometheus/test-infra/pull/812) +* [X] Extend Prometheus prombench GH job (https://github.com/prometheus/prometheus/pull/15682) +* [ ] Announce on -dev list. From 194033c09739a0b0fb86a3f6fd2db6047ba133d2 Mon Sep 17 00:00:00 2001 From: Bartlomiej Plotka Date: Wed, 18 Dec 2024 14:13:39 +0000 Subject: [PATCH 3/4] Update proposals/2024-12-04_prombench-custom-benchmarks.md Co-authored-by: George Krajcsovits Signed-off-by: Bartlomiej Plotka --- proposals/2024-12-04_prombench-custom-benchmarks.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/proposals/2024-12-04_prombench-custom-benchmarks.md b/proposals/2024-12-04_prombench-custom-benchmarks.md index d408d6b..58a3bae 100644 --- a/proposals/2024-12-04_prombench-custom-benchmarks.md +++ b/proposals/2024-12-04_prombench-custom-benchmarks.md @@ -98,7 +98,7 @@ We propose to add the `--bench.version=` flag to `/prombench` GH Here are an example steps to customize and run a customized benchmark with `--bench.version` flag. 1. Create a new branch on https://github.com/prometheus/test-infra e.g. `benchmark/scenario1`. -2. Modify this directory to your liking e.g. changing query load, metric load of advanced Prometheus configuration. It's also possible to make Prometheus deployments and versions exactly the same, but vary in a single configuration flag, for feature benchmarking. +2. Modify the contents of `/prombench/manifests/prombench` directory to your liking e.g. changing query load, metric load of advanced Prometheus configuration. It's also possible to make Prometheus deployments and versions exactly the same, but vary in a single configuration flag, for feature benchmarking. > WARN: When customizing this directory, don't change `1a_namespace.yaml` or `1c_cluster-role-binding.yaml` filenames as they are used for cleanup routine. Or, if you change it, know what you're doing in relation to [`make clean` job](../../Makefile). From cd53dd3345cc26904e6786c6c941402629b2eefc Mon Sep 17 00:00:00 2001 From: bwplotka Date: Fri, 20 Dec 2024 11:40:17 +0000 Subject: [PATCH 4/4] Moved to implemented state. Signed-off-by: bwplotka --- proposals/2024-12-04_prombench-custom-benchmarks.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/proposals/2024-12-04_prombench-custom-benchmarks.md b/proposals/2024-12-04_prombench-custom-benchmarks.md index 58a3bae..24525cd 100644 --- a/proposals/2024-12-04_prombench-custom-benchmarks.md +++ b/proposals/2024-12-04_prombench-custom-benchmarks.md @@ -7,7 +7,7 @@ * [@krajorama](https://github.com/krajorama) * [@ArthurSens](https://github.com/ArthurSens) -* **Implementation Status:** `Partially implemented` +* **Implementation Status:** `Implemented` * **Related Issues and PRs:** * https://github.com/prometheus/test-infra/issues/659 @@ -185,4 +185,4 @@ Cons: * [X] Extend `make deploy` and `make clean` to support git pulling on demand and dynamic directory choice. (https://github.com/prometheus/test-infra/pull/812) * [X] Add docs (https://github.com/prometheus/test-infra/pull/812) * [X] Extend Prometheus prombench GH job (https://github.com/prometheus/prometheus/pull/15682) -* [ ] Announce on -dev list. +* [X] Announce on -dev list.