Skip to content

Commit

Permalink
Addressed comments, updated.
Browse files Browse the repository at this point in the history
Signed-off-by: bwplotka <[email protected]>
  • Loading branch information
bwplotka committed Dec 17, 2024
1 parent 33a18ca commit 7b4be03
Showing 1 changed file with 48 additions and 31 deletions.
79 changes: 48 additions & 31 deletions proposals/2024-12-04_prombench-custom-benchmarks.md
Original file line number Diff line number Diff line change
Expand Up @@ -2,14 +2,20 @@

* **Owners:**
* [@bwplotka](https://github.com/bwplotka)

* **Implementation Status:** `Not implemented`

* **Contributors:**
* [@krajorama](https://github.com/krajorama)
* [@ArthurSens](https://github.com/ArthurSens)

* **Implementation Status:** `Partially implemented`

* **Related Issues and PRs:**
* https://github.com/prometheus/test-infra/issues/659
* https://github.com/prometheus/prometheus/pull/15682
* https://github.com/prometheus/test-infra/pull/812


> TL;DR: We propose a way for maintainers to deploy custom Prombench benchmarks via `/prombench <...> -bench.version <branch or commit SHA>` PR command. Benchmarks can be then customized via custom branches on https://github.com/prometheus/test-infra repo that modify [any benchmark-specific manifests](https://github.com/prometheus/test-infra/tree/master/prombench/manifests/prombench).
> TL;DR: We propose a way for maintainers to deploy custom Prombench benchmarks. We propose adding two flags to `/prombench`: `--bench.version=<branch or @commit SHA>` and `--bench.directory=<name of the benchmark scenario directory>` to customize benchmark scenarios.
## Why

Expand All @@ -21,7 +27,7 @@ Standard benchmark is a great way to have a uniform understanding of the efficie

In other words, ideally Prombench would support a way to perform a custom benchmark when triggered it via Pull Request in Prometheus repo e.g. `/prombench v3.0.0 <please run agent mode with only native histograms load, OTLP receiving and remote write 2.0 with experimental arrow format`.

Non-exhausting list of custom benchmarks that would be epic to do, discussed among community (`A` and `B` are two Prometheus process benchmark always run):
Non-exhausting list of custom benchmarks that would be epic to do, discussed among community:

* Different configurations, e.g.:
* Custom flag changes (especially [feature flags](https://prometheus.io/docs/prometheus/latest/feature_flags/))
Expand Down Expand Up @@ -49,7 +55,7 @@ Non-exhausting list of custom benchmarks that would be epic to do, discussed amo

### Pitfalls of the current solution

Currently, there is no easy way to run macro-benchmark for new, non-standard configurations and setups (technically there is some limited and hacky way - through manual GKE cluster operations on prombench after it starts).
Currently, there is no easy way to run macro-benchmark for both non-standard and standard configurations and setups.

This causes:

Expand All @@ -59,36 +65,52 @@ This causes:
* Wasted time for people recreating custom macro benchmarks on their own, which are harder to repro or trust.
* Overloading standard scenario with too many or too custom features.

NOTE: Technically there are some limited and hacky ways for custom scenarios e.g. through manual GKE cluster operations on prombench after it starts or by hardcoding new defaults or logic on the benchmarked PR. While possible, they are neither easy or clean ways.

## Goals

* Keep the standard scenario healthy and useful for release gating.
* Ability to run custom benchmarks from the PR.
* Custom benchmarks should be easy and fast to prepare.
* Custom benchmarks should be deterministically reproducible and reviewable (e.g. pinned to a git commit).
* Minimize security risk of bad actor running arbitrary workloads on our prombench cluster (wasting credits).
* Keep Prombench project simple to maintain.

## Non-Goals

We don't discuss here the following changes:

* Ability to run benchmarks from the GitHub Issues.
* Ability to run multiple benchmarks per one PR.

## How

We propose changing Prombench in two phases:
> NOTE: The following proposal is implemented [in this PR](https://github.com/prometheus/test-infra/pull/812).
### Ability to specify verion of the benchmark `/prombench <...> -bench.version <branch or commit SHA>`
First, we propose to maintain the current standard benchmark flow: on the https://github.com/prometheus/test-infra `master` branch, in `/prombench/manifests/prombench` directory, we maintain the standard, single benchmarking scenario used as an acceptance validation for Prometheus. It's important to ensure it represents common Prometheus configuration. The only user related parameter for the standard scenario is `RELEASE` version.

First, we add the ability to deploy from a https://github.com/prometheus/test-infra branch or commit SHA.
On top of that, we propose to add two flags that will allow customizing and testing benchmark scenarios:

### Version flag

Currently, the [Prometheus prombench GH job](https://github.com/prometheus/prometheus/blob/main/.github/workflows/prombench.yml) uses configuration from the `/prombench/manifests/prombench` directory stored in the `docker://prominfra/prombench:master` image.

We propose to add the `--bench.version=<branch|@commit>` flag to `/prombench` GH PR command, with the default value `master`. This flag will cause [the prombench GH action](https://github.com/prometheus/prometheus/blob/main/.github/workflows/prombench.yml) to pull specific commit SHA (if `--bench.version` value is prefixed with `@`) or branch from the https://github.com/prometheus/test-infra before deploying (or cleaning) test benchmark run. For the default `master` value, it will use the existing flow with the `docker://prominfra/prombench:master` image files.

Here are an example steps to customize and run a customized benchmark with `--bench.version` flag.

We implement that by:
1. Create a new branch on https://github.com/prometheus/test-infra e.g. `benchmark/scenario1`.
2. Modify this directory to your liking e.g. changing query load, metric load of advanced Prometheus configuration. It's also possible to make Prometheus deployments and versions exactly the same, but vary in a single configuration flag, for feature benchmarking.

* Changing comment-monitor to support new flag `-bench.version` parsing (default is `main) (@bwplotka: I have a local WIP branch with a refactor of it and changes).
* Pass new flag via [Prometheus GH action env flags](https://github.com/prometheus/prometheus/blob/main/.github/workflows/prombench.yml)
* Extend [`make deploy`](https://github.com/prometheus/test-infra/blob/master/prombench/Makefile#L6) script (or infra CLI):
* The ability to git pull a concrete https://github.com/prometheus/test-infra branch or commit.
* This branch will be used when [applying (non infra) resources](https://github.com/prometheus/test-infra/tree/master/prombench/manifests/prombench).
> WARN: When customizing this directory, don't change `1a_namespace.yaml` or `1c_cluster-role-binding.yaml` filenames as they are used for cleanup routine. Or, if you change it, know what you're doing in relation to [`make clean` job](../../Makefile).
3. Push changes to the new branch.
4. From the Prometheus PR comment, call prombench as `/prombench <release> --bench.version=benchmark/scenario1` or `/prombench <release> --bench.version=@<relevant commit SHA from the benchmark/scenario1>` to use configuration files from this custom branch.

Other details:

* Other custom branch modifications other than to this directory do not affect prombench (e.g. to infra CLI or makefiles).
* `--bench.version` is designed for a short-term or even one-off benchmark scenario configurations. It's not designed for long-term, well maintained scenarios. For the latter reason we can later e.g. maintain multiple `manifests/prombench` directories and introduce a new `--bench.directory` flag.
* Non-maintainers can follow similar process, but they will need to ask maintainer for a new branch and PR review. We can consider extending `--bench.version` to support remote repositories if this becomes a problem.
* Custom benchmarking logic is implemented in the [`maybe_pull_custom_version` make job](https://github.com/prometheus/test-infra/blob/cm-branch/prombench/Makefile#L48) and invoked by the prombench GH job on Prometheus repo on `deploy` and `clean`.

Pros:

Expand All @@ -112,23 +134,13 @@ Cons:
* It's easy to change things on custom branch that will have no effect (e.g. cluster resources), there's little guardrail here.
* There is a slight risk we accidentally merge a manifests change that mine bitcoins (review mistake). However, the same can occur on `main` branch.

### Split Prometheus manifests into two deployments

We propose to add a way for benchmark scenario editors to specify custom A and B configuration (not only version!).
### Directory flag

We implement that by changing manifests and infra CLI, so infra deploy will either:
* Apply just [one test manifests](https://github.com/prometheus/test-infra/blob/master/prombench/manifests/prombench/benchmark/3b_prometheus-test_deployment.yaml) twice for different versions.
* Apply two different manifests for A and B, with the same version.

Pros:
* Allows feature comparison explained in [Why section](#why).

Cons:
* One can "malform" benchmark to run different versions which is likely not useful, but it will be visible in a commit.
Currently, the [Prometheus prombench GH job](https://github.com/prometheus/prometheus/blob/main/.github/workflows/prombench.yml) uses configuration from the `/prombench/manifests/prombench` directory. To allow different "standard" modes (e.g. agent mode) on `master` https://github.com/prometheus/test-infra, we propose to add `--bench.directory=<name of the benchmark scenario directory>`, which defaults to `manifests/prombench`.

## Alternatives
## Alternatives and Extensions

### Alternative: Carefully design and template a set of customization to standard scenarion in `main`
### Extension: Carefully design and template a set of customization to standard scenarion in `main`

One could add a big configuration file/flag surface to `/prombench` allowing to carefully change various aspects.

Expand All @@ -149,4 +161,9 @@ Cons:

## Action Plan

Action plan is explained in [How section](#how), plus docs changes in test-infra repo.
* [X] Refactor comment-monitor CLI to be able to extend with flag parsing easily.
* [X] Extend comment-monitor CLI to support flags (https://github.com/prometheus/test-infra/pull/812)
* [X] Extend `make deploy` and `make clean` to support git pulling on demand and dynamic directory choice. (https://github.com/prometheus/test-infra/pull/812)
* [X] Add docs (https://github.com/prometheus/test-infra/pull/812)
* [X] Extend Prometheus prombench GH job (https://github.com/prometheus/prometheus/pull/15682)
* [ ] Announce on -dev list.

0 comments on commit 7b4be03

Please sign in to comment.