perf: avoid small performance regressions #106661
cc @cockroachdb/test-eng
I've been thinking through how something like this could work recently as well. One suggestion is that we may get better mileage for fine-grained regression detection from larger microbenchmarks than from roachtests. A good microbenchmark to look at is

As Tobi mentioned, we would need a way to signal to CI that a certain regression is expected and permitted for a given PR.
@herkolategan FYI. If the above checks out, maybe we could surface results of
Epic: none Release note: None Previously, cleaning benchmark output was only exposed internally in roachprod-microbench, but we now need it as part of a CI job; hence a new utility command "clean" has been added to roachprod-microbench. See: cockroachdb#106661
126810: roachprod-microbench: add clean command r=herkolategan a=sambhav-jain-16 Epic: none Release note: None This change adds a new utility command to `roachprod-microbench` as part of fixing #106661. Co-authored-by: Sambhav Jain <[email protected]>
Epic: none Release note: None As part of fixing cockroachdb#106661, we need to run microbenchmarks in a CI job and compare them with those of the latest master branch commit. For this purpose we want to reuse the `roachprod-microbench` compare command. However, in its current implementation, the compare command doesn't do threshold comparison, and this PR adds that capability behind a flag. It is a follow-up to cockroachdb#126810, which exposed `roachprod-microbench`'s capability of cleaning the benchmark output.
Fixes: cockroachdb#106661 Release note: None Epic: https://cockroachlabs.atlassian.net/browse/CRDB-29662 This change adds a new CI step which runs the `BenchmarkKV` microbenchmark for the current commit and compares the result with the base commit's results, which are stored in a cloud bucket. This is needed so that minor perf regressions are caught in the PR itself. The current threshold is set to 5%. There are roughly 3 steps:
1. Run benchmarks.
2. If this is a PR branch, download the base commit's bench results from cloud storage and compare the results. If there's a regression, fail here.
3. If it is a release/master branch, push the result to cloud storage keyed by the current commit SHA.
In case a regression is expected, developers can add [PERF_REGRESSION_EXPECTED] to the PR title, which skips the comparison step (PR branches only).
127228: workflow: add cockroach-microbench-ci step in github actions r=rickystewart,herkolategan a=sambhav-jain-16 Fixes: #106661 Release note: None Epic: https://cockroachlabs.atlassian.net/browse/CRDB-29662 This change adds a new CI step which runs the `BenchmarkKV` microbenchmark for the current commit and compares the result with the base commit's results, which are stored in a cloud bucket. This is needed so that minor perf regressions are caught in the PR itself. The current threshold is set to 5%. There are roughly 3 steps:
1. Run benchmarks.
2. If this is a PR branch, download the bench results for the latest commit of the target branch from cloud storage and compare the results. If there's a regression, fail here.
3. If it is a push step (PR merge to master/release), upload the result to cloud storage keyed by the current commit SHA.
Currently, this change won't block the PR. However, it will still run the microbenchmarks and upload the results during the push step. Co-authored-by: Sambhav Jain <[email protected]>
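The threshold comparison at the heart of the steps above can be sketched in Go. This is a minimal illustration of a 5% regression gate, not the actual `roachprod-microbench` implementation; the function name and signature are made up for the example:

```go
package main

import "fmt"

// compareBenchmarks is a hypothetical sketch of the CI threshold check:
// given ns/op measurements for the base commit and the PR commit, it
// reports the percentage delta and whether it exceeds the threshold.
func compareBenchmarks(oldNsOp, newNsOp, thresholdPct float64) (deltaPct float64, regressed bool) {
	deltaPct = (newNsOp - oldNsOp) / oldNsOp * 100
	return deltaPct, deltaPct > thresholdPct
}

func main() {
	// Example: base commit ran at 1000 ns/op, PR commit at 1080 ns/op,
	// which is an 8% slowdown and trips a 5% threshold.
	delta, bad := compareBenchmarks(1000, 1080, 5.0)
	fmt.Printf("delta=%.1f%% regressed=%v\n", delta, bad)
}
```

In the real job the old measurements would come from the cloud bucket keyed by commit SHA, and a failure here would fail the CI step.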
…d` is set Epic: https://cockroachlabs.atlassian.net/browse/CRDB-40772 Fixes: cockroachdb#106661 Release note: None
This change aims to make `cockroach-microbench-ci` a required step. It adds a label, `performance-regression-expected`, that can be used to skip the comparison in case a regression is expected. Epic: https://cockroachlabs.atlassian.net/browse/CRDB-40772 Fixes: cockroachdb#106661 Release note: None
Reopened since we haven't solved this issue yet. In light of the spirited discussion on #135899, I'd like to propose that we slightly change the scope of this issue: away from "avoiding" small perf regressions by blocking merge jobs, and toward providing high-quality signal about perf regressions during the PR workflow for a small (initially tiny) set of core benchmarks. This should check the following boxes:
I think we can find 1-3 good initial benchmarks, sourced from

So assume we have good coverage for an initial set of important hot code paths with benchmarks that are fairly stable. We can run them benchdiff-style (say, ten reps each of the old and new binary, with a prescribed benchtime/count, run interleaved [old new old new ...]) and do a benchstat-style regression check. Allocation bytes/counts tend to be more stable than CPU time (in my experience), so we can set different thresholds for each.

If we find a regression (or we could do this unconditionally, too - it doesn't take much time), we would also invoke the old and new benchmark an additional time and collect memory and CPU profiles. The report would then contain these alongside the invocations to re-run the comparison locally, as well as the right pprof invocations to view the before, the after, and the delta profiles (after minus before). For starters, a

I'm a little unsure about notifications. I think a reasonable way to start is to introduce this check without any notification on the PR; you'll have to click "Details" on the microbench-CI job to see what transpired. Once we've stabilized the check and are confident that it provides good signal, we can move to posting a (helpful) comment on the PR thread should anything have become measurably faster or slower.

Everything past this line is not something I'd consider in the initial scope of this issue. In principle I would prefer (at some point) a scheme where we don't make multiple such comments but update the existing one. I could even see us inserting this information into the PR message comment itself (but that's likely a great thing to disagree with at least someone on). Adding a label to the PR could be another subtle way to "notify" that there's something of interest. There are many next steps we can take after this.
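The interleaved run ordering and per-metric thresholds above can be sketched as follows. The threshold percentages here are illustrative assumptions, not settled values, and the function is a made-up name for the example:

```go
package main

import "fmt"

// Illustrative per-metric thresholds: allocation metrics tend to be more
// stable than CPU time, so they can be tighter. These numbers are
// assumptions for the sketch, not agreed-upon values.
var thresholds = map[string]float64{
	"sec/op":    5.0, // CPU time is noisier; allow more slack
	"B/op":      1.0,
	"allocs/op": 1.0,
}

// interleavedSchedule sketches the benchdiff-style run ordering: reps
// alternating old/new binaries so that time-correlated noise (thermal
// drift, noisy neighbors) affects both sides roughly equally.
func interleavedSchedule(reps int) []string {
	var order []string
	for i := 0; i < reps; i++ {
		order = append(order, "old", "new")
	}
	return order
}

func main() {
	fmt.Println(interleavedSchedule(3))
}
```

Each entry in the schedule would correspond to one `go test -bench` invocation of the respective binary, with the results fed to a benchstat-style comparison per metric.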
We can discuss making the checks required (perhaps validating this decision with additional benchmarking if the initial check detects a regression), and perhaps re-framing what this means: rather than enforcing that "regressions aren't allowed", we would require the author to acknowledge the regression and justify why it's okay before proceeding to merge, so that we have a paper trail of how the author and reviewers thought about the regression. We can also discuss how regressions should be treated in bors (my hope is that we just never need that).
I've mainly been using

I've also been using it simply because of my familiarity with it. I think that
Just a thought,
Just noting a false positive here: https://github.com/cockroachdb/cockroach/actions/runs/12237982508/job/34134987461?pr=137014
Following the discussions in cockroachdb#135899, this job is being disabled until we figure out how to remove the variance in microbenchmarks (cockroachdb#106661). Epic: none Release note: None
Following the discussions in cockroachdb#135899, this job is being removed until we figure out how to remove the variance in microbenchmarks (cockroachdb#106661). Epic: none Release note: None
137075: workflow: remove cockroach-microbench-ci r=srosenberg a=sambhav-jain-16 Following the discussions in #135899, this job is being removed until we figure out how to remove the variance in microbenchmarks (#106661). Epic: none Release note: None Co-authored-by: Sambhav Jain <[email protected]>
Here's an example of the kind of thing our tooling could point out and discourage the merge of: #137542
Indeed, that was a fairly large regression. Confirming that it was reflected in the weekly run of microbenchmarks. The regressed commit just happens to be the latest on master at the time of the run; we'd obviously need to bisect to find the first bad commit.
Is your feature request related to a problem? Please describe.
We currently rely on changepoint detection and manual inspection on https://roachperf.crdb.dev/ to catch performance regressions. We are usually only able to claw back large regressions, which are bisected to a commit in a labor-intensive process (which we've been automating a bit, but still).
But looking at many graphs (like kv95) we see that we are continually backsliding. It's easy to see why: allocations are expensive, Go makes it hard to avoid accidental allocations, there is no forcing function in CI, et cetera.
Describe the solution you'd like
I'd like some way to avoid this continuous backsliding. My expectation is that the best way to achieve this is by preventing at least some classes of regressions in CI: for example, by comparing op/s on a small set of representative benchmarks against the parent commit, or by controlling for allocations (though regressions can also be introduced by just adding code!).
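One way to control for allocations, as suggested above, is to guard a hot path's allocation count with Go's `testing.AllocsPerRun` and fail if it exceeds a budget. This is a generic sketch, not CRDB's actual approach; `buildKey` and the budget are hypothetical stand-ins:

```go
package main

import (
	"fmt"
	"strings"
	"testing"
)

// buildKey is a hypothetical stand-in for a hot-path function whose
// allocation behavior we want to guard in CI.
func buildKey(parts []string) string {
	return strings.Join(parts, "/")
}

func main() {
	// testing.AllocsPerRun reports the average number of heap
	// allocations per invocation of the callback.
	parts := []string{"table", "index", "key"}
	allocs := testing.AllocsPerRun(100, func() {
		_ = buildKey(parts)
	})
	const budget = 2 // illustrative allocation budget, not a real CRDB value
	fmt.Printf("allocs/op=%.0f withinBudget=%v\n", allocs, allocs <= budget)
}
```

A check like this is deterministic (allocation counts don't suffer from timing noise), which is why allocation-based gating tends to be a more reliable CI signal than raw op/s.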
Describe alternatives you've considered
Additional context
Jira issue: CRDB-29662