release: 22.2 microbenchmark regression investigation and sign-off #87685
Don't see anything for
Could be; I will check back on Thursday and run it if I missed it.
Did
I did a quick
@mgartner It seems the
@mgartner
@herkolategan For the KV stuff, I don't see them in the spreadsheet. It looks like the command you used does not include them? I see this command:
Which is the one that contains pkg/gossip, pkg/roachpb, pkg/rpc, and pkg/server?
@knz Right, that doesn't make sense; I should have verified the template I used more thoroughly.
Signing off on
Signing off on
Signing off on
#89410 and #89545 are marked as GA blockers until further triage.
What do people think about running the microbenchmarks with a longer bench time? For example, I have been looking at a possible regression in one of the benchmarks from the sheet in this issue (from
Here a single run takes on the order of 2-3 seconds, so there is quite a bit of variance, and there appears to be a significant regression. Once I gave 10s of bench time to each run (I was using
We see that the confidence in these numbers has increased and there are no noticeable regressions. I used the same SHAs for both runs, with the only difference that the latter runs used the longer bench time. I'm proposing the following adjustment for the next time we do this:
One might argue that it's a poor benchmark choice if a single iteration takes more than a second to run, but we have what we have. Thoughts? cc @nvanbenschoten @srosenberg
cc @cockroachdb/test-eng
Right. This particular benchmark is hardly a micro one. It definitely makes sense to run more iterations, but maybe we should increase the number of iterations instead of the duration? Or both? This way we could at least ensure there is a lower bound on the number of iterations. Note that even increasing the number of iterations yields a couple of different options, e.g.:
- Run for 1 second (default)
- Run for 5 seconds (i.e., run as many iterations as possible)
- Run 5 times (i.e., each run is independent and uses the default duration of 1 second)
- Run 5 iterations (i.e., the benchmark function is executed exactly 5 times)
Clearly, there isn't a "magic" option which will result in lower variance. Microbenchmarks are notorious for having high variance. E.g., in [1], the authors surveyed a large number of microbenchmarks executed in the cloud; their conclusion was to use at least 20 instances (i.e., VMs) to find slowdowns with high confidence.
So, we should aim to increase the number of independent runs as well as iterations. @herkolategan Let's capture these notes in the umbrella issue for automating bench runs in CI.
[1] https://www.ifi.uzh.ch/dam/jcr:326f9543-d719-4e8c-a27e-d5a5cace1abf/emse_smb_cloud.pdf
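Concretely, combining more independent runs with longer bench times might look like the following. This is a non-runnable command fragment, not the team's actual workflow: the package path, file names, and use of benchstat (from golang.org/x/perf) are illustrative assumptions, and the commands assume the repo is checked out at each SHA in turn.

```shell
# At the old SHA: 10 independent runs, each with a longer bench time.
go test -run=- -bench=. -benchtime=10s -count=10 ./pkg/sql/... > old.txt
# At the new SHA: same flags, then compare. benchstat reports deltas
# with p-values once it has enough samples on each side.
go test -run=- -bench=. -benchtime=10s -count=10 ./pkg/sql/... > new.txt
benchstat old.txt new.txt
```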
Signing off on the
Signing off on all
89461: execinfra: remove possible logging for each output row of processors r=yuzefovich a=yuzefovich

This commit removes possible logging (hidden behind level 3 verbosity) for each row that flows through `ProcessRowHelper` (which effectively all processors use). The verbosity check itself has a non-trivial performance cost in this case. I think it's ok to remove it given that this logging was commented out until ea559df, and I don't recall ever having a desire to see all of the rows.

```
name              old time/op    new time/op     delta
Noop/cols=1-24       907µs ± 0%      749µs ± 0%  -17.43%  (p=0.000 n=9+9)
Noop/cols=2-24       906µs ± 0%      748µs ± 0%  -17.44%  (p=0.000 n=9+10)
Noop/cols=4-24       908µs ± 0%      748µs ± 0%  -17.64%  (p=0.000 n=10+9)
Noop/cols=16-24      908µs ± 0%      749µs ± 0%  -17.57%  (p=0.000 n=10+10)
Noop/cols=256-24     911µs ± 0%      751µs ± 0%  -17.50%  (p=0.000 n=10+10)

name              old speed      new speed       delta
Noop/cols=1-24     578MB/s ± 0%    700MB/s ± 0%  +21.12%  (p=0.000 n=9+9)
Noop/cols=2-24    1.16GB/s ± 0%   1.40GB/s ± 0%  +21.13%  (p=0.000 n=9+10)
Noop/cols=4-24    2.31GB/s ± 0%   2.80GB/s ± 0%  +21.41%  (p=0.000 n=10+9)
Noop/cols=16-24   9.24GB/s ± 0%  11.20GB/s ± 0%  +21.32%  (p=0.000 n=10+10)
Noop/cols=256-24   147GB/s ± 0%    179GB/s ± 0%  +21.22%  (p=0.000 n=10+10)

name              old alloc/op   new alloc/op    delta
Noop/cols=1-24      1.45kB ± 0%     1.45kB ± 0%   -0.07%  (p=0.000 n=10+10)
Noop/cols=2-24      1.46kB ± 0%     1.46kB ± 0%   -0.07%  (p=0.000 n=10+9)
Noop/cols=4-24      1.47kB ± 0%     1.47kB ± 0%   -0.07%  (p=0.000 n=10+10)
Noop/cols=16-24     1.57kB ± 0%     1.57kB ± 0%   -0.06%  (p=0.000 n=10+10)
Noop/cols=256-24    3.49kB ± 0%     3.49kB ± 0%   -0.03%  (p=0.000 n=10+10)

name              old allocs/op  new allocs/op   delta
Noop/cols=1-24        5.00 ± 0%       5.00 ± 0%     ~     (all equal)
Noop/cols=2-24        5.00 ± 0%       5.00 ± 0%     ~     (all equal)
Noop/cols=4-24        5.00 ± 0%       5.00 ± 0%     ~     (all equal)
Noop/cols=16-24       5.00 ± 0%       5.00 ± 0%     ~     (all equal)
Noop/cols=256-24      5.00 ± 0%       5.00 ± 0%     ~     (all equal)
```

Addresses: #87685.

Release note: None

Co-authored-by: Yahor Yuzefovich <[email protected]>
Should we close this out now that 22.2 has been cut?
This is a tracking bug for issues we want to investigate related to performance for 22.2, similar to what's been done for prior releases (#78592).
SQL
spreadsheets
NOTE: `importer` was skipped due to a panic and other errors.
spreadsheets
spreadsheets
KV
./pkg/kv/kvserver/gc
./pkg/kv/kvserver/raftentry
./pkg/kv/kvserver/rangefeed
./pkg/kv/kvserver/spanlatch
./pkg/kv/kvserver/tscache
spreadsheets
./pkg/roachpb
./pkg/rpc/...
./pkg/server/...
spreadsheets
NOTE: `BenchmarkGRPCDial` failed with context_test.go:1936: initial connection heartbeat failed: rpc error: code = Unimplemented desc = unknown service cockroach.rpc.Heartbeat
Bulk I/O
spreadsheets
NOTE: The results are empty because the benches don't suppress stdout, causing results to appear on subsequent lines. Same issue as in #78592.
Update: Managed to get output from `pkg/blobs` by temporarily applying Stan's fix (release: 22.1 microbenchmark regression investigation and sign-off #78592 (comment)).
Storage
spreadsheets
spreadsheets
NOTE: Various names for these benchmarks have changed between revisions, resulting in omission from the spreadsheet.
Misc
spreadsheets
NOTE: pkg/cli had to be skipped because it failed to build in the old version.
Benchmark notes:
`dev` supports setting up the correct `go` version for a revision; `benchdiff` does not and still needs to build the benchmark binaries. Thus a post-checkout script is required to switch to the appropriate `go` version for each revision (this only applies if the revisions require different `go` versions).
`pkg/storage` is more resource intensive than the other benchmarks and requires at least 200 GB of available disk space as well as 64 GB of memory. `pkg/storage` also exceeds the default 1024 file limit per user, which needs to be adjusted for the benchmark to complete.
Issues:
`testdata` path issues when running from `benchdiff` (`BenchmarkOCFImport`, `BenchmarkBinaryJSONImport`)
`BenchmarkConvertToKVs`
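The go-version and file-limit notes above could be scripted roughly like this. This is a hypothetical sketch, not the team's actual post-checkout hook: the version parsing, the stand-in go.mod, and the limit value are all illustrative assumptions.

```shell
# Work in a scratch directory with a stand-in for the revision's go.mod.
dir=$(mktemp -d) && cd "$dir"
printf 'module demo\n\ngo 1.19\n' > go.mod
# Read the Go version this revision expects (a post-checkout script
# would use this to select the matching toolchain before benchdiff builds).
want=$(awk '/^go /{print $2}' go.mod)
echo "revision requires go $want"
# pkg/storage opens more files than the default per-user limit allows,
# so raise the soft limit before running it (value is illustrative).
ulimit -n 4096 2>/dev/null || true
ulimit -n
```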
Jira issue: CRDB-19475