Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

kvcoord: ReturnOnRangeBoundary allocs regression #99027

Closed
nvanbenschoten opened this issue Mar 20, 2023 · 6 comments
Closed

kvcoord: ReturnOnRangeBoundary allocs regression #99027

nvanbenschoten opened this issue Mar 20, 2023 · 6 comments
Assignees
Labels
A-kv Anything in KV that doesn't belong in a more specific category. branch-master Failures and bugs on the master branch. branch-release-23.1 Used to mark GA and release blockers, technical advisories, and bugs for 23.1 C-performance Perf of queries or internals. Solution not expected to change functional behavior. GA-blocker T-kv KV Team

Comments

@nvanbenschoten
Copy link
Member

nvanbenschoten commented Mar 20, 2023

See #98068 and https://docs.google.com/spreadsheets/d/10GhYr_91CANCNKOM_gPy7Sk9hQkTyQGNgNwNgfHeUtI/edit#gid=4.

benchdiff --old=81a114c --new=master --post-checkout='dev generate go' --run=BenchmarkReturnOnRangeBoundary ./pkg/kv/kvclient/kvcoord

Jira issue: CRDB-25668

@nvanbenschoten nvanbenschoten added C-performance Perf of queries or internals. Solution not expected to change functional behavior. A-kv Anything in KV that doesn't belong in a more specific category. branch-master Failures and bugs on the master branch. GA-blocker T-kv KV Team branch-release-23.1 Used to mark GA and release blockers, technical advisories, and bugs for 23.1 labels Mar 20, 2023
@nvanbenschoten
Copy link
Member Author

The regression in allocs/op looks real:

$ benchdiff --old=81a114c --new=master --post-checkout='dev generate go' --run=BenchmarkReturnOnRangeBoundary ./pkg/kv/kvclient/kvcoord
test binaries already exist for '81a114c'; skipping build
test binaries already exist for '85c6e38'; skipping build

  pkg=1/1 iter=10/10 cockroachdb/cockroach/pkg/kv/kvclient/kvcoord |

name                      old time/op    new time/op    delta
ReturnOnRangeBoundary-30     346µs ± 2%     366µs ± 2%   +5.83%  (p=0.000 n=10+10)

name                      old alloc/op   new alloc/op   delta
ReturnOnRangeBoundary-30    86.1kB ± 0%    91.1kB ± 0%   +5.82%  (p=0.000 n=10+10)

name                      old allocs/op  new allocs/op  delta
ReturnOnRangeBoundary-30       663 ± 0%       729 ± 0%  +10.02%  (p=0.000 n=8+10)

@kvoli
Copy link
Collaborator

kvoli commented Apr 11, 2023

Reproduced the regression on master:

$ benchdiff --old=81a114c --new=master --post-checkout='rm -rf pkg/ui/yarn-vendor && rm -rf vendor && dev gen && go mod vendor' --run=BenchmarkReturnOnRangeBoundary --memprofile --count=100 ./pkg/kv/kvclient/kvcoord
test binaries already exist for '81a114c'; skipping build
test binaries already exist for 'a5d61d2'; skipping build

  pkg=1/1 iter=100/100 cockroachdb/cockroach/pkg/kv/kvclient/kvcoord |
1
name                      old time/op    new time/op    delta
ReturnOnRangeBoundary-16     670µs ± 4%     721µs ± 6%   +7.54%  (p=0.000 n=94+95)

name                      old alloc/op   new alloc/op   delta
ReturnOnRangeBoundary-16    90.4kB ± 1%    98.4kB ± 4%   +8.88%  (p=0.000 n=94+97)

name                      old allocs/op  new allocs/op  delta
ReturnOnRangeBoundary-16       721 ± 1%       845 ± 6%  +17.20%  (p=0.000 n=94+97)

return_on_range_boundary-mem.zip

However I'm not seeing an increase when checking alloc_objets with:

go tool pprof -http localhost:8083 -diff_base ./benchdiff/81a114c/artifacts/mem.prof ./benchdiff/a5d61d2/artifacts/mem.prof

Actually the opposite:

image

For space, perhaps the batch request shallow copy, however again it appears mostly better on the request path. There is certainly more SQL overhead however.

image

@kvoli
Copy link
Collaborator

kvoli commented Apr 18, 2023

@nvanbenschoten could you double check my interpretation in the comment above? I'm not seeing the regression in the profiles to match the benchmarked regression.

@nvanbenschoten
Copy link
Member Author

@kvoli I was similarly unable to parse these heap profiles and make sense of them. There did seem to be some regression in heap allocations within Prometheus histograms, but it was inconclusive.

I bisected a regression in heap allocations and came to d6d4ccb. However, that's a known regression which was fixed in ca76c28. After the fix, the regression in allocs/op appeared to be more gradual.

The regression also did not seem to be entirely synchronous with the benchmark loop. The heap profile shows this. I also saw this by sleeping for 5 seconds after server startup — heap allocs dropped by about 10%. Given that we're not seeing this regression show up on macro-benchmarks, maybe we just take one more pass at this and then close it out?

@kvoli
Copy link
Collaborator

kvoli commented Apr 19, 2023

Sounds good, I'll take one more look through and close it out if nothing new arises.

@kvoli
Copy link
Collaborator

kvoli commented Apr 19, 2023

I took another stab at running and glancing at profiles.

The regression also did not seem to be entirely synchronous with the benchmark loop.

It would be nice if there were some way to exclude these later allocs from the benchmark. Also, excluding from the profile. In any case, I'm still seeing the same regression when using the 23.1.0 release branch commit whilst not seeing a regression in the profile.

Closing.

@kvoli kvoli closed this as completed Apr 19, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-kv Anything in KV that doesn't belong in a more specific category. branch-master Failures and bugs on the master branch. branch-release-23.1 Used to mark GA and release blockers, technical advisories, and bugs for 23.1 C-performance Perf of queries or internals. Solution not expected to change functional behavior. GA-blocker T-kv KV Team
Projects
None yet
Development

No branches or pull requests

2 participants