Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sql/catalog/descs: fix perf regression #72740

Merged
merged 3 commits into from
Nov 15, 2021

Conversation

ajwerner
Copy link
Contributor

This commit in #71936 had the unfortunate side-effect of allocating and forcing reads on the uncommittedDescriptors set even when we aren't looking for the system database. This has an outsized impact on the performance of the single-node, high-core-count KV runs. Instead of always initializing the system database, just do it when we access it.

name             old ops/s   new ops/s   delta
KV95-throughput  88.6k ± 0%  94.8k ± 1%   +7.00%  (p=0.008 n=5+5)

name             old ms/s    new ms/s    delta
KV95-P50          1.60 ± 0%   1.40 ± 0%  -12.50%  (p=0.008 n=5+5)
KV95-Avg          0.60 ± 0%   0.50 ± 0%  -16.67%  (p=0.008 n=5+5)

The second commit is more speculative and came from looking at a profile where 1.6% of the allocated garbage was due to that NameInfo even though we'll never, ever hit it.

Screen Shot 2021-11-15 at 12 57 31 AM

Fixes #72499

@ajwerner ajwerner requested review from postamar and a team November 15, 2021 05:58
@cockroach-teamcity
Copy link
Member

This change is Reviewable

```
name             old ops/s   new ops/s   delta
KV95-throughput  88.6k ± 0%  94.8k ± 1%   +7.00%  (p=0.008 n=5+5)

name             old ms/s    new ms/s    delta
KV95-P50          1.60 ± 0%   1.40 ± 0%  -12.50%  (p=0.008 n=5+5)
KV95-Avg          0.60 ± 0%   0.50 ± 0%  -16.67%  (p=0.008 n=5+5)
```

Fixes cockroachdb#72499.

Release note: None
@ajwerner ajwerner force-pushed the ajwerner/fix-perf-regression branch from f9eb756 to fdc1c88 Compare November 15, 2021 13:08
@ajwerner
Copy link
Contributor Author

With the second commit:

name             old ops/s   new ops/s   delta
KV95-throughput  88.7k ± 0%  95.7k ± 0%   +7.82%  (p=0.008 n=5+5)

name             old ms/s    new ms/s    delta
KV95-P50          1.60 ± 0%   1.40 ± 0%  -12.50%  (p=0.008 n=5+5)
KV95-Avg          0.60 ± 0%   0.50 ± 0%  -16.67%  (p=0.008 n=5+5)

This committ takes two steps to decrease lock contention. Firstly, it makes
the name cache data structure concurrent for readers with a RWLock. This
goes a pretty long way to diminish contention on the acquisition path. The
second change is to reduce the contention footprint when mucking with the
refcounts by doing less work in the common case.

Benchmark `kv95/nodes=1/cpu=32` delta over previous commit:
```
name             old ops/s   new ops/s   delta
KV95-throughput  96.4k ± 0%  99.6k ± 1%  +3.36%  (p=0.008 n=5+5)

name             old ms/s    new ms/s    delta
KV95-P50          1.40 ± 0%   1.40 ± 0%    ~     (all equal)
KV95-Avg          0.50 ± 0%   0.50 ± 0%    ~     (all equal)
```

Release note: None
Copy link
Collaborator

@fqazi fqazi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1 of 1 files at r1, 2 of 2 files at r2, 4 of 4 files at r3, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @postamar)

Copy link
Contributor

@postamar postamar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This all seems uncontroversially good to me.

A thought popped to mind after seeing your handling of the system db: this is undoubtedly worth it because the system db descriptor shows up in the uncommitted descriptor set in a great many transactions, but how about extending this treatment to all descriptors which are accessed only for reads? Such as things are, we always stuff them in the uncommitted descriptor set regardless if we want them mutable or not. It seems to me that we could break down the uncommitted layer in the collection into two sub-layers: a "cached immutable" layer and a real "uncommitted" layer. Reading a descriptor as mutable would evict it from the "cached immutable" layer. The latter could be pre-populated with the system db descriptor. Is there anything preventing us from doing this?

@ajwerner
Copy link
Contributor Author

Such as things are, we always stuff them in the uncommitted descriptor set regardless if we want them mutable or not. It seems to me that we could break down the uncommitted layer in the collection into two sub-layers: a "cached immutable" layer and a real "uncommitted" layer. Reading a descriptor as mutable would evict it from the "cached immutable" layer. The latter could be pre-populated with the system db descriptor. Is there anything preventing us from doing this?

Nothing preventing us from doing this. Will pull it into a separate PR.

@ajwerner
Copy link
Contributor Author

Flaked on #72718.

bors r+

@craig
Copy link
Contributor

craig bot commented Nov 15, 2021

Build succeeded:

@craig craig bot merged commit 47a2e2f into cockroachdb:master Nov 15, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

roachperf: kv95 and kv0 regression around October 28th
4 participants