Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

storage: unmarshaling descriptors makes 35% of garbage on cluster with many ranges #21702

Closed
jordanlewis opened this issue Jan 23, 2018 · 3 comments
Assignees
Labels
C-performance Perf of queries or internals. Solution not expected to change functional behavior.
Milestone

Comments

@jordanlewis
Copy link
Member

I noticed this while playing with the TPCC-1000 dataset against a cluster that exhibited #21689, which creates a ton of ranges. The cluster had around 50,000 ranges.

The metrics collector is supposed to iterate over every range every 10 seconds, collecting stats on it. One of the steps to collecting stats is to check the range zone config to see how many replicas it's supposed to have. Checking the zone config requires demarshalling the range's table descriptor and the table's zone config, if it has one, which is the source of the garbage.

The reason that this produces so much garbage in the end is that there's no caching for any of these lookups, so each of the 50,000 ranges causes a full table descriptor unmarshal.

We should figure out a way to fix this. Here are some ideas:

  1. Cache the results of getZoneConfigForID, which is a map from descriptor ID to zone config. The challenge here would be determining the correct eviction policy, its relation to descriptor changes from gossip, and so on.
  2. Simpler would be to memoize the results of the above call just for the lifetime of the scan over all of the ranges. This way I think we could avoid thinking about eviction.
  3. @benesch suggested adding an index to the system.namespace table on its config id column, which would permit looking up whether a range has a specialized zone config for its table or partition without having to demarshal its descriptor to check.

I uploaded the profiles I used to find this issue below. You can see that this ComputeMetrics flow takes 20% of the CPU of the cluster, which was at the time of profiling running a large COUNT(*). @danhhz rightly points out that you can't conclude much there since the cluster wasn't fully loaded, but it's still concerning in my opinion.

pprof.alloc_objects.alloc_space.inuse_objects.inuse_space.tpcc-1000-1node-count-stock.pb.gz
pprof.cockroach.samples.cpu.tpcc-1000-1node-count-stock.pb.gz

I think it would be great to get this fixed by 2.0. I can't prove it, but I suspect that this issue is one of the reasons why idle clusters eat up a lot of CPU, something that several users have asked about.

@jordanlewis jordanlewis added this to the 2.0 milestone Jan 23, 2018
@jordanlewis jordanlewis changed the title config: demarshalling ZoneConfigs makes 35% of garbage on cluster with many ranges config: demarshalling descriptors makes 35% of garbage on cluster with many ranges Jan 23, 2018
@nvanbenschoten nvanbenschoten self-assigned this Jan 23, 2018
@nvanbenschoten
Copy link
Member

I'll take this issue. Thanks for the detailed write-up!

@jordanlewis jordanlewis modified the milestones: 2.0, 2.1 Feb 22, 2018
@petermattis petermattis changed the title config: demarshalling descriptors makes 35% of garbage on cluster with many ranges storage: unmarshaling descriptors makes 35% of garbage on cluster with many ranges Jul 21, 2018
@petermattis petermattis added C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) A-kv-replication Relating to Raft, consensus, and coordination. labels Jul 21, 2018
@petermattis
Copy link
Collaborator

I noticed this last week when running the kv/splits/nodes=3 roachtest. Unmarshaling of zone configs showed up prominently on a cpu profile. There might be an easy win here.

@tbg tbg added C-performance Perf of queries or internals. Solution not expected to change functional behavior. A-coreperf and removed A-kv-replication Relating to Raft, consensus, and coordination. C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) labels Jul 31, 2018
@spencerkimball spencerkimball self-assigned this Aug 22, 2018
@spencerkimball
Copy link
Member

I'm working on this btw.

spencerkimball added a commit to spencerkimball/cockroach that referenced this issue Sep 12, 2018
…ization

This change allows a `SystemConfig` to cache object ID -> `ZoneConfig`
lookups to avoid repetitively deserializing table descriptors and zone
configs for every replica in the system. Replicas themselves now also
cache zone config entries to avoid maintaining per-replica min/max bytes
and for making future use of additional zone config parameters simpler
(e.g. zone config GC policies, number of replicas, etc.).

Fixes cockroachdb#21702
spencerkimball added a commit to spencerkimball/cockroach that referenced this issue Sep 12, 2018
…ization

This change allows a `SystemConfig` to cache object ID -> `ZoneConfig`
lookups to avoid repetitively deserializing table descriptors and zone
configs for every replica in the system. Replicas themselves now also
cache zone config entries to avoid maintaining per-replica min/max bytes
and for making future use of additional zone config parameters simpler
(e.g. zone config GC policies, number of replicas, etc.).

Fixes cockroachdb#21702
spencerkimball added a commit to spencerkimball/cockroach that referenced this issue Sep 18, 2018
…ization

This change allows a `SystemConfig` to cache object ID -> `ZoneConfig`
lookups to avoid repetitively deserializing table descriptors and zone
configs for every replica in the system. Replicas themselves now also
cache zone config entries to avoid maintaining per-replica min/max bytes
and for making future use of additional zone config parameters simpler
(e.g. zone config GC policies, number of replicas, etc.).

Fixes cockroachdb#21702
spencerkimball added a commit to spencerkimball/cockroach that referenced this issue Sep 19, 2018
…ization

This change allows a `SystemConfig` to cache object ID -> `ZoneConfig`
lookups to avoid repetitively deserializing table descriptors and zone
configs for every replica in the system. Replicas themselves now also
cache zone config entries to avoid maintaining per-replica min/max bytes
and for making future use of additional zone config parameters simpler
(e.g. zone config GC policies, number of replicas, etc.).

Fixes cockroachdb#21702
craig bot pushed a commit that referenced this issue Sep 19, 2018
30143: config/storage: cache zone config values to avoid repetitive deserialization r=spencerkimball a=spencerkimball

This change allows a `SystemConfig` to cache object ID -> `ZoneConfig`
lookups to avoid repetitively deserializing table descriptors and zone
configs for every replica in the system. Replicas themselves now also
cache zone config entries to avoid maintaining per-replica min/max bytes
and for making future use of additional zone config parameters simpler
(e.g. zone config GC policies, number of replicas, etc.).

Fixes #21702

Co-authored-by: Spencer Kimball <[email protected]>
@craig craig bot closed this as completed in #30143 Sep 19, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-performance Perf of queries or internals. Solution not expected to change functional behavior.
Projects
None yet
Development

No branches or pull requests

5 participants