storage: unmarshaling descriptors makes 35% of garbage on cluster with many ranges #21702
Labels: C-performance (Perf of queries or internals. Solution not expected to change functional behavior.)
jordanlewis changed the title from "config: demarshalling ZoneConfigs makes 35% of garbage on cluster with many ranges" to "config: demarshalling descriptors makes 35% of garbage on cluster with many ranges" (Jan 23, 2018)
I'll take this issue. Thanks for the detailed write-up!
petermattis changed the title from "config: demarshalling descriptors makes 35% of garbage on cluster with many ranges" to "storage: unmarshaling descriptors makes 35% of garbage on cluster with many ranges" (Jul 21, 2018)
petermattis added the C-enhancement (Solution expected to add code/behavior + preserve backward-compat; pg compat issues are exception) and A-kv-replication (Relating to Raft, consensus, and coordination.) labels (Jul 21, 2018)
I noticed this last week when running the …
tbg added the C-performance (Perf of queries or internals. Solution not expected to change functional behavior.) and A-coreperf labels and removed the A-kv-replication and C-enhancement labels (Jul 31, 2018)
I'm working on this btw.
spencerkimball added commits to spencerkimball/cockroach that referenced this issue on Sep 12, Sep 18, and Sep 19, 2018, all with the same message:

config/storage: cache zone config values to avoid repetitive deserialization

This change allows a `SystemConfig` to cache object ID -> `ZoneConfig` lookups to avoid repetitively deserializing table descriptors and zone configs for every replica in the system. Replicas themselves now also cache zone config entries to avoid maintaining per-replica min/max bytes and for making future use of additional zone config parameters simpler (e.g. zone config GC policies, number of replicas, etc.). Fixes cockroachdb#21702
craig bot pushed a commit that referenced this issue on Sep 19, 2018:

30143: config/storage: cache zone config values to avoid repetitive deserialization r=spencerkimball a=spencerkimball

This change allows a `SystemConfig` to cache object ID -> `ZoneConfig` lookups to avoid repetitively deserializing table descriptors and zone configs for every replica in the system. Replicas themselves now also cache zone config entries to avoid maintaining per-replica min/max bytes and for making future use of additional zone config parameters simpler (e.g. zone config GC policies, number of replicas, etc.). Fixes #21702

Co-authored-by: Spencer Kimball <[email protected]>
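As a rough, hypothetical sketch of the replica-side caching the PR description above mentions (this is not the actual patch; the `Replica` and `ZoneConfig` types here are simplified stand-ins), the idea is that a replica holds a pointer to an already-decoded zone config instead of copying min/max bytes out of it, and the pointer is swapped when a new system config arrives:

```go
package main

import (
	"fmt"
	"sync"
)

// ZoneConfig is a simplified stand-in for the fields a replica cares about.
type ZoneConfig struct {
	NumReplicas   int
	RangeMinBytes int64
	RangeMaxBytes int64
}

// Replica caches a pointer to its zone config rather than per-field copies.
type Replica struct {
	mu   sync.Mutex
	zone *ZoneConfig // shared, read-only; replaces per-replica min/max bytes fields
}

// SetZoneConfig is what a store-level "system config updated" callback would
// call for each replica after decoding the zone config once.
func (r *Replica) SetZoneConfig(zc *ZoneConfig) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.zone = zc
}

// NeededReplicas is the kind of read the metrics pass performs; it no longer
// unmarshals anything, it just dereferences the cached config.
func (r *Replica) NeededReplicas() int {
	r.mu.Lock()
	defer r.mu.Unlock()
	return r.zone.NumReplicas
}

func main() {
	zc := &ZoneConfig{NumReplicas: 3, RangeMinBytes: 1 << 20, RangeMaxBytes: 64 << 20}
	r := &Replica{}
	r.SetZoneConfig(zc) // one decode, shared by every replica in the zone
	fmt.Println("needed replicas:", r.NeededReplicas())
}
```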
Original issue description:
I noticed this while playing with the TPCC-1000 dataset against a cluster that exhibited #21689, which creates a ton of ranges. The cluster had around 50,000 ranges.
The metrics collector is supposed to iterate over every range every 10 seconds, collecting stats on it. One of the steps in collecting stats is checking the range's zone config to see how many replicas it's supposed to have. Checking the zone config requires demarshalling the range's table descriptor and the table's zone config, if it has one, which is the source of the garbage.
The reason that this produces so much garbage in the end is that there's no caching for any of these lookups, so each of the 50,000 ranges causes a full table descriptor unmarshal.
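To make the allocation pattern concrete, here is a toy Go sketch (not CockroachDB code; the `descriptor` type and the use of `encoding/json` in place of the protobuf unmarshal are stand-ins) of what it means for every range's metrics pass to re-decode the same descriptor bytes:

```go
package main

import (
	"encoding/json" // stand-in for the protobuf unmarshal in the real code
	"fmt"
)

// descriptor is a toy stand-in for a table descriptor / zone config proto.
type descriptor struct {
	ID          int
	NumReplicas int
}

func main() {
	raw, err := json.Marshal(descriptor{ID: 51, NumReplicas: 3})
	if err != nil {
		panic(err)
	}

	// With ~50,000 ranges and a 10-second metrics interval, this loop runs
	// continuously; each iteration allocates a fresh descriptor (and, in the
	// real code, all of its nested fields), which is the garbage the profiles
	// attribute to unmarshaling.
	const numRanges = 50000
	for i := 0; i < numRanges; i++ {
		var desc descriptor
		if err := json.Unmarshal(raw, &desc); err != nil {
			panic(err)
		}
		_ = desc.NumReplicas // the only value the metrics pass needs
	}
	fmt.Println("decoded the same descriptor", numRanges, "times")
}
```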
We should figure out a way to fix this. Here are some ideas:
- A cache for `getZoneConfigForID`, which is a map from descriptor ID to zone config. The challenge here would be determining the correct eviction policy, its relation to descriptor changes from gossip, and so on. (A minimal sketch of such a cache follows the profile links below.)
- An index on the `system.namespace` table on its config id column, which would permit looking up whether a range has a specialized zone config for its table or partition without having to demarshal its descriptor to check.

I uploaded the profiles I used to find this issue below. You can see that this ComputeMetrics flow takes 20% of the CPU of the cluster, which was at the time of profiling running a large `COUNT(*)`. @danhhz rightly points out that you can't conclude much there since the cluster wasn't fully loaded, but it's still concerning in my opinion.

pprof.alloc_objects.alloc_space.inuse_objects.inuse_space.tpcc-1000-1node-count-stock.pb.gz
pprof.cockroach.samples.cpu.tpcc-1000-1node-count-stock.pb.gz
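As a rough illustration of the first idea above, here is a minimal, hypothetical sketch of a descriptor ID -> zone config cache; the names (`zoneConfigCache`, `ZoneConfig`, the `slow` decode callback) are invented for the example and are not the actual CockroachDB API:

```go
package main

import (
	"fmt"
	"sync"
)

// ZoneConfig holds the handful of fields the metrics pass actually needs.
type ZoneConfig struct {
	NumReplicas   int
	RangeMinBytes int64
	RangeMaxBytes int64
}

// zoneConfigCache maps descriptor ID -> decoded zone config so the expensive
// descriptor/zone-config unmarshal runs at most once per ID per system config.
type zoneConfigCache struct {
	mu      sync.RWMutex
	entries map[uint32]*ZoneConfig
}

func newZoneConfigCache() *zoneConfigCache {
	return &zoneConfigCache{entries: map[uint32]*ZoneConfig{}}
}

// get returns the cached config for id, falling back to the slow decode path.
func (c *zoneConfigCache) get(id uint32, slow func(uint32) (*ZoneConfig, error)) (*ZoneConfig, error) {
	c.mu.RLock()
	zc, ok := c.entries[id]
	c.mu.RUnlock()
	if ok {
		return zc, nil
	}
	zc, err := slow(id) // the existing unmarshal-heavy lookup
	if err != nil {
		return nil, err
	}
	c.mu.Lock()
	c.entries[id] = zc
	c.mu.Unlock()
	return zc, nil
}

// clear is where the eviction question lands: the simplest policy is to drop
// everything whenever a new SystemConfig arrives via gossip.
func (c *zoneConfigCache) clear() {
	c.mu.Lock()
	c.entries = map[uint32]*ZoneConfig{}
	c.mu.Unlock()
}

func main() {
	cache := newZoneConfigCache()
	decodes := 0
	slow := func(id uint32) (*ZoneConfig, error) {
		decodes++ // stands in for unmarshaling the descriptor and zone config
		return &ZoneConfig{NumReplicas: 3}, nil
	}
	for i := 0; i < 50000; i++ { // one metrics pass over 50,000 ranges of one table
		if _, err := cache.get(51, slow); err != nil {
			panic(err)
		}
	}
	fmt.Println("decodes:", decodes) // 1 instead of 50000
	cache.clear()                    // e.g. on a gossiped config update
}
```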
I think it would be great to get this fixed by 2.0. I can't prove it, but I suspect that this issue is one of the reasons why idle clusters eat up a lot of CPU, something that several users have asked about.