storage: replication reports generation takes 20% of a node's CPU for 200k ranges #41609

andreimatei · 2019-10-15T16:56:46Z

On a TPCC-100k run, with partitioning, the following profile shows report generation (at the left) taking a bunch of CPU. The generation of each report seems to take minutes.

We're unmarshalling the same zone protos over and over again.

andreimatei · 2019-10-17T00:41:23Z

I hope to have a fix for this tomorrow.

This patch is expected to speedup reports generation considerably by not unmarshalling zone config protos for every range. Instead, the report-geneating visitors now keep state around the zone that a visited range is in and reuse that state if they're told that the following range is in the same zone. The range iteration infrastructure was enhanced to figure out when zones change from range to range and tell that to the visitors. Without this patch, generating the reports was pinning a core for minutes for a cluster with partitioning and 200k ranges. The profiles showed that it's all zone config unmarshalling. Fixes cockroachdb#41609 Release note (performance improvement): The performance of generating the system.replication_* reports was greatly improved for large clusters.

This patch is expected to speedup reports generation considerably by not unmarshalling zone config protos for every range. Instead, the report-geneating visitors now keep state around the zone that a visited range is in and reuse that state if they're told that the following range is in the same zone. The range iteration infrastructure was enhanced to figure out when zones change from range to range and tell that to the visitors. visitRanges() now figures out that runs of consecutive ranges belong to the same zone. This is done through a new zoneResolver struct, that's optimized for the case where it's asked to resolve ranges in key order. Without this patch, generating the reports was pinning a core for minutes for a cluster with partitioning and 200k ranges. The profiles showed that it's all zone config unmarshalling. Fixes cockroachdb#41609 Release note (performance improvement): The performance of generating the system.replication_* reports was greatly improved for large clusters.

41684: importccl: add 19.2 version gate check r=miretskiy a=miretskiy Ensure the cluster is fully upgraded when running import. Release note: ensure cluster fully upgraded when running import. 41711: storage/report: don't deserialize zone configs over and over r=andreimatei a=andreimatei This patch is expected to speedup reports generation considerably by not unmarshalling zone config protos for every range. Instead, the report-geneating visitors now keep state around the zone that a visited range is in and reuse that state if they're told that the following range is in the same zone. The range iteration infrastructure was enhanced to figure out when zones change from range to range and tell that to the visitors. Without this patch, generating the reports was pinning a core for minutes for a cluster with partitioning and 200k ranges. The profiles showed that it's all zone config unmarshalling. Fixes #41609 Release note (performance improvement): The performance of generating the system.replication_* reports was greatly improved for large clusters. 41761: storage: write to local storage before updating liveness r=bdarnell a=irfansharif Previously a disk stall could allow a node to continue heartbeating its liveness record and prevent other nodes from taking over its leases, despite being completely unresponsive. This was first addressed in #24591 (+ #33122). This was undone in #32978 (which introduced a stricter version of a similar check). #32978 was later disabled by default in #36484, leaving us without the protections first introduced in #24591. This PR re-adds the logic from #24591. Part of #41683. Release note: None. Co-authored-by: Yevgeniy Miretskiy <[email protected]> Co-authored-by: Andrei Matei <[email protected]> Co-authored-by: irfan sharif <[email protected]>

This patch is expected to speedup reports generation considerably by not unmarshalling zone config protos for every range. Instead, the report-geneating visitors now keep state around the zone that a visited range is in and reuse that state if they're told that the following range is in the same zone. The range iteration infrastructure was enhanced to figure out when zones change from range to range and tell that to the visitors. visitRanges() now figures out that runs of consecutive ranges belong to the same zone. This is done through a new zoneResolver struct, that's optimized for the case where it's asked to resolve ranges in key order. Without this patch, generating the reports was pinning a core for minutes for a cluster with partitioning and 200k ranges. The profiles showed that it's all zone config unmarshalling. Fixes cockroachdb#41609 Release note (performance improvement): The performance of generating the system.replication_* reports was greatly improved for large clusters.

andreimatei added C-performance Perf of queries or internals. Solution not expected to change functional behavior. A-kv-replication Relating to Raft, consensus, and coordination. labels Oct 15, 2019

andreimatei self-assigned this Oct 15, 2019

andy-kimball mentioned this issue Oct 15, 2019

19.2 release blockers list #40447

Closed

53 tasks

andreimatei mentioned this issue Oct 17, 2019

storage/report: don't deserialize zone configs over and over #41711

Merged

andreimatei mentioned this issue Oct 21, 2019

release-19.2: storage/report: don't deserialize zone configs over and over #41757

Merged

craig bot closed this as completed in e854f99 Oct 21, 2019

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

storage: replication reports generation takes 20% of a node's CPU for 200k ranges #41609

storage: replication reports generation takes 20% of a node's CPU for 200k ranges #41609

andreimatei commented Oct 15, 2019

andreimatei commented Oct 17, 2019

storage: replication reports generation takes 20% of a node's CPU for 200k ranges #41609

storage: replication reports generation takes 20% of a node's CPU for 200k ranges #41609

Comments

andreimatei commented Oct 15, 2019

andreimatei commented Oct 17, 2019