backup/restore: revisit various rate limitings #14798

Closed
danhhz opened this issue Apr 11, 2017 · 0 comments
danhhz commented Apr 11, 2017

To stabilize BACKUP and RESTORE for 1.0, we had to add a few rate limits. These feel a bit like papering over the underlying issues, so they should be revisited and better solutions found post-1.0.

Specifically:

  • Sending AdminSplit in presplitRanges for RESTORE
  • Sending Export and Import requests in BACKUP/RESTORE
  • Sending WriteBatch requests in Import
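
As a rough illustration of the kind of limiting being discussed (not the actual CockroachDB code), here is a minimal Go sketch that paces AdminSplit-style sends with a token-bucket limiter from golang.org/x/time/rate; the presplitRate constant and the doAdminSplit helper are assumptions made up for the example:

```go
package main

import (
	"context"
	"fmt"

	"golang.org/x/time/rate"
)

// presplitRate is a stand-in for the kind of tunable constant this issue
// wants to revisit: at most 20 AdminSplit sends per second, no bursting.
const presplitRate = 20

func presplitRanges(ctx context.Context, splitKeys []string) error {
	limiter := rate.NewLimiter(rate.Limit(presplitRate), 1)
	for _, key := range splitKeys {
		// Wait blocks until the limiter permits another send (or ctx is done).
		if err := limiter.Wait(ctx); err != nil {
			return err
		}
		if err := doAdminSplit(ctx, key); err != nil {
			return err
		}
	}
	return nil
}

// doAdminSplit stands in for sending an AdminSplit request for one key.
func doAdminSplit(ctx context.Context, key string) error {
	fmt.Println("AdminSplit at", key)
	return nil
}

func main() {
	keys := []string{"/Table/51/1/100", "/Table/51/1/200", "/Table/51/1/300"}
	if err := presplitRanges(context.Background(), keys); err != nil {
		fmt.Println("error:", err)
	}
}
```

Whether the right knob is a requests-per-second rate (as above) or a bound on concurrent requests is exactly the sort of question this issue leaves open.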
@danhhz danhhz added this to the Later milestone Apr 11, 2017
@danhhz danhhz self-assigned this Apr 11, 2017
danhhz added a commit to danhhz/cockroach that referenced this issue Apr 11, 2017
We're already limiting these on the server side, but the BACKUP/RESTORE
gateway would fill up its DistSender/gRPC/something and cause all sorts
of badness (node liveness timeouts leading to mass leaseholder
transfers, poor performance on SQL workloads, etc.), as well as log spam
about slow DistSender requests.

There is likely a better fix post-1.0; this is being tracked in cockroachdb#14798.

For cockroachdb#14792.
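
For context, a sketch of this kind of client-side throttle, bounding how many Export/Import/WriteBatch-style requests the gateway keeps in flight at once via a buffered-channel semaphore; maxInFlight and sendRequest are hypothetical names, not the real implementation:

```go
package main

import (
	"context"
	"fmt"
	"sync"
)

const maxInFlight = 5 // made-up bound on concurrent outstanding requests

func sendAll(ctx context.Context, requests []string) error {
	sem := make(chan struct{}, maxInFlight)
	errs := make(chan error, len(requests))
	var wg sync.WaitGroup

	for _, req := range requests {
		select {
		case sem <- struct{}{}: // blocks once maxInFlight sends are outstanding
		case <-ctx.Done():
			return ctx.Err()
		}
		wg.Add(1)
		go func(req string) {
			defer wg.Done()
			defer func() { <-sem }()
			errs <- sendRequest(ctx, req)
		}(req)
	}
	wg.Wait()
	close(errs)
	for err := range errs {
		if err != nil {
			return err
		}
	}
	return nil
}

// sendRequest stands in for issuing an Export, Import, or WriteBatch request.
func sendRequest(ctx context.Context, req string) error {
	fmt.Println("sending", req)
	return nil
}

func main() {
	reqs := []string{"Export[a,b)", "Export[b,c)", "Import[a,b)", "WriteBatch[a,b)"}
	if err := sendAll(context.Background(), reqs); err != nil {
		fmt.Println("error:", err)
	}
}
```

Bounding in-flight work on the gateway keeps the client from overwhelming its own DistSender/gRPC layers even when the server side is already limited.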
danhhz added a commit to danhhz/cockroach that referenced this issue May 1, 2017
With syncing after every Raft commit enabled (the default recently
switched to true in 801f3a0), presplits on sufficiently large
restores were behaving badly.

I suspect this was overloading the disk, slowing down rocksdb commits
(similar to the underlying cause addressed by our recent Import rate
limiting changes). Unfortunately, this is not entirely confirmed by our
metrics. Disk iops and usage were slightly elevated during the bad
periods, but not really beyond what happens in a healthy cluster with
block_writer traffic.

However, reducing this constant from 100 to 20 clearly fixes the split
badness in the 2TB restore acceptance test (while making the splits take
8 minutes, up from ~3). This is unfortunate, as it's possible to go
faster in other circumstances, but there's already an issue to revisit
this rate limiting post-1.0 (cockroachdb#14798).
danhhz added a commit to danhhz/cockroach that referenced this issue Jul 21, 2017
This was largely motivated by how long it takes to presplit in very
large restores. Doing them all upfront required rate-limiting, but no
number was well-tuned for every cluster. Additionally, even at a high
presplit rate of 100, a 10 TB cluster would take 52 minutes before it
even started scattering.

Now, one goroutine iterates through every span being imported,
presplitting and scattering before moving on to the next one. Upon
split+scatter, the span is sent into a buffered channel read by the
Import goroutines, which prevents the split/scatter work from getting
too far ahead of the Imports. This both acts as a natural rate limiter
for the splits and bounds the number of empty ranges created if a
RESTORE fails or is cancelled.

Overall tpch-10 RESTORE time remains 12:30 on a 4-node cluster.

Since each range is now scattered individually, we no longer need the
jitter in the scatter implementation (plus it now slows down the
RESTORE), so it's removed.

Restore really needs a refactor, but I'm going to be making a couple
more changes leading up to 1.1, so I'll leave cleanup until after they
go in.

This removes most tunable constants in RESTORE; the remaining ones are
defined in terms of the number of nodes in the cluster and the number
of CPUs on a node. As a result, this:

Closes cockroachdb#14798.
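
A minimal sketch of the split/scatter-then-import pipeline described in this commit message, assuming hypothetical helpers (splitAndScatter, importSpan) and a made-up channel buffer size; it illustrates the shape of the approach, not the actual RESTORE code:

```go
package main

import (
	"context"
	"fmt"
	"runtime"

	"golang.org/x/sync/errgroup"
)

type span struct{ start, end string }

func restoreSpans(ctx context.Context, spans []span) error {
	g, gctx := errgroup.WithContext(ctx)

	// Small buffer: split/scatter can only get this far ahead of the imports,
	// which also bounds the empty ranges left behind if the job is cancelled.
	readySpans := make(chan span, 4)

	// Producer: presplit and scatter each span in order, then hand it off.
	g.Go(func() error {
		defer close(readySpans)
		for _, sp := range spans {
			if err := splitAndScatter(gctx, sp); err != nil {
				return err
			}
			select {
			case readySpans <- sp:
			case <-gctx.Done():
				return gctx.Err()
			}
		}
		return nil
	})

	// Consumers: one import worker per CPU (a made-up sizing choice here,
	// loosely echoing defining the tunables by node and CPU counts).
	for i := 0; i < runtime.NumCPU(); i++ {
		g.Go(func() error {
			for sp := range readySpans {
				if err := importSpan(gctx, sp); err != nil {
					return err
				}
			}
			return nil
		})
	}
	return g.Wait()
}

// splitAndScatter stands in for sending AdminSplit and Scatter requests.
func splitAndScatter(ctx context.Context, sp span) error {
	fmt.Println("split+scatter at", sp.start)
	return nil
}

// importSpan stands in for sending the Import request for one span.
func importSpan(ctx context.Context, sp span) error {
	fmt.Println("import", sp.start, "-", sp.end)
	return nil
}

func main() {
	spans := []span{{"a", "b"}, {"b", "c"}, {"c", "d"}}
	if err := restoreSpans(context.Background(), spans); err != nil {
		fmt.Println("error:", err)
	}
}
```

The channel capacity is what replaces the old fixed presplit rate: splits proceed only as fast as imports drain the queue, so the limiter tunes itself to the cluster.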