
scatter: downreplication takes a long time #17000

Closed

benesch opened this issue Jul 12, 2017 · 10 comments
Labels
A-kv-replication Relating to Raft, consensus, and coordination. C-cleanup Tech debt, refactors, loose ends, etc. Solution not expected to significantly change behavior.

Comments


benesch commented Jul 12, 2017

Discovered while implementing a proper scatter in #16249. AdminScatter is able to upreplicate very quickly (e.g., in a few minutes), but the replicate queue can take ten minutes to downreplicate. We should figure out why; @a-robinson claims the downreplication should take approximately as long as the upreplication.

This behavior is easy to reproduce with a four-node local cluster with preexisting data: https://github.com/benesch/crdb-playground-scatter

a-robinson commented:

Was this fixed by your DeleteRange change?


benesch commented Jul 17, 2017

Don't think so, but you'd know better. Here's the downreplication after a 15GB restore with ~1k ranges.

[screenshot 2017-07-17 14:44:56]

Takes about 10m.


benesch commented Jul 17, 2017

And here's a screenshot that's, like, actually useful:

[screenshot 2017-07-17 14:48:17]


benesch commented Jul 17, 2017

Try not to look too hard at the leaseholders per store graph.

a-robinson commented:

Assuming I'm squinting properly at that graph, it looks like it took around 7 minutes, so I guess it's still not fixed.


benesch commented Jul 17, 2017

Yeah, sorry, it's ~10m. How long would you expect it to take?

@a-robinson a-robinson added this to the Later milestone Jul 17, 2017
a-robinson commented:

About the same amount of time as up-replicating. The problem, though, is that the replicas-per-store graph is the wrong one to be looking at: it includes not just the down-replication time but also the replica GC time, because the metric behind it only gets decremented when a replica gets GC'ed, not when it gets removed from the consensus group. Usually those events are close together, but they aren't guaranteed to be. The "Range Operations" graph would be more useful.

I'll check it out sometime, but it doesn't seem urgent.
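
For illustration, here's a minimal Go sketch (not CockroachDB's actual code; all names are made up) of the metric behavior described above: the gauge behind the replicas-per-store graph only moves when the replica GC step runs, not when the replica is removed from the consensus group, so the graph can overstate the down-replication time.

```go
// Minimal sketch of why the replicas-per-store graph lags down-replication.
// All types and names here are illustrative, not CockroachDB internals.
package main

import "fmt"

// store carries a stand-in for the gauge behind the replicas-per-store graph.
type store struct {
	replicaGauge int
	gcQueue      []int // range IDs whose local data still awaits replica GC
}

// removeReplica models the membership change: the replica leaves the
// consensus group immediately, but its data (and the gauge) are left for GC.
func (s *store) removeReplica(rangeID int) {
	s.gcQueue = append(s.gcQueue, rangeID)
	// Intentionally no gauge update here; the graph does not move yet.
}

// runReplicaGC models the replica GC queue destroying the leftover data;
// only now does the gauge, and therefore the graph, reflect the removal.
func (s *store) runReplicaGC() {
	for range s.gcQueue {
		s.replicaGauge--
	}
	s.gcQueue = nil
}

func main() {
	s := &store{replicaGauge: 3}
	s.removeReplica(42)
	fmt.Println("after removal from consensus group:", s.replicaGauge) // still 3
	s.runReplicaGC()
	fmt.Println("after replica GC:", s.replicaGauge) // now 2
}
```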


benesch commented Jul 17, 2017

That is incredibly useful to know. Yeah, no rush at all; I may look into it too.

@tbg tbg added the A-kv-replication Relating to Raft, consensus, and coordination. label May 15, 2018
@tbg tbg added the C-cleanup Tech debt, refactors, loose ends, etc. Solution not expected to significantly change behavior. label Jul 22, 2018
@petermattis petermattis removed this from the Later milestone Oct 5, 2018

tbg commented Oct 11, 2018

I wouldn't be surprised if this were fixed now. Don't the decommissioning tests verify something similar?

a-robinson commented:

Scatter no longer scatters replicas, and when we bring that aspect of it back (#26438) it won't add all replicas before removing any. I think it's safe to close this.
