
kvserver: replicate queue should more aggressively upreplicate #79318

Open
erikgrinaker opened this issue Apr 4, 2022 · 6 comments
Labels
- A-kv-replication: Relating to Raft, consensus, and coordination.
- C-enhancement: Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception).
- T-kv: KV Team

Comments

erikgrinaker (Contributor) commented Apr 4, 2022

We've often seen the replicate queue being very slow to process underreplicated ranges in large clusters (up to 1 million ranges).

The replica scanner is responsible for enqueueing replicas into the queues, with a target of 10 minutes per pass and a minimum of 10 milliseconds between each replica, on top of the queue processing time itself. It does not take replica state into account at all.
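
For a rough sense of the pacing, here is a minimal sketch (assuming only the 10-minute target and 10ms floor described above; this is not the actual kvserver scanner code) of how the per-replica interval works out:

package main

import (
	"fmt"
	"time"
)

// paceInterval sketches how a scanner might spread one pass over a target
// duration: divide the target by the replica count, but never drop below a
// minimum idle time between replicas.
func paceInterval(targetPerPass, minIdle time.Duration, replicaCount int) time.Duration {
	if replicaCount == 0 {
		return targetPerPass
	}
	interval := targetPerPass / time.Duration(replicaCount)
	if interval < minIdle {
		interval = minIdle
	}
	return interval
}

func main() {
	// E.g. with 100,000 replicas on a store, the 10ms floor dominates
	// (10m / 100,000 = 6ms), so a full pass takes ~100,000 * 10ms ≈ 16.7
	// minutes before any queue processing time is even counted.
	fmt.Println(paceInterval(10*time.Minute, 10*time.Millisecond, 100_000))
}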

We need to make sure that the replicate queue prioritizes underreplicated ranges and aggressively tries to upreplicate them.

Jira issue: CRDB-14689

@erikgrinaker erikgrinaker added C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. A-kv-replication Relating to Raft, consensus, and coordination. C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) T-kv-replication and removed C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. labels Apr 4, 2022
@blathers-crl blathers-crl bot added the T-kv KV Team label Apr 6, 2022
mwang1026 commented:

FYI @lidorcarmel @kvoli

kvoli (Collaborator) commented Apr 25, 2022

I'm adding in a roachperf benchmark for this issue - ref #79940 and #80383.

We currently take priority into account, with the actions associated with up-replication ranked reasonably high:

https://github.com/kvoli/cockroach/blob/4380161366180cbab17c1d4c70e2951f08e179fa/pkg/kv/kvserver/allocator.go#L147-L181
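
For illustration, a simplified sketch of priority-ordered allocator actions; the names and numbers below are made up for this example, not the actual allocator.go constants:

package main

import "fmt"

type allocatorAction int

const (
	actionConsiderRebalance allocatorAction = iota
	actionRemoveReplica
	actionAddReplica         // up-replication
	actionReplaceDeadReplica // restore lost redundancy
)

// priority returns a score used to order replicas within the queue; higher
// values are processed first. Names and numbers are illustrative only.
func (a allocatorAction) priority() float64 {
	switch a {
	case actionReplaceDeadReplica:
		return 12000
	case actionAddReplica:
		return 10000
	case actionRemoveReplica:
		return 3000
	default:
		return 0
	}
}

func main() {
	fmt.Println(actionAddReplica.priority() > actionConsiderRebalance.priority()) // true
}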

This affects the next replica to process in the base queue:

https://github.com/kvoli/cockroach/blob/4380161366180cbab17c1d4c70e2951f08e179fa/pkg/kv/kvserver/queue.go#L1219-L1222
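
And a minimal sketch, with assumed types rather than the actual baseQueue implementation, of how a priority max-heap yields the highest-priority replica first:

package main

import (
	"container/heap"
	"fmt"
)

type queuedReplica struct {
	rangeID  int
	priority float64
}

// priorityQueue is a max-heap over queued replicas, so Pop returns the
// highest-priority entry first.
type priorityQueue []queuedReplica

func (pq priorityQueue) Len() int            { return len(pq) }
func (pq priorityQueue) Less(i, j int) bool  { return pq[i].priority > pq[j].priority }
func (pq priorityQueue) Swap(i, j int)       { pq[i], pq[j] = pq[j], pq[i] }
func (pq *priorityQueue) Push(x interface{}) { *pq = append(*pq, x.(queuedReplica)) }
func (pq *priorityQueue) Pop() interface{} {
	old := *pq
	n := len(old)
	it := old[n-1]
	*pq = old[:n-1]
	return it
}

func main() {
	pq := &priorityQueue{
		{rangeID: 10, priority: 0},     // consider-rebalance
		{rangeID: 20, priority: 10000}, // up-replicate
	}
	heap.Init(pq)
	// The up-replication candidate is processed first, but only among
	// replicas that are already in the queue.
	fmt.Println(heap.Pop(pq).(queuedReplica).rangeID) // 20
}

Note that this only reorders replicas that are already in the queue; it does not change when a replica gets enqueued.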

Do you think the issue is the cadence of the replicate queue? i.e., it's a blocking single consumer, where each action may take a while to actually perform.

Are we processing replicas faster than we're queueing them here? It may be that we are limited by the processing rate during up-replication. I'll look further into the queue length on the benchmark.

lidorcarmel (Contributor) commented:

Related: #79453

erikgrinaker (Contributor, Author) commented:

Thanks for looking into this @kvoli. I'm not very familiar with the details here, but I think the priority only applies to replicas that have already been added to the queue. However, replicas are only added to the queue once per scanner pass (roughly every 10 minutes), either in random order or ordered by range ID:

for _, q := range rs.queues {
	q.MaybeAddAsync(ctx, repl, rs.clock.NowAsClockTimestamp())
}

rs.replicas.Visit(func(repl *Replica) bool {
	count++
	shouldStop = rs.waitAndProcess(ctx, start, repl)
	return !shouldStop
})

So I suppose the priority would only come into play when there is a queue backlog, and then only for the replicas that are in the backlog. So I think it could take up to 10 minutes even with those priorities?

kvoli (Collaborator) commented May 13, 2022

> So I suppose the priority would only come into play when there is a queue backlog, and then only for the replicas that are in the backlog. So I think it could take up to 10 minutes even with those priorities?

That seems right. I believe this issue is also faced in decommissioning - as @lidorcarmel linked above.

Aayush's solution to retry replicas (#81005) seems like a promising direction. Even then, the 10-minute scan interval is still an issue.

Is there any reason why we couldn't "push" the leaseholders of ranges needing up-replication into the queue, given that 10 minutes is a long worst-case tail?

I can see how this pattern might devolve into multiple scanner cadences with different objectives if we don't have an event to trigger enqueueing; however, at the moment the indiscriminate store scanner seems too blunt an instrument for different timeliness requirements. A rough sketch of what such an event-driven hook could look like follows.
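
As a sketch only, under the assumption of a push-style hook; all names here are hypothetical and not existing kvserver API:

package main

import (
	"context"
	"fmt"
)

type replica struct {
	rangeID    int
	liveVoters int
	target     int
}

type replicateQueue struct{ pending []*replica }

// maybeAddAsync is a stand-in for the real queue's MaybeAddAsync: it just
// records the replica for later processing.
func (q *replicateQueue) maybeAddAsync(ctx context.Context, r *replica) {
	q.pending = append(q.pending, r)
}

// onDescriptorChanged is a hypothetical push hook: when a range's descriptor
// changes (e.g. a voter is lost), the leaseholder checks whether it is
// under-replicated and enqueues itself immediately instead of waiting for the
// next scanner pass.
func onDescriptorChanged(ctx context.Context, q *replicateQueue, r *replica) {
	if r.liveVoters < r.target {
		q.maybeAddAsync(ctx, r)
	}
}

func main() {
	q := &replicateQueue{}
	onDescriptorChanged(context.Background(), q, &replica{rangeID: 7, liveVoters: 2, target: 3})
	fmt.Println(len(q.pending)) // 1
}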

The stale bot commented:

We have marked this issue as stale because it has been inactive for 18 months. If this issue is still relevant, removing the stale label or adding a comment will keep it active. Otherwise, we'll close it in 10 days to keep the issue queue tidy. Thank you for your contribution to CockroachDB!
