
storage: parallelize processing of Replica.handleRaftReady #7531

Merged

Conversation

@petermattis (Collaborator) commented Jun 29, 2016

Improves the performance of block_writer inserting 1m rows from 250s
to 195s. We default concurrency per store to 2*num-cpu. There wasn't
any noticeable harm or benefit to allowing more concurrency than that,
but it seems prudent to provide some limit.

Fixes #7522.



storage/store.go, line 2193 [r1] (raw file):

panic(err) // TODO(bdarnell)
}
sem.acquire()
go func(r *Replica) {
Member:

If you're avoiding the stopper on purpose, a comment wouldn't hurt as to not make this look like an omission.

petermattis (Collaborator, Author):

Well, we weren't previously checking the stopper in this code, but instead processing all of the replicas. I'm not sure why we would need to check the stopper with this change.

@tbg (Member) commented Jun 29, 2016

LGTM, but need to make sure @bdarnell takes a look when he comes back.

@tamird (Contributor) commented Jun 29, 2016

Reviewed 1 of 1 files at r1.
Review status: all files reviewed at latest revision, 3 unresolved discussions, all commit checks successful.


storage/store.go, line 91 [r1] (raw file):

}

// The amount of concurrency to allow in processing replicas on per-store

is this a useful comment?


storage/store.go, line 114 [r1] (raw file):

func makeSemaphore(n int) semaphore {
  s := make(semaphore, n)
  for i := 0; i < n; i++ {

I don't think it's crucial, but this can be removed if you invert the implementations of acquire and release. By the way, wouldn't this be faster with atomic operations? Asking for a friend.
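
A minimal sketch of the inversion suggested here (illustrative only, not code from this PR): if acquire sends into the buffered channel and release receives from it, the channel's capacity becomes the concurrency limit and makeSemaphore no longer needs the fill loop.

type semaphore chan struct{}

func makeSemaphore(n int) semaphore {
  // No fill loop: the buffer size alone bounds the number of concurrent holders.
  return make(semaphore, n)
}

func (s semaphore) acquire() { s <- struct{}{} } // blocks once n slots are in use
func (s semaphore) release() { <-s }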



@petermattis force-pushed the pmattis/parallel-handle-raft-ready branch from 437185e to ed0937c on June 29, 2016 at 20:28
@petermattis (Collaborator, Author):

Review status: all files reviewed at latest revision, 3 unresolved discussions, all commit checks successful.


storage/store.go, line 91 [r1] (raw file):

Previously, tamird (Tamir Duberstein) wrote…

is this a useful comment?

Without it, wouldn't the reader potentially be confused about what concurrency this variable was controlling?

storage/store.go, line 114 [r1] (raw file):

Previously, tamird (Tamir Duberstein) wrote…

I don't think it's crucial, but this can be removed if you invert the implementations of acquire and release. By the way, wouldn't this be faster with atomic operations? Asking for a friend.

How would you do this with atomic operations?


@tamird (Contributor) commented Jun 29, 2016

Reviewed 1 of 1 files at r2.
Review status: all files reviewed at latest revision, 3 unresolved discussions, some commit checks pending.


storage/store.go, line 91 [r1] (raw file):

Previously, petermattis (Peter Mattis) wrote…

Without it, wouldn't the reader potentially be confused about what concurrency this variable was controlling?

I suppose. I'd prefer to elongate the variable name, but that's just me. Perhaps `storeReplicaRaftReadyConcurrency`.

storage/store.go, line 114 [r1] (raw file):

Previously, petermattis (Peter Mattis) wrote…

How would you do this with atomic operations?

Derp, didn't think this through.


@petermattis force-pushed the pmattis/parallel-handle-raft-ready branch from ed0937c to 5b04a8f on June 29, 2016 at 21:01
@petermattis (Collaborator, Author):

Review status: 0 of 1 files reviewed at latest revision, 2 unresolved discussions, some commit checks pending.


storage/store.go, line 91 [r1] (raw file):

Previously, tamird (Tamir Duberstein) wrote…

I suppose. I'd prefer to elongate the variable name, but that's just me. Perhaps `storeReplicaRaftReadyConcurrency`.

Given the infrequent usage, sounds reasonable. Done.


@tamird (Contributor) commented Jun 29, 2016

Reviewed 1 of 1 files at r3.
Review status: all files reviewed at latest revision, 1 unresolved discussion, some commit checks pending.



@petermattis merged commit a97d757 into cockroachdb:master on Jun 29, 2016
@petermattis deleted the pmattis/parallel-handle-raft-ready branch on June 29, 2016 at 21:27
@bdarnell (Contributor) commented Jul 1, 2016

:lgtm:

That was easier than I was expecting; I hope we're not missing something. (My more complex plan was to avoid the WaitGroup and give each Replica its own goroutine that wouldn't block the processRaft goroutine. But this looks like it may give us most of the benefit more easily.)


Review status: all files reviewed at latest revision, 2 unresolved discussions, all commit checks successful.


storage/store.go, line 2193 [r1] (raw file):

Previously, petermattis (Peter Mattis) wrote…

Well, we weren't previously checking the stopper in this code, but instead processing all of the replicas. I'm not sure why we would need to check the stopper with this change.

In general we try to avoid bare goroutines, but in this case it's safe not to use the stopper because the WaitGroup ensures they won't be left dangling. The Stopper could still save a bit of code, since RunLimitedAsyncTask already implements the semaphore logic, but using it would add a small amount of overhead.
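
To make that concrete, here is a rough sketch of the pattern under discussion (names and signatures are illustrative assumptions, not the exact code merged in store.go; it assumes sync is imported and handleRaftReady returns an error): the WaitGroup guarantees the loop doesn't return until every goroutine has finished, so the bare goroutines can't be left dangling, while the semaphore bounds how many run at once.

func processRaftReady(replicas []*Replica, sem semaphore) {
  var wg sync.WaitGroup
  for _, r := range replicas {
    sem.acquire() // bound per-store concurrency
    wg.Add(1)
    go func(r *Replica) { // bare goroutine: safe because wg.Wait() below joins it
      defer wg.Done()
      defer sem.release()
      if err := r.handleRaftReady(); err != nil {
        panic(err) // TODO(bdarnell): mirrors the error handling in the quoted diff
      }
    }(r)
  }
  wg.Wait() // nothing outlives this call, so skipping the stopper is safe here
}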

storage/store.go, line 91 [r3] (raw file):

}

var storeReplicaRaftReadyConcurrency = 2 * runtime.NumCPU()

The theory behind the single thread per store was that we're ultimately I/O bound and that more threads would just compete for the same I/O resources even if they're able to make use of more CPUs. The optimal limit here may be lower rather than higher.



@petermattis (Collaborator, Author):

Review status: all files reviewed at latest revision, 2 unresolved discussions, all commit checks successful.


storage/store.go, line 2193 [r1] (raw file):

Previously, bdarnell (Ben Darnell) wrote…

In general we try to avoid bare goroutines, but in this case it's safe not to use the stopper because the WaitGroup ensures they won't be left dangling. The Stopper could still save a bit of code, since RunLimitedAsyncTask already implements the semaphore logic, but using it would add a small amount of overhead.

The `WaitGroup` seemed simple enough that I'd prefer not to change it until there is a need.

storage/store.go, line 91 [r3] (raw file):

Previously, bdarnell (Ben Darnell) wrote…

The theory behind the single thread per store was that we're ultimately I/O bound and that more threads would just compete for the same I/O resources even if they're able to make use of more CPUs. The optimal limit here may be lower rather than higher.

Yes, it is true that more threads would just compete for the same I/O resources. Yet we're often hitting the RocksDB cache, which doesn't have the same limits. On flash, the ops/sec is very high. And for writes it is beneficial to perform multiple writes concurrently so that the WAL syncs can be batched. That said, it is very likely that this knob should be tuned. I'll file an issue so that we don't forget about it.

