Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

storage/raft_transport: use a separate queue for each message type #7299

Merged
merged 1 commit into from
Jun 17, 2016
Merged

storage/raft_transport: use a separate queue for each message type #7299

merged 1 commit into from
Jun 17, 2016

Conversation

tamird
Copy link
Contributor

@tamird tamird commented Jun 16, 2016

@cuongdo can you try this to see if it mitigates #7258? If it does, I'll clean the code up some before merging.


This change is Reviewable

@bdarnell
Copy link
Contributor

:lgtm:


Review status: 0 of 1 files reviewed at latest revision, 1 unresolved discussion, some commit checks pending.


storage/raft_transport.go, line 218 [r1] (raw file):

  t.rpcContext.Stopper.RunTask(func() {
      t.rpcContext.Stopper.RunWorker(func() {
          // NB: only one error will ever be read from this channel. That's fine,

Remove this comment now that it's just a single error.


Comments from Reviewable

@cuongdo
Copy link
Contributor

cuongdo commented Jun 17, 2016

Nice! I see no unmarshal errors on any nodes using a custom build with your changes thus far. Previously, it was trivial to trigger this error with substantial snapshot load.

@tamird
Copy link
Contributor Author

tamird commented Jun 17, 2016

Great! Now if only I could figure out why TestSendAndReceive is hanging...

@tbg
Copy link
Member

tbg commented Jun 17, 2016

What is the change here that makes the error go away? Is code like this the problem?

stream1, _ := client.RaftMessage(ctx)
stream2, _ := client.RaftMessage(ctx)
// use stream{1,2} concurrently

@tamird
Copy link
Contributor Author

tamird commented Jun 17, 2016

Nah, it's that we were using stream2 concurrently - each snapshot was in an async task on the same stream.

@tbg
Copy link
Member

tbg commented Jun 17, 2016

gotcha, thanks.

On Thu, Jun 16, 2016 at 10:54 PM Tamir Duberstein [email protected]
wrote:

Nah, it's that we were using stream2 concurrently - each snapshot was in
an async task on the same stream.


You are receiving this because you commented.

Reply to this email directly, view it on GitHub
#7299 (comment),
or mute the thread
https://github.com/notifications/unsubscribe/AE135Fyr49C-gPRpAVULLZyLjcA6Hvvjks5qMgxqgaJpZM4I33Zd
.

-- Tobias

@tamird
Copy link
Contributor Author

tamird commented Jun 17, 2016

OK, this is ready for another look.

@tamird tamird changed the title storage/raft_transport: use a separate queue for snapshots storage/raft_transport: use a separate queue for each message type Jun 17, 2016
@tamird
Copy link
Contributor Author

tamird commented Jun 17, 2016

Review status: 0 of 2 files reviewed at latest revision, 1 unresolved discussion, some commit checks pending.


storage/raft_transport.go, line 218 [r1] (raw file):

Previously, bdarnell (Ben Darnell) wrote…

Remove this comment now that it's just a single error.

Done.

Comments from Reviewable

@cuongdo
Copy link
Contributor

cuongdo commented Jun 17, 2016

Review status: 0 of 2 files reviewed at latest revision, 3 unresolved discussions, all commit checks successful.


storage/raft_transport.go, line 175 [r2] (raw file):

  ctx, cancel := context.WithCancel(context.TODO())
  defer cancel()

Why are you going back to always calling cancel() when this method returns? What if there are in-flight snapshots when a Raft transport error or idle timeout occurs?


storage/raft_transport.go, line 222 [r2] (raw file):

  if !ok {
      queues = make(map[roachpb.NodeID]chan *RaftMessageRequest)
      t.mu.queues[req.Message.Type] = queues

This assumes that there's never a case where two goroutines are trying to stream the same type of Raft message to the same node. Is that true in practice?


Comments from Reviewable

@tamird
Copy link
Contributor Author

tamird commented Jun 17, 2016

Review status: 0 of 2 files reviewed at latest revision, 3 unresolved discussions, all commit checks successful.


storage/raft_transport.go, line 175 [r2] (raw file):

Previously, cuongdo (Cuong Do) wrote…

Why are you going back to always calling cancel() when this method returns? What if there are in-flight snapshots when a Raft transport error or idle timeout occurs?

Nothing here is asynchronous any more; neither condition can occur while anything is in flight.

storage/raft_transport.go, line 222 [r2] (raw file):

Previously, cuongdo (Cuong Do) wrote…

This assumes that there's never a case where two goroutines are trying to stream the same type of Raft message to the same node. Is that true in practice?

Why does it assume that? We're holding the lock here.

Comments from Reviewable

@petermattis
Copy link
Collaborator

Review status: 0 of 2 files reviewed at latest revision, 4 unresolved discussions, all commit checks successful.


storage/raft_transport.go, line 76 [r2] (raw file):

      sync.Mutex
      handlers map[roachpb.StoreID]raftMessageHandler
      queues   map[raftpb.MessageType]map[roachpb.NodeID]chan *RaftMessageRequest

Previously we were using one queue for MsgSnap and another queue for everything else. Now there is a queue per message type. Is that problematic? I don't know enough about raft. An alternative to this would be to model this as:

map[bool]map[roachpb.NodeID]chan *RaftMessageRequest

And then access it as t.mu.queues[req.Message.Type == raftpb.MsgSnap].


Comments from Reviewable

@tamird
Copy link
Contributor Author

tamird commented Jun 17, 2016

Review status: 0 of 2 files reviewed at latest revision, 4 unresolved discussions, all commit checks successful.


storage/raft_transport.go, line 76 [r2] (raw file):

Previously, petermattis (Peter Mattis) wrote…

Previously we were using one queue for MsgSnap and another queue for everything else. Now there is a queue per message type. Is that problematic? I don't know enough about raft. An alternative to this would be to model this as:

map[bool]map[roachpb.NodeID]chan *RaftMessageRequest

And then access it as t.mu.queues[req.Message.Type == raftpb.MsgSnap].

Yeah, that's an option. There are about 13-14 message types. Do you think it really matters?

Comments from Reviewable

@petermattis
Copy link
Collaborator

Review status: 0 of 2 files reviewed at latest revision, 4 unresolved discussions, all commit checks successful.


storage/raft_transport.go, line 76 [r2] (raw file):

Previously, tamird (Tamir Duberstein) wrote…

Yeah, that's an option. There are about 13-14 message types. Do you think it really matters?

I have no idea if it matters. My suggestion was motivated by trying to be more conservative with this change and only fixing the known problem of accessing the snapshot stream concurrently. One problem with using a queue per message type is that each queue will require goroutines on the reading side. Is that increase a problem? I don't know. Is the differing flow control (which could reorder messages) a problem? I don't know.

@bdarnell?


Comments from Reviewable

@cuongdo
Copy link
Contributor

cuongdo commented Jun 17, 2016

Review status: 0 of 2 files reviewed at latest revision, 3 unresolved discussions, all commit checks successful.


storage/raft_transport.go, line 222 [r2] (raw file):

Previously, tamird (Tamir Duberstein) wrote…

Why does it assume that? We're holding the lock here.

I was worried about the `ch<-req`below being called from different goroutines and streamed asynchronously, but I hadn't realized you'd made `processQueue` process each message synchronously within each queue.

Comments from Reviewable

@tamird
Copy link
Contributor Author

tamird commented Jun 17, 2016

Review status: 0 of 2 files reviewed at latest revision, 3 unresolved discussions.


storage/raft_transport.go, line 76 [r2] (raw file):

Previously, petermattis (Peter Mattis) wrote…

I have no idea if it matters. My suggestion was motivated by trying to be more conservative with this change and only fixing the known problem of accessing the snapshot stream concurrently. One problem with using a queue per message type is that each queue will require goroutines on the reading side. Is that increase a problem? I don't know. Is the differing flow control (which could reorder messages) a problem? I don't know.

@bdarnell?

Changed to the more conservative thing.

Comments from Reviewable

@petermattis
Copy link
Collaborator

:lgtm:


Review status: 0 of 2 files reviewed at latest revision, 3 unresolved discussions, some commit checks failed.


Comments from Reviewable

@cuongdo
Copy link
Contributor

cuongdo commented Jun 17, 2016

:lgtm:


Review status: 0 of 2 files reviewed at latest revision, 2 unresolved discussions, some commit checks failed.


Comments from Reviewable

@bdarnell
Copy link
Contributor

:lgtm:


Review status: 0 of 2 files reviewed at latest revision, 3 unresolved discussions, all commit checks successful.


storage/raft_transport.go, line 76 [r2] (raw file):

Previously, tamird (Tamir Duberstein) wrote…

Changed to the more conservative thing.

Yeah, raft performance suffers if messages get reordered, so it's better to use one queue for all non-snapshot messages. (I'm not sure if there are issues with cross-type reordering but we've definitely seen it with MsgApp getting reordered)

Needs a comment about what true and false mean, and why we're doing this in the first place (all the comments about snapshots not blocking other messages have been lost).


storage/raft_transport_test.go, line 119 [r4] (raw file):

      transports[nodeID] = storage.NewRaftTransport(storage.GossipAddressResolver(g), grpcServer, nodeRPCContext)
      // This channel is nominally unbuffered, but it is also normally serviced

s/nominally/normally/


Comments from Reviewable

@tamird
Copy link
Contributor Author

tamird commented Jun 17, 2016

Review status: 0 of 2 files reviewed at latest revision, 3 unresolved discussions, some commit checks pending.


storage/raft_transport.go, line 76 [r2] (raw file):

Previously, bdarnell (Ben Darnell) wrote…

Yeah, raft performance suffers if messages get reordered, so it's better to use one queue for all non-snapshot messages. (I'm not sure if there are issues with cross-type reordering but we've definitely seen it with MsgApp getting reordered)

Needs a comment about what true and false mean, and why we're doing this in the first place (all the comments about snapshots not blocking other messages have been lost).

Added a comment at the use-site.

storage/raft_transport_test.go, line 119 [r4] (raw file):

Previously, bdarnell (Ben Darnell) wrote…

s/nominally/normally/

Done.

Comments from Reviewable

@tamird tamird merged commit 7bf5d50 into cockroachdb:master Jun 17, 2016
@tamird tamird deleted the raft-transport-two-queues branch June 17, 2016 16:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants