storage: add to perf comment in handleRaftReady #38380

tbg · 2019-06-24T20:35:07Z

I pulled on a thread in etcd/raft today where I was worried that
something was off with our perf improvements around sending MsgApp
before persisting HardState and Entries. Everything ended up being
OK but I updated a comment.

Release note: None

cockroach-teamcity · 2019-06-24T20:35:57Z

This change is

nvanbenschoten

Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @nvanbenschoten and @tbg)

pkg/storage/replica_raft.go, line 653 at r1 (raw file):

	// etcd/raft does not support commit quorums that do not include the leader,
	// even though the Raft thesis states that this would technically be safe:
	// > The leader may even commit an entry before it has been written to its

If we're going to change the indentation for quotes, do it here too.

pkg/storage/replica_raft.go, line 663 at r1 (raw file):

or for which a corresponding Entry is not previously persisted

I'm not sure I understand this case. When would a leader be sending out a commit index above the index of entries which it has not previously persisted? The paragraph before this one presents an argument for why this would never happen. Am I missing something? Perhaps around leadership changes?

Or are you talking more abstractly here and then saying that this isn't actually a problem given the implementation. If so, I'd soften the language around "the committed index may require persisting Entries from the current Ready". That's only true in the single-node Raft group, right? In which case, there are no messages.

pkg/storage/replica_raft.go, line 716 at r1 (raw file):

		// committed index larger than the last index, which will set off
		// alarms. Right now everything is in the same batch and so that problem
		// does not exist, but should that change we can lower Committed before

s/can lower/can manually lower/ to make it clear that this is an action we'd have to take in this code path. Also maybe talk about why it's ok to do so.

Also, wouldn't appending the entries before writing the HardState be another solution to this if we no longer had atomic batch semantics for the writes? Or is that not true given some of the changes you're making at the moment?

I pulled on a thread in etcd/raft today where I was worried that something was off with our perf improvements around sending MsgApp before persisting HardState and Entries. Everything ended up being OK but I updated a comment. Release note: None

tbg

Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @nvanbenschoten)

pkg/storage/replica_raft.go, line 663 at r1 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

or for which a corresponding Entry is not previously persisted
I'm not sure I understand this case. When would a leader be sending out a commit index above the index of entries which it has not previously persisted? The paragraph before this one presents an argument for why this would never happen. Am I missing something? Perhaps around leadership changes?

Or are you talking more abstractly here and then saying that this isn't actually a problem given the implementation. If so, I'd soften the language around "the committed index may require persisting Entries from the current Ready". That's only true in the single-node Raft group, right? In which case, there are no messages.

I think the answers to your questions were all in the comment, but I agree that it was hard to parse. I took a stab at reworking this whole block more holistically, PTAL

pkg/storage/replica_raft.go, line 716 at r1 (raw file):

Also, wouldn't appending the entries before writing the HardState be another solution to this if we no longer had atomic batch semantics for the writes? Or is that not true given some of the changes you're making at the moment?

Yes, see later in the same sentence

(or take care to persist the HardState after appending)

However, when we make the changes for etcd-io/etcd#7625 (comment) we want the opposite: we require that certain appends won't be persisted without updating HardState.Commit first. Even worse, when you're a follower catching up on a whole chunk of historical log (that may contain config changes), you'll get commit indexes pointing at the end of each chunk of entries, and you must persist that together with the chunk (not before because otherwise referring to unknown entries, not after because otherwise violate joint quorum invariant). Hmm, we might have to just lower the committed index to lastIndex if necessary when starting up a group. Anyway, certainly something to keep looking at as perf work happens on the Raft ready loop.

I'm also planning to copy this commentary in this file upstream so that by the time all of these questions become real there's documentation on the allowed concurrency patterns in etcd/raft itself.

nvanbenschoten

Reviewable status: complete! 1 of 0 LGTMs obtained

tbg · 2019-06-25T22:11:28Z

bors r=nvanbenschoten

38380: storage: add to perf comment in handleRaftReady r=nvanbenschoten a=tbg I pulled on a thread in etcd/raft today where I was worried that something was off with our perf improvements around sending MsgApp before persisting HardState and Entries. Everything ended up being OK but I updated a comment. Release note: None Co-authored-by: Tobias Schottdorf <[email protected]>

craig · 2019-06-25T23:20:39Z

Build failed

GitHub CI (Cockroach)

tbg · 2019-06-26T02:57:14Z

In the above flake (unrelated to this PR), acceptance/version-upgrade failed on

cockroach/pkg/cmd/roachtest/upgrade.go

Line 431 in 7e2ceae

t.Fatalf("%d: %s", i, err)

and the harness never manages to quite shut down, eventually hitting the 10 minute timeout and then getting stuck until teamcity comes in for a mercy killing (artifacts lost).

@andreimatei are you touching this at all in #30977? This seems to be a bug in roachtest (on top of a bug in the test/crdb).

[  11] acceptance/version-upgrade: ??? (0s)
22:40:37 test.go:808: test failure: 	upgrade.go:429,retry.go:184,upgrade.go:423,upgrade.go:568,acceptance.go:90,test.go:1249: 2: dial tcp 127.0.0.1:26259: connect: connection refused
22:40:37 cluster.go:344: > /go/src/github.com/cockroachdb/cockroach/bin/roachprod monitor local --oneshot --ignore-empty-nodes
22:40:38 test.go:808: test failure: 	cluster.go:1033,context.go:122,cluster.go:1022,panic.go:406,test.go:783,test.go:773,upgrade.go:429,retry.go:184,upgrade.go:423,upgrade.go:568,acceptance.go:90,test.go:1249: dead node detection: /go/src/github.com/cockroachdb/cockroach/bin/roachprod monitor local --oneshot --ignore-empty-nodes: exit status 1 4: skipped
		2: dead
		1: 23187
		3: 23306
		Error:  2: dead
22:40:38 cluster.go:1102: running (fast) consistency checks on node 1
[  12] acceptance/version-upgrade: ??? (0s)
[  13] acceptance/version-upgrade: ??? (0s)
[  14] acceptance/version-upgrade: ??? (0s)
[  15] acceptance/version-upgrade: ??? (0s)
[  16] acceptance/version-upgrade: ??? (0s)
[  17] acceptance/version-upgrade: ??? (0s)
[  18] acceptance/version-upgrade: ??? (0s)
[  19] acceptance/version-upgrade: ??? (0s)
22:48:55 test.go:808: test failure: 	test.go:1235: test timed out (10m0s)
22:48:55 cluster.go:992: fetching debug zip
22:48:55 cluster.go:344: > /go/src/github.com/cockroachdb/cockroach/bin/roachprod ssh local:1 -- ./cockroach debug zip --url {pgurl:1} debug.zip
22:48:56 cluster.go:270: > /go/src/github.com/cockroachdb/cockroach/bin/roachprod get local:1 debug.zip artifacts/acceptance/version-upgrade/debug.zip
local: getting debug.zip artifacts/acceptance/version-upgrade/debug.zip

tbg · 2019-06-26T02:59:06Z

Filed #38428

bors r=nvanbenschoten

38380: storage: add to perf comment in handleRaftReady r=nvanbenschoten a=tbg I pulled on a thread in etcd/raft today where I was worried that something was off with our perf improvements around sending MsgApp before persisting HardState and Entries. Everything ended up being OK but I updated a comment. Release note: None Co-authored-by: Tobias Schottdorf <[email protected]>

craig · 2019-06-26T03:27:54Z

Build succeeded

GitHub CI (Cockroach)

andreimatei · 2019-06-26T23:35:46Z

@andreimatei are you touching this at all in #30977? This seems to be a bug in roachtest (on top of a bug in the test/crdb).

Hopefully... But I don't entirely follow. Do you understand what's going on?

tbg · 2019-07-12T08:15:10Z

No, let's see if it happens again.

tbg requested review from a team and nvanbenschoten June 24, 2019 20:35

nvanbenschoten reviewed Jun 24, 2019

View reviewed changes

tbg force-pushed the fix/handle-ready-comment branch from 73a5555 to 816bcc3 Compare June 25, 2019 08:07

tbg force-pushed the fix/handle-ready-comment branch from 816bcc3 to 0635f83 Compare June 25, 2019 08:08

tbg commented Jun 25, 2019

View reviewed changes

tbg mentioned this pull request Jun 25, 2019

raft: improve the availability related to member change etcd-io/etcd#7625

Closed

nvanbenschoten approved these changes Jun 25, 2019

View reviewed changes

tbg mentioned this pull request Jun 26, 2019

roachtest: acceptance/version-upgrade failed #38428

Closed

craig bot merged commit 0635f83 into cockroachdb:master Jun 26, 2019

tbg deleted the fix/handle-ready-comment branch July 12, 2019 08:15

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

storage: add to perf comment in handleRaftReady #38380

storage: add to perf comment in handleRaftReady #38380

tbg commented Jun 24, 2019

cockroach-teamcity commented Jun 24, 2019

nvanbenschoten left a comment

tbg left a comment

nvanbenschoten left a comment

tbg commented Jun 25, 2019

craig bot commented Jun 25, 2019

tbg commented Jun 26, 2019

tbg commented Jun 26, 2019

craig bot commented Jun 26, 2019

andreimatei commented Jun 26, 2019

tbg commented Jul 12, 2019

storage: add to perf comment in handleRaftReady #38380

storage: add to perf comment in handleRaftReady #38380

Conversation

tbg commented Jun 24, 2019

cockroach-teamcity commented Jun 24, 2019

nvanbenschoten left a comment

Choose a reason for hiding this comment

tbg left a comment

Choose a reason for hiding this comment

nvanbenschoten left a comment

Choose a reason for hiding this comment

tbg commented Jun 25, 2019

craig bot commented Jun 25, 2019

Build failed

tbg commented Jun 26, 2019

tbg commented Jun 26, 2019

craig bot commented Jun 26, 2019

Build succeeded

andreimatei commented Jun 26, 2019

tbg commented Jul 12, 2019