Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rework ID usage in multiraft #2675

Merged
merged 6 commits into from
Sep 29, 2015
Merged

Conversation

bdarnell
Copy link
Contributor

See docs/RFCS/replica_tombstone.md for motivation and background.

Raft requires that node IDs never be reused; coalesced heartbeats
require that node IDs are consistent across all groups. We must
therefore use different IDs for the two concepts. The never-reused raft
node ID is known as a ReplicaID in cockroach; our node ID and store ID
are used in multiraft for message routing but never make it down to the
raft level.

The mapping from (GroupID, ReplicaID) to (NodeID, StoreID) never changes
once set, so multiraft maintains a cache and there is no need to worry
about invalidations. This cache is populated by three sources: the range
descriptors, incoming snapshots (both loaded from the application via
new methods in the multiraft.Storage interface), and the to/from
addresses of all non-heartbeat raft messages.

The important changes are in the first commit; the other two are just renamings.

@tamird tamird mentioned this pull request Sep 27, 2015
@bdarnell bdarnell force-pushed the use-replica-id branch 2 times, most recently from 34638e7 to 9ef2c84 Compare September 28, 2015 02:57
@tamird
Copy link
Contributor

tamird commented Sep 28, 2015

One general comment: is there a test that demonstrates what this change enables? if not, is it feasible to add one?

@bdarnell
Copy link
Contributor Author

This PR is supposed to be a functionally-equivalent refactoring; the testable change will be in the next PR (which will introduce tombstones as described in the RFC and we'll be able to test various scenarios around replica removal and resurrection).

@tamird
Copy link
Contributor

tamird commented Sep 28, 2015

OK, LGTM

@tbg
Copy link
Member

tbg commented Sep 28, 2015

LGTM. Some tests where NodeID != StoreID != ReplicaID would be nice, but that's ok in the next PR.

@bdarnell
Copy link
Contributor Author

FYI our existing tests in client_raft_test.go do create situations where ReplicaID is different from the node and store IDs.

@mrtracy
Copy link
Contributor

mrtracy commented Sep 28, 2015

LGTM

@bdarnell
Copy link
Contributor Author

This change appears to have altered the timing enough that it has exposed flakiness in some tests. I'm running the tests repeatedly to try and flush these out before merging; my apologies to anyone else in the CI queue today.

See docs/RFCS/replica_tombstone.md for motivation and background.

Raft requires that node IDs never be reused; coalesced heartbeats
require that node IDs are consistent across all groups. We must
therefore use different IDs for the two concepts. The never-reused raft
node ID is known as a ReplicaID in cockroach; our node ID and store ID
are used in multiraft for message routing but never make it down to the
raft level.

The mapping from (GroupID, ReplicaID) to (NodeID, StoreID) never changes
once set, so multiraft maintains a cache and there is no need to worry
about invalidations. This cache is populated by three sources: the range
descriptors, incoming snapshots (both loaded from the application via
new methods in the multiraft.Storage interface), and the to/from
addresses of all non-heartbeat raft messages.
The renaming of storage.Range to storage.Replica made the proto.Replica
name confusing.
Also change a couple of printfs that `go vet` is suddenly unhappy with
(it's not recognizing the `String()` implementations on these specific
variables, even though it works for other vars of the same types).
bdarnell added a commit that referenced this pull request Sep 29, 2015
@bdarnell bdarnell merged commit fd5b64a into cockroachdb:master Sep 29, 2015
@bdarnell bdarnell deleted the use-replica-id branch September 29, 2015 02:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants