-
Notifications
You must be signed in to change notification settings - Fork 9.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
raft: allow voter to become learner through snapshot #10864
Conversation
Codecov Report
@@ Coverage Diff @@
## master #10864 +/- ##
==========================================
+ Coverage 62.82% 63.06% +0.23%
==========================================
Files 398 398
Lines 37446 37441 -5
==========================================
+ Hits 23526 23611 +85
+ Misses 12339 12252 -87
+ Partials 1581 1578 -3
Continue to review full report at Codecov.
|
One concern that I could see is that turning the leader from a voter into a learner has to be handled appropriately. This is similar to removing the leader completely, except it's worse if we get it wrong. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This makes sense to me, but I'd like to hear from @xiang90 or @siddontang why they recommended testing for this in the original PR.
We just do not have this use case (demote) at that time... I think it is a safe operation otherwise. /cc @siddontang any thoughts? |
In case this wasn't clear, I believe that this change fixes a real bug when using learners -- some peers may end up never being able to accept snapshots again. I'm going to go ahead and merge this (pretty confident it's correct at this point) but I'd be happy to hear from @siddontang just on the off chance that I'm missing something. |
At the time of writing, we don't allow configuration changes to change voters to learners directly, but note that a snapshot may compress multiple changes to the configuration into one: the voter could have been removed, then readded as a learner and the snapshot reflects both changes. In that case, a voter receives a snapshot telling it that it is now a learner. In fact, the node has to accept that snapshot, or it is permanently cut off from the Raft log. I think this just wasn't realized in the original work, but this is just my guess since there generally is very little rationale on the various decisions made. I also generally haven't been able to figure out whether the decision to prevent voters from becoming learners without first having been removed was motivated by some particular concern, or if it just wasn't deemed necessary. I suspect it is the latter because demoting a voter seems perfectly safe. See etcd-io#8751 (comment).
We have an assumption that when a node is removed from the group, it will never be added to the group again. If a peer is readded on the same node, it will be assigned a different peer ID. I have not thought through it yet whether a removed peer can be readded without a new ID but it certainly needs to pay attention to campaign because a removed peer can vote now. |
No, votes from removed peers are not taken into account. You can see this here, we iterate over the config to fish the votes. If there are any votes from nodes not in the active config, they are ignored. Generally all of the new quorum code is taking the configuration into account properly.
That's an assumption that's reasonable to make in an app (CockroachDB works the same way, we use tombstones to avoid accidentally recreating a replica that has had its state deleted already) but it's not one intrinsic to raft. For example, it can make a lot of sense to promote/demote voters repeatedly and it could lead to these snapshots. |
To pick up etcd-io/etcd#10864. Release note: None
To pick up etcd-io/etcd#10864. Release note: None
To pick up etcd-io/etcd#10864. This commit also resolves the following, which have happened since the last version of etcd/raft: - The raft.ProgressState enum and its variants now live in raft/tracker. - The Paused field is no longer present on raft.Progress. - RawNode.Status no longer returns a pointer. - Some of the fields on raft.Status are now on an embedded raft.BasicStatus struct. Release note: None
38904: parser: Add user defined column families to CTAS. r=adityamaru27 a=adityamaru27 This change introduces grammar to support user defined column families in a CREATE TABLE ... AS query. 38913: deps: bump go.etcd.io/etcd r=tbg a=danhhz To pick up etcd-io/etcd#10864. Release note: None 38972: exec: fix calculation of selectivity stat when num batches is zero r=yuzefovich a=yuzefovich Release note: None Co-authored-by: Aditya Maru <[email protected]> Co-authored-by: Tobias Schottdorf <[email protected]> Co-authored-by: Daniel Harrison <[email protected]> Co-authored-by: Yahor Yuzefovich <[email protected]>
At the time of writing, we don't allow configuration changes to change
voters to learners directly, but note that a snapshot may compress
multiple changes to the configuration into one: the voter could have
been removed, then readded as a learner and the snapshot reflects both
changes. In that case, a voter receives a snapshot telling it that it is
now a learner. In fact, the node has to accept that snapshot, or it is
permanently cut off from the Raft log.
I think this just wasn't realized in the original work, but this is just
my guess since there generally is very little rationale on the various
decisions made. I also generally haven't been able to figure out whether
the decision to prevent voters from becoming learners without first
having been removed was motivated by some particular concern, or if it
just wasn't deemed necessary. I suspect it is the latter because
demoting a voter seems perfectly safe.
See #8751 (comment).
cc @xiang90