Failing to set the votedFor variable for the Candidate/Leader may lead to multiple leaders in the same term. #44
Comments
In the series of events you imagine, after node 1 becomes the leader, it will soon add and replicate a special log entry for its term (i.e. 1) according to the Raft protocol, so node 3 cannot win votes from node 1 or node 2. If you mean that node 1 crashes immediately after receiving the vote from node 2, then node 2 either knows or does not know that node 1 is the leader. Since the AppendEntries message serves both log replication and leader announcement, if node 2 knows, it must already hold the special log entry for term 1. To sum up, node 3 won't become leader unless node 1 fails to add the special log entry during the election.
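To make the argument concrete, this is the standard Raft log up-to-date check in sketch form (simplified, not the actual xraft code): once node 1 and node 2 hold the no-op entry of term 1, node 3's shorter log fails it.

```java
// A voter grants its vote only if the candidate's log is at least as
// up-to-date as its own (the election restriction of Raft §5.4.1,
// sketched here in simplified form).
static boolean candidateLogUpToDate(int candidateLastLogTerm, int candidateLastLogIndex,
                                    int voterLastLogTerm, int voterLastLogIndex) {
    if (candidateLastLogTerm != voterLastLogTerm) {
        return candidateLastLogTerm > voterLastLogTerm;
    }
    return candidateLastLogIndex >= voterLastLogIndex;
}
```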
Thank you for your reply. Yes, the no-op log entry can prevent safety issues in a Raft system. However, there could be a time interval between when Node 1 becomes the leader and when the other nodes receive the no-op log, which can lead to the scenario described in this issue. Let's take a look at the relevant code:

```java
changeToRole(new LeaderNodeRole(role.getTerm(), scheduleLogReplicationTask()));
context.log().appendEntry(role.getTerm()); // no-op log
context.connector().resetChannels(); // close all inbound channels
```

In the first line, Node 1 sets its role to Leader (at which point it indeed becomes the leader) and schedules a log replication task. In the second line, Node 1 adds the no-op log entry to its log sequence. In the third line, Node 1 closes all inbound channels. Now, there are two cases, depending on whether the addition of the no-op log finishes before the first log replication task runs. From a straightforward perspective, maybe we can fix this issue by simply changing the order of the first and second lines in the code above. However, I recommend persisting the votedFor value instead.

It is true that this may not be a very important issue, since there is no log inconsistency involved. However, consider a user who first sees Node 1 become the leader in term 1 and then sees Node 3 become the leader in term 1. They may be confused by this system behavior. Best wishes.
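P.S. For concreteness, the reordering idea above would look like this (an illustrative sketch only, not a tested patch):

```java
// Append the no-op entry before changing to the Leader role, so the
// scheduled replication task can never observe a log without it.
context.log().appendEntry(role.getTerm()); // no-op log
changeToRole(new LeaderNodeRole(role.getTerm(), scheduleLogReplicationTask()));
context.connector().resetChannels(); // close all inbound channels
```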
XRaft runs in a single-threaded architecture, which means the logs being replicated to other nodes always contain the no-op log entry regardless of the order of those lines of code. Also, the in-memory log is for demo purposes only; it does not meet Raft's persistence requirements, so I don't think we should discuss cases based on it. The nature of your question is whether an unsaved implicit vote in a candidate (or leader; in either case, the vote is for itself) could cause issues in the election, e.g. multiple leaders, unstable elections, etc. I don't have time to make a proof; perhaps you can make one, or just request a change to save the self-vote. I don't see any problem in doing that.
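Such a change could look roughly like the sketch below; the method and store names here are assumptions for illustration, not the actual xraft API:

```java
// Sketch: persist the implicit self-vote when the election timeout fires.
void onElectionTimeout() {
    int newTerm = role.getTerm() + 1;
    NodeStore store = context.store();
    store.setTerm(newTerm);
    store.setVotedFor(context.selfId()); // save the vote for itself
    changeToRole(new CandidateNodeRole(newTerm, scheduleElectionTimeout()));
    // ... then send RequestVote to the other nodes
}
```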
Failing to set the votedFor variable for the Candidate/Leader may lead to multiple leaders in the same term.
In the original Raft protocol, when a node times out, it sets `votedFor` to `Nil`. After that, it sends a `RequestVote` to all nodes in the cluster (including itself). When this node receives its own `RequestVote` request, it sets `votedFor` to itself. Once the node receives votes from a majority of nodes, it updates its state to `Leader`.
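To illustrate the rule, here is a simplified sketch (the log up-to-date check is omitted, and these names are illustrative rather than taken from any implementation):

```java
// The votedFor rule of standard Raft, in simplified form.
class VoteState {
    int currentTerm;
    String votedFor; // null means "vote still available for this term"

    boolean handleRequestVote(int candidateTerm, String candidateId) {
        if (candidateTerm < currentTerm) {
            return false; // stale candidate
        }
        if (candidateTerm > currentTerm) {
            currentTerm = candidateTerm;
            votedFor = null; // entering a new term frees the vote
        }
        if (votedFor == null || votedFor.equals(candidateId)) {
            votedFor = candidateId; // must be persisted before replying
            return true;
        }
        return false; // already voted for another candidate this term
    }
}
```

The key point is that `votedFor` must be persisted before the vote reply is sent, so that a restart cannot release the vote.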
In XRaft, however, when a node times out, it transitions to a Candidate. At this point, it sets its own `votesCount` to 1 and no longer sends a `RequestVote` request to itself.

However, there is an issue in the current XRaft implementation. Neither the Leader class nor the Candidate class in XRaft includes a `votedFor` field, so the node that times out does not record itself as the `votedFor` value. Therefore, even though it persists the `votedFor` value to disk when transitioning to the Candidate or Leader state, the persisted value is in fact `Null`. If a node restarts after becoming a Leader, it may vote for other nodes again, because it did not retrieve any `votedFor` information from the disk.
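As a hypothetical reconstruction of the persistence step described above (based on my reading of the code, not the actual xraft source), the role transition might look like this:

```java
// Hypothetical reconstruction: the role transition persists the term,
// but only a Follower carries a votedFor value, so a freshly elected
// Candidate/Leader never records its implicit vote for itself.
private void changeToRole(AbstractNodeRole newRole) {
    NodeStore store = context.store();
    store.setTerm(newRole.getTerm());
    if (newRole instanceof FollowerNodeRole) {
        store.setVotedFor(((FollowerNodeRole) newRole).getVotedFor());
    }
    this.role = newRole;
}
```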
Consider the following fault scenario in a 3-node cluster, where initially all nodes have `term = 0`:

1. Node 1 times out and becomes a Candidate with `term = 1` and `votesCount = 1`.
2. Node 2 receives Node 1's `RequestVote`, updates its `term` to 1, and votes for Node 1. Upon receiving Node 2's vote, Node 1 becomes the Leader.
3. Node 1 persists its `term` and `votedFor` via the `changeToRole` method. However, because neither the Candidate class nor the Leader class modifies `votedFor`, the persisted value of `votedFor` remains `Null`.
4. Node 1 crashes and restarts, reading `votedFor` as the persisted `Null`.
5. Node 3 times out and becomes a Candidate with `term = 1` and `votesCount = 1`.
6. Node 1 receives Node 3's `RequestVote`; since its `votedFor == Null`, it votes for Node 3, making Node 3 the Leader for term 1.

This scenario results in both Node 1 and Node 3 acting as Leaders in the same term, which does not comply with Raft's protocol.
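If this is confirmed, the fix would persist the self-vote during the Candidate/Leader transition so that the restart path restores it. A sketch with assumed names:

```java
// Sketch (names assumed, not the actual xraft API): on startup the node
// restores term and votedFor from the store, so a restarted Node 1 comes
// back with votedFor = Node 1 for term 1 instead of Null.
void start() {
    NodeStore store = context.store();
    int term = store.getTerm();
    NodeId votedFor = store.getVotedFor();
    changeToRole(new FollowerNodeRole(term, votedFor, null, scheduleElectionTimeout()));
}
```

With `votedFor` restored, step 6 above fails: Node 1 rejects Node 3's `RequestVote` for term 1, and the second leader never appears.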
Please check if this is indeed a bug, and if so, I would be happy to submit a PR to fix it.