-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
raft: forget the leader when stepping down if it's safe #134990
base: master
Are you sure you want to change the base?
Conversation
d1ae323
to
d2fc71e
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @arulajmani and @nvanbenschoten)
pkg/raft/raft.go
line 1274 at r6 (raw file):
r.tick = r.tickElection if lead == None { r.resetLead()
The only downside I see here is that defortify() now wouldn't advance electionElapsed because we are already defortified.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 4 of 4 files at r1, 1 of 1 files at r2, 3 of 3 files at r3, 3 of 3 files at r4, 3 of 3 files at r5, 2 of 2 files at r6, all commit messages.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @arulajmani and @iskettaneh)
pkg/raft/raft.go
line 1257 at r2 (raw file):
// Once this state is stabilized (within and above pkg/raft), we can remove // this special case. r.lead = None
We can remove this manual r.lead
assignment now.
pkg/raft/raft.go
line 1263 at r2 (raw file):
r.reset(term) r.tick = r.tickElection if lead == None {
This second commit is essentially a reversion of b1053d4. Should we just undo that commit?
pkg/raft/raft.go
line 2714 at r5 (raw file):
// and propose that change. Therefore, it should be safe at this point to // forget that we were the leader at this term and step down. r.becomeFollower(r.Term, None)
Should we make the same change in the VoteLost
case of stepCandidate
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @arulajmani, @iskettaneh, and @nvanbenschoten)
pkg/raft/raft.go
line 1263 at r2 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
This second commit is essentially a reversion of b1053d4. Should we just undo that commit?
The second commit kinda assumed that if you explicitly step down with lead==None
, you want to DeFortify (setting LeadEpoch to 0). This helps avoid assertions where we know that it's safe to step down while forgetting the leader (for example due to becoming a learner, or due to CheckQuorum). In those cases, we might be in a situation where r.lead == 0 but r.leadEpoch != 0.
Shall we explicitly call DeFortify() before stepping down in those cases?
d2fc71e
to
d331b91
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @arulajmani and @nvanbenschoten)
pkg/raft/raft.go
line 1257 at r2 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
We can remove this manual
r.lead
assignment now.
I left a TODO to call r.resetLead() inside r.reset() once we verified that it's safe to do so (all places that stepdown to the same term can defortify at that point). Once we do that, we canm remove this manual assignment
pkg/raft/raft.go
line 1263 at r2 (raw file):
Previously, iskettaneh wrote…
The second commit kinda assumed that if you explicitly step down with
lead==None
, you want to DeFortify (setting LeadEpoch to 0). This helps avoid assertions where we know that it's safe to step down while forgetting the leader (for example due to becoming a learner, or due to CheckQuorum). In those cases, we might be in a situation where r.lead == 0 but r.leadEpoch != 0.Shall we explicitly call DeFortify() before stepping down in those cases?
I left a TODO to call r.resetLead() inside r.reset() once we verified that it's safe to do so (all places that stepdown to the same term can defortify at that point). What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @arulajmani and @nvanbenschoten)
pkg/raft/raft.go
line 1257 at r2 (raw file):
Previously, iskettaneh wrote…
I left a TODO to call r.resetLead() inside r.reset() once we verified that it's safe to do so (all places that stepdown to the same term can defortify at that point). Once we do that, we canm remove this manual assignment
Nvm, it's safe to remove it now since I am calling resetLead() if lead==none, and we already know that leadEpoch==0 inside that if-statement. I removed it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Intermediate comments. Flushing them so you can address them while you make the changes we discussed synchronously.
Reviewed 4 of 4 files at r1, 8 of 8 files at r7, 1 of 1 files at r8, 3 of 3 files at r9, 3 of 3 files at r10.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @iskettaneh and @nvanbenschoten)
pkg/raft/raft.go
line 1267 at r1 (raw file):
r.logger.Infof("%x became follower at term %d", r.id, r.Term) // If we should broadcast a DeFortify message, we will do it in the next
This comment doesn't seem great -- it's written in the context of the code that existed previously, not what the code does/will do after this patch merges. I would reword this as:
Start de-fortifying eagerly so we don't have to wait out a full heartbeat timeout before sending the first de-fortification message.
d331b91
to
3bbe72f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @arulajmani and @nvanbenschoten)
pkg/raft/raft.go
line 1267 at r1 (raw file):
Previously, arulajmani (Arul Ajmani) wrote…
This comment doesn't seem great -- it's written in the context of the code that existed previously, not what the code does/will do after this patch merges. I would reword this as:
Start de-fortifying eagerly so we don't have to wait out a full heartbeat timeout before sending the first de-fortification message.
Done.
3bbe72f
to
6f7d54a
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mostly LGTM
Reviewed 16 of 16 files at r21, 4 of 4 files at r22, 1 of 1 files at r23, 2 of 2 files at r24, 4 of 4 files at r25, 3 of 3 files at r26, 2 of 2 files at r27, 1 of 1 files at r28, all commit messages.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @iskettaneh and @nvanbenschoten)
pkg/raft/raft.go
line 1276 at r22 (raw file):
r.reset(term) r.setLead(lead) r.logger.Infof("%x became follower at term %d", r.id, r.Term)
nit: consider adding a r.logger.Debugf("%x reset election elapsed to %d", r.id, r.electionElapsed)
just so we're able to see that the election elapsed is reset back to 0 after the call to de-fortify in the data driven tests?
pkg/raft/raft.go
line 491 at r27 (raw file):
if r.lead == r.id { // If we were the leader, we must have waited out the leaderMaxSupported
nit: leadMaxSupported.
Seperately, could you add words for how we know that we've waited out the leadMaxSupported at this point? Also, worth adding words about why we forget the leader when becoming a follower (even though it is safe, it's unclear to a reader why we're doing so).
pkg/raft/raft.go
line 498 at r27 (raw file):
} else { // If we weren't the leader, we should NOT forget who the leader is to avoid // regressing the leaderMaxSupported.
nit: leadMaxSupported.
Separately, can you add words on how the leadMaxSupported could regress if we were to forget the leader?
Right now, we start bcasting defortificaton messages at heartbeat timeouts. However, there is nothing that prevents us from starting this process even earlier. This commit starts this process when stepping down immediately after becoming a follower. Epic: None Release notes: None
This commit allows us to forget the leader when stepping down. We can't just call `r.resetLead()` inside `r.reset()` because it would defortify in all places where we stepdown to the same term. Once we make it safe in all places to stepdown to the same term and forget the leader, we can call r.resetLead() inside r.reset(). Epic: None Release note: None
In checkQuorum, if we decide to step down, it means that a majority of replicas don't support us. Therefore, it's safe to step down and forget that we were the leader at this term. We can safely campaign and participate in elections. Epic: None Release note: None
Since leadership transfer means that the current leader instructs another replica to campaign without performing safety checks. It should be safe to forget the leader when stepping down. Epic: None Release note: None
When a current leader becomes a learner, it should be safe to step down and forget that we were the leader at the current term because: (1) learners can't campaign or participate in elections, and (2) in order for a learner to become a voter again, a new leader needs to get elected and propose that change, which means that the replica would be able to safely participate in elections then. Epic: None Release note: None
This commit makes us forget that we were the leader at this term after a restart. Epic: None Release notes: None
We already forget the leader when attempting to campaign. This commit doesn't change any current behaviour. Epic: None Release notes: None
6f7d54a
to
1468e03
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 35 of 35 files at r29, 1 of 1 files at r30, 2 of 2 files at r31, 4 of 4 files at r32, 3 of 3 files at r33, 2 of 2 files at r34, 1 of 1 files at r35, all commit messages.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @iskettaneh and @nvanbenschoten)
pkg/raft/raft.go
line 501 at r34 (raw file):
} else { // If we weren't the leader, we should NOT forget who the leader is to avoid // regressing the leadMaxSupported. We can't just forget the leader because
Something about this looks off. I'd say something like:
// If we weren't the leader, we should NOT forget who the leader is to avoid
// regressing the leadMaxSupported. We can't just forget the leader because
// we might have been fortifying a leader before the restart and need to
// uphold our fortification promise after the restart as well. To do so, we need
// to remember the leader and the fortified epoch, otherwise we may vote for
// a different candidate or prematurely call an election, either of which could
// regress the LeadSupportUntil.
First commit is based on #136025
This PR makes the leader forget that it was the leader when stepping down if it's safe to do so. The places this PR changes:
Epic: none
Release note: None