Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

kvserver: maybe forget leader on (Pre)Vote requests #105132

Merged
merged 1 commit into from
Jun 27, 2023

Conversation

erikgrinaker
Copy link
Contributor

@erikgrinaker erikgrinaker commented Jun 19, 2023

This patch will forget the Raft leader when receiving a (Pre)Vote request and finding the leader to be dead (according to liveness) or removed. This allows a quorum of replicas to campaign despite the PreVote+CheckQuorum condition if they independently consider the leader to be dead.

A more targeted alternative was explored, where we forget the leader only when unquiescing to a dead leader. However, this had a number of drawbacks. It would have to use TransferLeader to steal leadership away from a Raft leader who can't heartbeat liveness during lease acquisitions, but this could cause unavailability due to races where multiple replicas request leader transfers and some are not up-to-date on the log. It would also need mixed-version logic since 23.1 nodes could otherwise be unable to obtain necessary prevotes.

We instead choose to be lenient for now, and we can consider tightening the conditions later when we're more confident in the PreVote+CheckQuorum handling and may no longer need epoch leases.

Touches #92088.

Epic: none
Release note: None

@erikgrinaker erikgrinaker requested review from pav-kv and tbg June 19, 2023 08:40
@erikgrinaker erikgrinaker self-assigned this Jun 19, 2023
@erikgrinaker erikgrinaker requested a review from a team June 19, 2023 08:40
@erikgrinaker erikgrinaker requested review from a team as code owners June 19, 2023 08:40
@cockroach-teamcity
Copy link
Member

This change is Reviewable

@erikgrinaker erikgrinaker force-pushed the raft-forget-leader-vote-reqs branch 2 times, most recently from c70e049 to 9d0c7d8 Compare June 26, 2023 08:45
craig bot pushed a commit that referenced this pull request Jun 27, 2023
105126: kvserver: don't campaign when unquiescing for a Raft message r=erikgrinaker a=erikgrinaker

We're going to generalize this to all (pre)votes, see #105132. However, I'm merging this as-is to keep a record of the alternative, since we may want to change to this approach later, when e.g. 23.1 compatibility and epoch leases is less of a concern.

---

**DEPS: upgrade Raft to a10cd45**

Adds `ForgetLeader()`, which we need for CheckQuorum handling.

```
a10cd45 2023-06-26: Merge pull request 78 from erikgrinaker/forget-leader [Benjamin Wang]
1159466 2023-06-16: add `ForgetLeader` [Erik Grinaker]
09ea4c5 2023-06-16: rafttest: show term and leader for `raft-state` [Erik Grinaker]
26ce926 2023-06-20: rafttest: add `log-level` argument for `stabilize` [Erik Grinaker]
30e2fa4 2023-06-12: Merge pull request 70 from erikgrinaker/prevote-checkquorum-datadriven [Benjamin Wang]
a042ce3 2023-06-09: Merge pull request 75 from charles-chenzz/update-go-patch [Benjamin Wang]
dd2340f 2023-06-08: update go to patch release 1.19.10 [charles-chenzz]
27dd2c2 2023-06-02: add data-driven tests for PreVote and CheckQuorum [Erik Grinaker]
```

**kvserver: tweak updateLivenessMap comment**

**kvserver: add `Replica.forgetLeaderLocked()`**

**kvserver: don't campaign when unquiescing for a Raft message**

This patch forgets the leader when unquiescing in response to a Raft message and finding a dead leader (according to liveness). We don't campaign, because that could result in election ties, but forgetting the leader allows us to grant (pre)votes even though we've heard from the leader recently, avoiding having to wait out an election timeout.

Touches #92088.

Epic: none
Release note: None


Co-authored-by: Erik Grinaker <[email protected]>
@erikgrinaker erikgrinaker force-pushed the raft-forget-leader-vote-reqs branch from 9d0c7d8 to 7bb8852 Compare June 27, 2023 09:27
This patch will forget the Raft leader when receiving a (Pre)Vote
request and finding the leader to be dead (according to liveness) or
removed. This allows a quorum of replicas to campaign despite the
PreVote+CheckQuorum condition if they independently consider the leader
to be dead.

A more targeted alternative was explored, where we forget the leader
only when unquiescing to a dead leader. However, this had a number of
drawbacks. It would have to use `TransferLeader` to steal leadership
away from a Raft leader who can't heartbeat liveness during lease
acquisitions, but this could cause unavailability due to races where
multiple replicas request leader transfers and some are not up-to-date
on the log. It would also need mixed-version logic since 23.1 nodes
could otherwise be unable to obtain necessary prevotes.

We instead choose to be lenient for now, and we can consider tightening
the conditions later when we're more confident in the
PreVote+CheckQuorum handling and may no longer need epoch leases.

Epic: none
Release note: None
@erikgrinaker
Copy link
Contributor Author

bors r+

@erikgrinaker erikgrinaker force-pushed the raft-forget-leader-vote-reqs branch from 7bb8852 to ec11960 Compare June 27, 2023 10:13
@craig
Copy link
Contributor

craig bot commented Jun 27, 2023

Canceled.

@erikgrinaker
Copy link
Contributor Author

bors r+

@craig
Copy link
Contributor

craig bot commented Jun 27, 2023

Build succeeded:

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants