release-22.2: kvserver: fail stale ConfChange when rejected by raft #106150

tbg · 2023-07-05T10:52:43Z

Backport 3/3 commits from #106104.

/cc @cockroachdb/release

Release justification: fixes a bug that can result in replica unavailability.

Because etcd/raft activates configuration changes when they are applied, but checks new proposed configs before they are considered for adding to the log (or forwarding to the leader), the following can happen:

- conf change 1 gets evaluated on a leaseholder n1
- lease changes
- new leaseholder evaluates and commits conf change 2
- n1 receives and applies conf change 2
- conf change 1 gets added to the proposal buffer and flushed; RawNode rejects
  it because conf change 1 is not compatible on top of conf change 2

Prior to this commit, because raft silently replaces the conf change with an
empty entry, we would never see the proposal below raft (where it would be
rejected due to the lease change). In effect, this caused replica unavailability
because the proposal and the associated latch would stick around forever, and the
replica circuit breaker would trip.

This commit provides a targeted fix: when the proposal buffer flushes a conf
change to raft, we check if it got replaced with an empty entry. If so, we
properly finish the proposal. To be conservative, we signal it with an ambiguous
result: it seems conceivable that the rejection would only occur on a
reproposal, while the original proposal made it into raft before the lease
change, and the local replica is in fact behind on conf changes rather than
ahead (which can happen if it's a follower). The only "customer" here is the
replicate queue (and scatter, etc) so this is acceptable; any choice here would
necessarily be a "hard error" anyway.
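
A minimal, self-contained sketch of the check described above. This is a toy model, not the kvserver code: `toyRaft`, `proposal`, `flushConfChange`, and `errAmbiguous` are hypothetical names, and the "config index" comparison is an assumed stand-in for raft's real compatibility check. It only illustrates the shape of the fix: hand the entry to raft, then look at whether it came back as an empty entry.

```go
package main

import (
	"errors"
	"fmt"
)

// entry is a toy stand-in for a raft log entry (hypothetical type).
type entry struct {
	isConfChange bool
	data         []byte
}

// toyRaft is a hypothetical stand-in for RawNode. It tracks only the
// index of the most recently applied configuration.
type toyRaft struct {
	appliedConf int
}

// step models the behavior described above: a conf change that does not
// fit on top of the applied config is not reported as an error. The entry
// is overwritten in place with an empty entry instead.
func (r *toyRaft) step(ents []entry, baseConf int) error {
	for i := range ents {
		if ents[i].isConfChange && baseConf != r.appliedConf {
			ents[i] = entry{}
		}
	}
	return nil
}

// errAmbiguous is a hypothetical sentinel for the ambiguous result.
var errAmbiguous = errors.New("ambiguous result: conf change dropped by raft")

// proposal is a toy proposal; finished and err record how it was signaled.
type proposal struct {
	baseConf int // config the conf change was evaluated against
	data     []byte
	finished bool
	err      error
}

// flushConfChange hands the proposal to raft and then inspects the entry it
// passed in. An empty entry means raft dropped the conf change, so the
// proposal is finished right away instead of waiting forever.
func flushConfChange(r *toyRaft, p *proposal) {
	ents := []entry{{isConfChange: true, data: p.data}}
	if err := r.step(ents, p.baseConf); err != nil {
		p.finished, p.err = true, err
		return
	}
	if !ents[0].isConfChange && len(ents[0].data) == 0 {
		p.finished, p.err = true, errAmbiguous
	}
}

func main() {
	// n1 evaluated conf change 1 on top of config 1, but has since applied
	// conf change 2.
	r := &toyRaft{appliedConf: 2}
	stale := &proposal{baseConf: 1, data: []byte("conf change 1")}
	flushConfChange(r, stale)
	fmt.Println(stale.finished, stale.err)
}
```

In this sketch a proposal whose `baseConf` matches the applied config is left pending, as it would be while waiting for the entry to commit and apply.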

Epic: CRDB-25287
Release note (bug fix): under rare circumstances, a replication change could get
stuck when proposed near lease/leadership changes (and likely under overload),
and the replica circuit breakers could trip. This problem has been addressed.

Fixes #105797.
Closes #104709.


blathers-crl · 2023-07-05T10:52:46Z

cockroach-teamcity · 2023-07-05T10:52:54Z


tbg · 2023-07-06T13:38:38Z

#106267

tbg added 3 commits July 5, 2023 12:51

kvserver: simplify proposal encoding in tests

5413de5

Replace a code snippet with a drop-in solution that exists now.

kvserver: use Step for conf changes as well

ca10bd8

ProposeConfChange just calls MarshalConfChange followed by Step, and by inlining this in our codebase we gain the ability to figure out if raft rejected our ConfChange. We will need this in a follow-up commit.
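
A rough illustration of why inlining helps, using toy types rather than the real etcd/raft API (`toyNode`, `message`, `proposeConfChange`, and `proposeInline` are all hypothetical). The assumption here is that the rejection shows up as an in-place overwrite of the entry in the message passed to `step`: a wrapper that builds the message internally hides that, while a caller that builds the message itself can inspect it afterwards.

```go
package main

import "fmt"

type entryType int

const (
	entryNormal entryType = iota
	entryConfChange
)

// entry and message are toy stand-ins for raft's entry and message types.
type entry struct {
	typ  entryType
	data []byte
}

type message struct {
	entries []entry
}

// toyNode is a hypothetical stand-in for RawNode.
type toyNode struct {
	rejectConfChanges bool
}

// step replaces a refused conf change with an empty normal entry and
// returns no error. The slice is shared with the caller, so the caller
// can observe the replacement.
func (n *toyNode) step(m message) error {
	for i := range m.entries {
		if m.entries[i].typ == entryConfChange && n.rejectConfChanges {
			m.entries[i] = entry{typ: entryNormal}
		}
	}
	return nil
}

// proposeConfChange is the wrapper shape: it builds the message itself,
// so the caller only ever sees the error value.
func (n *toyNode) proposeConfChange(data []byte) error {
	return n.step(message{entries: []entry{{typ: entryConfChange, data: data}}})
}

// proposeInline is the inlined shape: the caller owns the message and can
// check after step whether the conf change entry survived.
func proposeInline(n *toyNode, data []byte) (rejected bool, err error) {
	m := message{entries: []entry{{typ: entryConfChange, data: data}}}
	if err := n.step(m); err != nil {
		return false, err
	}
	return m.entries[0].typ != entryConfChange, nil
}

func main() {
	n := &toyNode{rejectConfChanges: true}
	fmt.Println(n.proposeConfChange([]byte("cc"))) // wrapper: no sign of the rejection
	fmt.Println(proposeInline(n, []byte("cc")))    // inline: rejection is visible
}
```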

tbg requested a review from a team as a code owner July 5, 2023 10:52

tbg requested a review from erikgrinaker July 5, 2023 11:20

erikgrinaker approved these changes Jul 5, 2023


tbg closed this Jul 6, 2023
