
release-22.2: kvserver: fail stale ConfChange when rejected by raft #106150

Closed
wants to merge 3 commits

Conversation


@tbg tbg commented Jul 5, 2023

Backport 3/3 commits from #106104.

/cc @cockroachdb/release

Release justification: fixes a bug that can result in replica unavailability.


Because etcd/raft activates configuration changes only when they are applied, but validates newly proposed configs before they are appended to the log (or forwarded to the leader), the following can happen:

  • conf change 1 gets evaluated on a leaseholder n1
  • lease changes
  • new leaseholder evaluates and commits conf change 2
  • n1 receives and applies conf change 2
  • conf change 1 gets added to the proposal buffer and flushed; RawNode rejects
    it because conf change 1 is not compatible on top of conf change 2

Prior to this commit, because raft silently replaces the conf change with an
empty entry, we would never see the proposal below raft (where it would be
rejected due to the lease change). In effect, this caused replica unavailability
because the proposal and the associated latch would stick around forever, and the
replica circuit breaker would trip.

This commit provides a targeted fix: when the proposal buffer flushes a conf
change to raft, we check if it got replaced with an empty entry. If so, we
properly finish the proposal. To be conservative, we signal it with an ambiguous
result: it seems conceivable that the rejection would only occur on a
reproposal, while the original proposal made it into raft before the lease
change, and the local replica is in fact behind on conf changes rather than
ahead (which can happen if it's a follower). The only "customer" here is the
replicate queue (and scatter, etc.), so this is acceptable; any choice here would
necessarily be a "hard error" anyway.

Epic: CRDB-25287
Release note (bug fix): under rare circumstances, a replication change could get
stuck when proposed near lease/leadership changes (and likely under overload),
and the replica circuit breakers could trip. This problem has been addressed.

Fixes #105797.
Closes #104709.
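
To make the fix above concrete, here is a minimal Go sketch of the described check. This is not the actual CockroachDB code: the import paths assume the `go.etcd.io/etcd/raft/v3` packages, the package name and `finishAmbiguously` callback are hypothetical stand-ins for the real proposal bookkeeping, and only the shape of the check is meant to match the description above.

```go
package kvserver // illustrative placement, not the real file layout

import (
	"context"
	"errors"

	"go.etcd.io/etcd/raft/v3"
	"go.etcd.io/etcd/raft/v3/raftpb"
)

// proposeConfChangeChecked sketches the fix described above: step a conf
// change into the local RawNode and detect whether raft silently refused it.
func proposeConfChangeChecked(
	ctx context.Context,
	rawNode *raft.RawNode,
	cc raftpb.ConfChangeI,
	finishAmbiguously func(context.Context, error), // hypothetical stand-in
) error {
	typ, data, err := raftpb.MarshalConfChange(cc)
	if err != nil {
		return err
	}
	msg := raftpb.Message{Type: raftpb.MsgProp, Entries: []raftpb.Entry{{Type: typ, Data: data}}}
	if err := rawNode.Step(msg); err != nil {
		return err
	}
	// etcd/raft does not return an error for an incompatible conf change; it
	// replaces the proposed entry with an empty normal entry. Because msg was
	// built here (rather than inside ProposeConfChange), that replacement is
	// visible to the caller.
	if ent := msg.Entries[0]; ent.Type == raftpb.EntryNormal && len(ent.Data) == 0 {
		// Don't leave the proposal and its latches pending forever; finish it
		// with an ambiguous result, since the local replica could be either
		// ahead of or behind the conflicting conf change.
		finishAmbiguously(ctx, errors.New("conf change rejected by raft"))
	}
	return nil
}
```

The key point is that raft signals the rejection by mutating the entry inside the stepped message rather than by returning an error, so the caller only notices if it built (and kept a reference to) the message itself; that is what the commits below arrange.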

tbg added 3 commits July 5, 2023 12:51
Replace a code snippet with a drop-in solution that exists now.
ProposeConfChange just calls MarshalConfChange followed by Step, and
by inlining this in our codebase we gain the ability to figure out if
raft rejected our ConfChange. We will need this in a follow-up commit.
The third commit carries the same message as the PR description above.
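
As the second commit's message notes, (*raft.RawNode).ProposeConfChange is just MarshalConfChange followed by Step. Paraphrased below (not copied verbatim from the etcd/raft source, and using the same raft/raftpb imports as the sketch above), the helper shows why inlining it is what makes the rejection observable: the Entries slice is built and dropped inside the helper, so a caller of ProposeConfChange never sees raft overwrite the entry.

```go
// Paraphrase of what (*raft.RawNode).ProposeConfChange effectively does, per
// the commit message above; not a verbatim copy of the etcd/raft source.
func proposeConfChange(rn *raft.RawNode, cc raftpb.ConfChangeI) error {
	typ, data, err := raftpb.MarshalConfChange(cc)
	if err != nil {
		return err
	}
	// The message (and its Entries slice) is local to this helper, so any
	// in-place replacement raft performs on the entry is invisible to the
	// caller. Inlining this code into the proposal buffer, as the commits do,
	// keeps a handle on the slice and enables the empty-entry check above.
	return rn.Step(raftpb.Message{
		Type:    raftpb.MsgProp,
		Entries: []raftpb.Entry{{Type: typ, Data: data}},
	})
}
```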
@tbg tbg requested a review from a team as a code owner July 5, 2023 10:52

blathers-crl bot commented Jul 5, 2023

Thanks for opening a backport.

Please check the backport criteria before merging:

  • Patches should only be created for serious issues or test-only changes.
  • Patches should not break backwards-compatibility.
  • Patches should change as little code as possible.
  • Patches should not change on-disk formats or node communication protocols.
  • Patches should not add new functionality.
  • Patches must not add, edit, or otherwise modify cluster versions; or add version gates.
If some of the basic criteria cannot be satisfied, ensure that the exceptional criteria below are satisfied.
  • There is a high priority need for the functionality that cannot wait until the next release and is difficult to address in another way.
  • The new functionality is additive-only and only runs for clusters which have specifically “opted in” to it (e.g. by a cluster setting).
  • New code is protected by a conditional check that is trivial to verify and ensures that it only runs for opt-in clusters.
  • The PM and TL on the team that owns the changed code have signed off that the change obeys the above rules.

Add a brief release justification to the body of your PR to justify this backport.

Some other things to consider:

  • What did we do to ensure that a user who doesn’t know or care about this backport has no idea that it happened?
  • Will this work in a cluster of mixed patch versions? Did we test that?
  • If a user upgrades a patch version, uses this feature, and then downgrades, what happens?

@cockroach-teamcity

This change is Reviewable

@tbg tbg requested a review from erikgrinaker July 5, 2023 11:20

tbg commented Jul 6, 2023

#106267

@tbg tbg closed this Jul 6, 2023