[BACKPORT 2.15.0][#13042] docdb: fixed clearing pending config for aborted CONFIG_CHANGE_OP

Summary:
1. Before the KUDU-1330 port (9305a20) we had a sanity check inside `ReplicaState::SetCommittedConfigUnlocked` that the pending config is set:
```
Status ReplicaState::SetCommittedConfigUnlocked(const RaftConfigPB& committed_config) {
  TRACE_EVENT0("consensus", "ReplicaState::SetCommittedConfigUnlocked");
  DCHECK(IsLocked());
  DCHECK(committed_config.IsInitialized());
  RETURN_NOT_OK_PREPEND(VerifyRaftConfig(committed_config, COMMITTED_QUORUM),
                        "Invalid config to set as committed");

  // Compare committed with pending configuration, ensure they are the same.
  // Pending will not have an opid_index, so ignore that field.
  DCHECK(cmeta_->has_pending_config());
  // ^^^^^^^
```
The mentioned commit removed this sanity check.

2. Before d26017d some Raft operation callback invocations were missing during abort. That commit added a sanity check that the Raft operation callback is called exactly once, and also added the missing callback invocations.

3. A very old fix, 2051e04, implemented clearing of the pending config when aborting a CHANGE_CONFIG_OP:
```
void RaftConsensus::NonTrackedRoundReplicationFinished(ConsensusRound* round,
                                                       const StdStatusCallback& client_cb,
                                                       const Status& status) {
  ...
  if (!status.ok()) {
    // TODO: Do something with the status on failure?
    LOG_WITH_PREFIX(INFO) << op_str << " replication failed: " << status << "\n" << GetStackTrace();

    // Clear out the pending state (ENG-590).
    if (IsChangeConfigOperation(op_type)) {
      WARN_NOT_OK(state_->ClearPendingConfigUnlocked(), "Could not clear pending state");
      // ^^^^^^^
    }
  } else if (IsChangeConfigOperation(op_type)) {
    // Notify the TabletPeer owner object.
    state_->context()->ChangeConfigReplicated(state_->GetCommittedConfigUnlocked());
  }
  ...
}
```

If the follower receives two consecutive CHANGE_CONFIG_OPs (for example, this happens during `RemoteBootstrapITest.TestLeaderCrashesBeforeChangeRoleKeyValueTableType`), `RaftConsensus::StartReplicaOperationUnlocked` succeeds for the first CHANGE_CONFIG_OP and sets the pending config. For the second CHANGE_CONFIG_OP it fails with the error:
```
[ts-1] I0623 19:10:15.999151 189487 replica_state.cc:646] T 6afaa10647e644dd91fba0a32ced35b5 P 1da13ee035b044858177326059575579 [term 2 FOLLOWER]: Illegal state (yb/consensus/replica_state.cc:323): RaftConfig change currently pending. Only one is allowed at a time.
```
Due to the combination of changes 2 and 3 above, the callback for the failed second operation is invoked and clears the pending state, even though that state was set by the first operation. After that, Raft starts to apply the first CHANGE_CONFIG_OP, and without change 1 it would fail on the DCHECK.

In version 2.14 we had all 3 changes, applied in the order 3, 1, 2, so the DCHECK had already been removed and we didn't fail on the broken invariant (we should have a pending config by the time we apply a CHANGE_CONFIG_OP). Version 2.12 only has change 3, and when I started to backport change 2 to 2.12, the mentioned test started to fail because the DCHECK is still there.

*Solution*
1. To keep the sanity check that maintains the invariant, the DCHECK should be restored, but updated to `DCHECK(cmeta_->has_pending_config() || config_to_commit.unsafe_config_change());`, because a missing pending config is only allowed in the case of an unsafe config change.
2. When aborting a CONFIG_CHANGE_OP, it is not correct to clear the pending config if it was set by another Raft operation. We should only clear the pending config if it was set by the operation we are aborting.

Original commit: 84e980a / D17907

Test Plan: RemoteBootstrapITest.TestLeaderCrashesBeforeChangeRoleKeyValueTableType

Reviewers: amitanand

Reviewed By: amitanand

Subscribers: bogdan

Differential Revision: https://phabricator.dev.yugabyte.com/D17990
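
The two rules in the solution can be illustrated with a small, self-contained sketch. This is not the code from this commit: `OpId`, `PendingConfigTracker`, `SetPending`, `ClearPendingIfOwnedBy`, and `CanCommit` are hypothetical names chosen for illustration, and the config is modeled as a plain string rather than a `RaftConfigPB`. The sketch assumes the pending config remembers which operation set it, which is one possible way to tell "our" pending config from one set by another operation.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <string>

// Hypothetical stand-in for a Raft operation id (term, index).
struct OpId {
  int64_t term;
  int64_t index;
  bool operator==(const OpId& other) const {
    return term == other.term && index == other.index;
  }
};

// Hypothetical, heavily simplified model of the pending-config part of the
// replica state. It is not the real ReplicaState class.
class PendingConfigTracker {
 public:
  // Mirrors the "only one config change at a time" rule: returns false if
  // another config change is already pending.
  bool SetPending(const OpId& op_id, const std::string& config) {
    if (pending_.has_value()) {
      return false;
    }
    pending_ = Pending{op_id, config};
    return true;
  }

  // Rule 2: on abort, clear the pending config only if it was set by the
  // operation being aborted. Returns true if something was cleared.
  bool ClearPendingIfOwnedBy(const OpId& op_id) {
    if (!pending_.has_value() || !(pending_->owner == op_id)) {
      return false;
    }
    pending_.reset();
    return true;
  }

  // Rule 1: the invariant behind the restored check. Committing a config
  // requires a pending config unless this is an unsafe config change.
  bool CanCommit(bool unsafe_config_change) const {
    return pending_.has_value() || unsafe_config_change;
  }

  bool has_pending() const { return pending_.has_value(); }

 private:
  struct Pending {
    OpId owner;          // Operation that set the pending config.
    std::string config;  // Placeholder for the actual config payload.
  };
  std::optional<Pending> pending_;
};
```

Replaying the failing scenario against this model: the first operation sets the pending config, the second is rejected, and aborting the second leaves the pending config in place, so the invariant still holds when the first operation is applied.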
Showing 8 changed files with 78 additions and 35 deletions.