kvserver: add ReplicaID to the persistent raft state #75740
Labels
A-kv-replication
Relating to Raft, consensus, and coordination.
C-enhancement
Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception)
Comments
sumeerbhola added the C-enhancement and A-kv-replication labels on Jan 31, 2022
sumeerbhola added a commit to sumeerbhola/cockroach that referenced this issue on Feb 1, 2022:
The RaftReplicaIDKey is an unreplicated range-id local key that contains the ReplicaID of the replica whose HardState is represented in the RaftHardStateKey. These two keys share the same lifetime, and are removed atomically when we clear the range-id local keys for a replica. We currently do not utilize this information on node restart to figure out whether we should cleanup stale uninitialized replicas. Doing such cleanup can wait until we implement and start using ReplicasStorage. The change here is meant to set us up to rely on RaftReplicaID from the next release onwards. Informs cockroachdb#75740 Release note: None
sumeerbhola added a commit to sumeerbhola/cockroach that referenced this issue on Feb 3, 2022, with the same commit message as the Feb 1 commit above.
sumeerbhola added a commit to sumeerbhola/cockroach that referenced this issue on Feb 4, 2022:
The RaftReplicaIDKey is an unreplicated range-id local key that contains the ReplicaID of the replica whose HardState is represented in the RaftHardStateKey. These two keys are removed atomically when we clear the range-id local keys for a replica. See store_create_replica.go for a detailed comment on correctness and version compatibility. We currently do not utilize this information on node restart to figure out whether we should cleanup stale uninitialized replicas. Doing such cleanup can wait until we implement and start using ReplicasStorage. The change here is meant to set us up to rely on RaftReplicaID from the next release onwards. Informs cockroachdb#75740 Release note: None
sumeerbhola added a second commit to sumeerbhola/cockroach that referenced this issue on Feb 4, 2022, with the same commit message as the previous Feb 4 commit.
craig bot pushed a commit that referenced this issue on Feb 4, 2022:
75761: keys,kvserver: introduce RaftReplicaID r=tbg a=sumeerbhola
The RaftReplicaIDKey is an unreplicated range-id local key that contains the ReplicaID of the replica whose HardState is represented in the RaftHardStateKey. These two keys are removed atomically when we clear the range-id local keys for a replica. See store_create_replica.go for a detailed comment on correctness and version compatibility. We currently do not utilize this information on node restart to figure out whether we should cleanup stale uninitialized replicas. Doing such cleanup can wait until we implement and start using ReplicasStorage. The change here is meant to set us up to rely on RaftReplicaID from the next release onwards.
Informs #75740
Release note: None
Co-authored-by: sumeerbhola
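The pairing described in the commit message (a ReplicaID key that lives and dies together with the HardState key) can be sketched roughly as follows. This is a toy in-memory model, not the code in store_create_replica.go: the engine and batch types, the key strings, and the function names createUninitialized and clearRangeIDLocal are all hypothetical and chosen only for illustration.

```go
package main

import "fmt"

// All names below are hypothetical stand-ins for illustration; they are not
// CockroachDB's storage or keys packages.

type rangeID int64
type replicaID int32

// engine is an in-memory stand-in for the store's key-value engine.
type engine map[string]string

// batch collects puts and deletes that are applied together.
type batch struct {
	puts map[string]string
	dels []string
}

func newBatch() *batch { return &batch{puts: map[string]string{}} }

// commit applies every operation in the batch in one step. In this toy
// model, that single step is what stands in for an atomic engine write.
func (e engine) commit(b *batch) {
	for _, k := range b.dels {
		delete(e, k)
	}
	for k, v := range b.puts {
		e[k] = v
	}
}

// The key formats are made up; they only mimic "unreplicated range-id local".
func hardStateKey(id rangeID) string {
	return fmt.Sprintf("/local/rangeid/%d/u/RaftHardState", id)
}

func replicaIDKey(id rangeID) string {
	return fmt.Sprintf("/local/rangeid/%d/u/RaftReplicaID", id)
}

// createUninitialized persists the HardState and the ReplicaID in the same
// batch, so a crash can never leave a HardState without its ReplicaID.
func createUninitialized(e engine, rid rangeID, repID replicaID, hardState string) {
	b := newBatch()
	b.puts[hardStateKey(rid)] = hardState
	b.puts[replicaIDKey(rid)] = fmt.Sprint(repID)
	e.commit(b)
}

// clearRangeIDLocal removes both keys in the same batch, so they share
// the same lifetime.
func clearRangeIDLocal(e engine, rid rangeID) {
	b := newBatch()
	b.dels = append(b.dels, hardStateKey(rid), replicaIDKey(rid))
	e.commit(b)
}
```

The point of the sketch is only the invariant: every code path that writes or clears one key also writes or clears the other in the same batch.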
What's left here @sumeerbhola?
The writing of ReplicaID is done.
We currently do not store the ReplicaID when instantiating an uninitialized replica. We do know the ReplicaID when creating the corresponding uninitialized Replica object, since we pass it to newUnloadedReplica. Without a persisted ReplicaID, if the node crashes before the replica is initialized, the restarting node will have a HardState without any knowledge of which ReplicaID it corresponds to. This creates difficulties with the ReplicasStorage design and can create a leak (see details below).
(copy-paste of conversation)
[sumeer] while implementing ReplicasStorage.Init I realized something I had missed earlier: for an UninitializedStateMachine replica, there is nothing in the engine that contains its ReplicaID.
[tbg] Yeah, there’s always been something odd about the replicaID not getting stored. There’s definitely a leak, if you have an uninitialized replica and the node restarts, nothing will ever instantiate this replica in-memory again (unless it receives another message, which is doubtful at that point) but also if we did, we wouldn’t know the replicaID. So even if we tried, we couldn’t destroy this state, as we can’t ever be sure that it’s not still needed. This is definitely leaking today.
Seems kind of clear that we want the replicaID to be stored. Seems like it would have to be on the raft engine, parallel to the HardState (i.e. store ReplicaID whenever it changes, which should imply doing it on the first write to the HardState and never again since replicas can’t exist across replicaIDs).
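To make the leak concrete, here is a minimal sketch of the decision a restarted node faces for leftover uninitialized state. It is not the ReplicasStorage design: the uninitializedState type and isProvablyStale function are hypothetical, and the sketch assumes that valid replica IDs start at 1 and only grow for a given range, so a message addressed to a higher ReplicaID means the older replica is no longer needed.

```go
package main

// Hypothetical names for illustration only; not CockroachDB code.

type replicaID int32

// unknownReplicaID marks state that was written without a ReplicaID.
// Assumption: valid replica IDs start at 1, so 0 can mean "not persisted".
const unknownReplicaID replicaID = 0

// uninitializedState is what a restarting node finds on disk for a range
// whose replica never got initialized.
type uninitializedState struct {
	hasHardState bool
	replicaID    replicaID
}

// isProvablyStale reports whether the leftover state can be destroyed after
// the store sees a message addressed to ReplicaID incoming for this range.
// Without a persisted ReplicaID the answer is always false: the state might
// still be needed, so it can never be cleaned up and leaks.
func isProvablyStale(s uninitializedState, incoming replicaID) bool {
	if !s.hasHardState || s.replicaID == unknownReplicaID {
		return false
	}
	// Assumption: a replica cannot exist across replica IDs, so a higher
	// incoming ID implies the persisted replica was removed.
	return incoming > s.replicaID
}
```

With only a HardState on disk, isProvablyStale can never return true, which matches the leak described above. Once the ReplicaID is stored alongside the HardState, the comparison becomes possible.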
cc: @tbg, @erikgrinaker
Epic: CRDB-220
Jira issue: CRDB-12811