forked from cockroachdb/cockroach
storage: introduce dedicated raft storage
Implements cockroachdb#16361.

This is a breaking change. To see why, consider that prior to this we stored all consensus data, in addition to all system metadata and user-level keys, in the same single RocksDB instance. Here we introduce a separate, dedicated instance for raft data (log entries and HardState). Cockroach nodes that simply restart with these changes, unless migrated properly, will fail to find the most recent raft log entries and HardState data in the new RocksDB instance.

Also consider a cluster running mixed versions (nodes with dedicated raft storage and nodes without): what would the communication between nodes look like in light of proposer-evaluated KV? Currently we propagate a storagebase.WriteBatch through raft containing a serialized representation of a RocksDB write batch; this models the changes to be made to the single underlying RocksDB instance. For log truncation requests, where we delete log entries, and/or admin splits, where we write the initial HardState for newly formed replicas, we need to similarly propagate a write batch (through raft) addressing the new RocksDB instance (if the recipient node is one with these changes) or the original RocksDB instance (if the recipient node is one without these changes). What if an older-version node is the raft leader and is therefore the one upstream of raft, propagating storagebase.WriteBatches with raft data changes but addressed to the original RocksDB instance? What would rollbacks look like?

To this end we introduce three modes of operation: the old single-instance behavior, transitioningRaftStorage, and enabledRaftStorage (this is implicit if we're not in transitioning mode). We've made it so that it is safe to transition from an older cockroach version to transitioningRaftStorage, from transitioningRaftStorage to enabled, and the reverse for rollbacks. A transition from one mode to the next takes place when all the nodes in the cluster are on the same previous mode.
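The transition rule described above (only step to an adjacent mode, and only once every node is on the same previous mode) can be sketched as follows. This is a minimal illustration, not code from this commit: the `clusterMode` type, its constants, and `canTransition` are hypothetical names, and the sketch assumes the set of per-node modes is already known to the caller.

```go
package main

// clusterMode is a hypothetical ordering of the three modes of operation:
// old single-instance behavior, transitioning, and enabled.
type clusterMode int

const (
	modeOld clusterMode = iota
	modeTransitioning
	modeEnabled
)

// canTransition reports whether a cluster whose nodes are currently in the
// given modes may move to the target mode. It encodes two assumptions taken
// from the description above:
//  1. all nodes must be on the same previous mode, and
//  2. only adjacent steps are allowed (forward for upgrades, backward for
//     rollbacks); skipping the transitioning mode is rejected.
func canTransition(nodes []clusterMode, target clusterMode) bool {
	if len(nodes) == 0 {
		return false
	}
	from := nodes[0]
	for _, n := range nodes {
		if n != from {
			return false // mixed modes: wait until every node catches up
		}
	}
	step := int(target) - int(from)
	return step == 1 || step == -1
}
```

Under these assumptions, a three-node cluster entirely in the old mode may move to transitioning, but not straight to enabled, and a cluster with nodes in different modes may not move at all.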
The operation mode is set by an env var, COCKROACH_DEDICATED_RAFT_STORAGE={DISABLED,TRANSITIONING,ENABLED}:
- In the old version we use a single RocksDB instance for both raft and user-level KV data.
- In transitioningRaftStorage mode we use both RocksDB instances for raft data interoperably, the raft-specific and the regular instance. We use this mode to facilitate rolling upgrades.
- In enabled mode we use the dedicated RocksDB instance for raft data. Raft log entries and the HardState are stored on this instance alone.

Most of this commit is careful plumbing of an extra engine.{Engine,Batch,Reader,Writer,ReadWriter} for whenever we need to interact with the new RocksDB instance.
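To make the three modes concrete, here is a small sketch of how the value of COCKROACH_DEDICATED_RAFT_STORAGE might map to the RocksDB instances that hold raft data. It is an illustration only: `raftStorageMode`, `parseRaftStorageMode`, and `raftEnginesInUse` are hypothetical names rather than identifiers from this commit, and because the commit message does not say what happens when the variable is unset or unrecognized, the sketch simply returns an error in that case.

```go
package main

import (
	"fmt"
	"strings"
)

// raftStorageMode is a hypothetical enum for the three documented values.
type raftStorageMode int

const (
	raftStorageDisabled raftStorageMode = iota
	raftStorageTransitioning
	raftStorageEnabled
)

// parseRaftStorageMode converts the env var's string value into a mode.
// Case-insensitive matching is an assumption of this sketch.
func parseRaftStorageMode(v string) (raftStorageMode, error) {
	switch strings.ToUpper(strings.TrimSpace(v)) {
	case "DISABLED":
		return raftStorageDisabled, nil
	case "TRANSITIONING":
		return raftStorageTransitioning, nil
	case "ENABLED":
		return raftStorageEnabled, nil
	}
	return raftStorageDisabled, fmt.Errorf("unknown raft storage mode %q", v)
}

// raftEnginesInUse reports which RocksDB instances are used for raft data
// (log entries and HardState) in the given mode, following the list above.
func raftEnginesInUse(m raftStorageMode) (regular, dedicated bool) {
	switch m {
	case raftStorageDisabled:
		return true, false // single instance for raft and user-level KV data
	case raftStorageTransitioning:
		return true, true // both instances, to support rolling upgrades
	default:
		return false, true // dedicated raft instance alone
	}
}
```

A caller would typically read the variable with `os.Getenv("COCKROACH_DEDICATED_RAFT_STORAGE")`, pass the result to `parseRaftStorageMode`, and use `raftEnginesInUse` to decide which engine(s) to consult for raft data.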
1 parent 2d9450b, commit fb4ce33
Showing 30 changed files with 948 additions and 265 deletions.