MyDataStore is a leaderless distributed database designed for research and learning purposes, drawing inspiration from "Designing Data-Intensive Applications" by Martin Kleppmann. Key advantages include:
- High Availability and Fault Tolerance: The absence of a central leader node on the data path eliminates a single point of failure, and because data is distributed across multiple nodes, the system remains operational even if some nodes fail.
- Leaderless Consensus: An implementation of the Raft consensus algorithm is used to agree upon cluster organization, while reads and writes remain leaderless.
- Variable Consistency: Read and write quorums are configurable, allowing a trade-off between strong consistency, faster reads, and faster writes.
MyDataStore implements several core concepts from the book "Designing Data-Intensive Applications". These include:
Node Discovery:
- Node discovery is implemented using a gossip protocol, which enables efficient and scalable membership validation and communication.
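A minimal sketch of gossip-based membership is shown below. It uses the hashicorp/memberlist library as a stand-in; whether MyDataStore uses this library is an assumption, and the node name and seed address are placeholders.

```go
// Sketch only: gossip membership via hashicorp/memberlist (assumed library;
// MyDataStore's actual gossip implementation may differ).
package main

import (
	"fmt"
	"log"

	"github.com/hashicorp/memberlist"
)

func main() {
	// Each node runs a gossip agent with a unique name.
	cfg := memberlist.DefaultLANConfig()
	cfg.Name = "node-1" // hypothetical node name

	list, err := memberlist.Create(cfg)
	if err != nil {
		log.Fatalf("create gossip agent: %v", err)
	}

	// Join through any known seed node; membership then spreads by gossip,
	// so no central registry is needed.
	if _, err := list.Join([]string{"10.0.0.1:7946"}); err != nil { // hypothetical seed address
		log.Fatalf("join cluster: %v", err)
	}

	// Every node converges on the same member list.
	for _, m := range list.Members() {
		fmt.Printf("member: %s (%s)\n", m.Name, m.Addr)
	}
}
```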
Consistent Hashing:
- A hashring (consistent hashing) is used to evenly distribute data into partitions and to spread partitions across the cluster.
- The cluster leader determines the member list, and the other members replicate the hashring, ensuring a consistent view of partition placement across the cluster.
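As an illustration of the hashring, the sketch below places each node at several virtual points on a ring and routes a key to the first point clockwise from the key's hash. The hash function, virtual-node count, and lack of replica placement are simplifications; MyDataStore's actual hashring may differ.

```go
// Sketch only: a tiny consistent-hashing ring with virtual nodes.
package main

import (
	"fmt"
	"hash/fnv"
	"sort"
)

type HashRing struct {
	points []uint32          // sorted virtual-node positions on the ring
	owner  map[uint32]string // ring position -> node name
}

func hashOf(s string) uint32 {
	h := fnv.New32a()
	h.Write([]byte(s))
	return h.Sum32()
}

// NewHashRing places each node at several virtual points so keys spread evenly.
func NewHashRing(nodes []string, vnodes int) *HashRing {
	r := &HashRing{owner: map[uint32]string{}}
	for _, n := range nodes {
		for i := 0; i < vnodes; i++ {
			p := hashOf(fmt.Sprintf("%s#%d", n, i))
			r.points = append(r.points, p)
			r.owner[p] = n
		}
	}
	sort.Slice(r.points, func(i, j int) bool { return r.points[i] < r.points[j] })
	return r
}

// Owner returns the node responsible for a key: the first virtual point at or
// after the key's hash, wrapping around to the start of the ring.
func (r *HashRing) Owner(key string) string {
	h := hashOf(key)
	i := sort.Search(len(r.points), func(i int) bool { return r.points[i] >= h })
	if i == len(r.points) {
		i = 0
	}
	return r.owner[r.points[i]]
}

func main() {
	ring := NewHashRing([]string{"node-1", "node-2", "node-3"}, 64)
	fmt.Println("key 'user:42' ->", ring.Owner("user:42"))
}
```

Virtual nodes keep the distribution even when the member list is small or changes; removing a node only remaps the keys that hashed to its points.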
Quorums and Read Repairs:
- The system uses a read quorum and a write quorum for operations, where N is the number of nodes each partition is replicated to, W is the minimum number of nodes that must acknowledge a write, and R is the minimum number of nodes that must respond to a read.
- R + W > N is used to ensure consistency and prevent stale reads, since every read set then overlaps every write set.
- A lower W can be used for faster writes.
- A lower R can be used for faster reads.
- Read repairs write the latest value back to nodes that return stale data during a read operation (a minimal sketch follows this list).
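The sketch below shows a quorum read with read repair, assuming N=3, W=2, R=2 and last-write-wins timestamps. The Replica type and function names are hypothetical, not MyDataStore's API.

```go
// Sketch only: quorum read with read repair (hypothetical types and names).
package main

import "fmt"

type Versioned struct {
	Value     string
	Timestamp int64
}

// Replica stands in for one node's copy of a partition.
type Replica map[string]Versioned

const (
	N = 3 // replicas per partition
	W = 2 // write quorum
	R = 2 // read quorum; R + W > N, so every read overlaps the latest write
)

// quorumRead asks R replicas, keeps the newest version, and repairs stale copies.
func quorumRead(replicas []Replica, key string) Versioned {
	newest := Versioned{}
	for _, rep := range replicas[:R] {
		if v, ok := rep[key]; ok && v.Timestamp > newest.Timestamp {
			newest = v
		}
	}
	// Read repair: push the winning version back to any read replica that was stale.
	for _, rep := range replicas[:R] {
		if v := rep[key]; v.Timestamp < newest.Timestamp {
			rep[key] = newest
		}
	}
	return newest
}

func main() {
	replicas := []Replica{{}, {}, {}}
	// A write that reached only W=2 of the N=3 replicas.
	replicas[0]["k"] = Versioned{Value: "v2", Timestamp: 20}
	replicas[1]["k"] = Versioned{Value: "v1", Timestamp: 10} // stale replica in the read set
	replicas[2]["k"] = Versioned{Value: "v2", Timestamp: 20}

	fmt.Println(quorumRead(replicas, "k").Value) // "v2"; the stale replica is repaired
}
```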
Timestamps for Conflict Resolution:
- Timestamp-based conflict resolution is used: when replicas hold conflicting versions of a value, timestamps decide the winner.
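The quorum sketch above already relies on this rule. Stated on its own, a last-write-wins resolver is just a timestamp comparison; the Write type here is illustrative, and how MyDataStore breaks exact timestamp ties is not specified here.

```go
// Sketch only: last-write-wins conflict resolution by timestamp.
package main

import "fmt"

type Write struct {
	Value     string
	Timestamp int64 // e.g. assigned by the coordinating node at write time
}

// resolve keeps whichever write carries the newer timestamp.
func resolve(a, b Write) Write {
	if b.Timestamp > a.Timestamp {
		return b
	}
	return a
}

func main() {
	fmt.Println(resolve(Write{"old", 100}, Write{"new", 200}).Value) // "new"
}
```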
Using Raft for Consensus:
- The Raft algorithm is used to elect a leader for the cluster through its leader-election process.
- The elected leader maintains a replicated log that is committed through the consensus algorithm, ensuring the integrity and consistency of the cluster metadata.
- In this project, the log keeps track of the members currently in the cluster and the current epoch of the cluster, providing a reliable source of truth for the system state (a sketch of such a state machine follows).
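The sketch below shows the kind of state machine such a log could drive: committed entries add or remove members and bump the epoch, so every node that replays the same log arrives at the same cluster view. The entry and state types are illustrative, not MyDataStore's actual definitions.

```go
// Sketch only: a cluster-state machine fed by committed Raft log entries
// (hypothetical types; the real project may structure this differently).
package main

import "fmt"

type EntryKind int

const (
	AddMember EntryKind = iota
	RemoveMember
	BumpEpoch
)

// LogEntry is one committed entry from the Raft log.
type LogEntry struct {
	Kind EntryKind
	Node string // member affected, if any
}

// ClusterState is the membership and epoch view replicated through Raft.
type ClusterState struct {
	Members map[string]bool
	Epoch   uint64
}

// Apply folds a committed entry into the state deterministically.
func (s *ClusterState) Apply(e LogEntry) {
	switch e.Kind {
	case AddMember:
		s.Members[e.Node] = true
	case RemoveMember:
		delete(s.Members, e.Node)
	case BumpEpoch:
		s.Epoch++
	}
}

func main() {
	state := &ClusterState{Members: map[string]bool{}}
	for _, e := range []LogEntry{
		{Kind: AddMember, Node: "node-1"},
		{Kind: AddMember, Node: "node-2"},
		{Kind: BumpEpoch},
	} {
		state.Apply(e)
	}
	fmt.Printf("epoch=%d members=%d\n", state.Epoch, len(state.Members))
}
```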
Periodic Partition Verification and Replication:
- Periodic data replication and verification ensures data consistency and integrity across the distributed system.
- The Raft leader periodically increments a variable called Epoch.
- The current epoch is stored alongside each value when it is written.
- Upon an epoch increment, each node creates a Merkle tree over (Partition, Epoch, Data) and then compares its tree hash with the hashes of the other members.
- Additional details:
  - The Merkle tree organizes data into buckets by hashing each key.
  - A validating node reconstructs the remote tree using only the bucket hashes, without fetching the underlying data.
  - This is done so that all of the data does not need to be checked, which would be redundant and take a long time.
  - BFS is used over the local and remote trees to find the buckets that are not in sync.
  - A node can then request only the out-of-sync bucket ranges instead of the whole partition, which speeds up partition replication when replicas drift. For a given partition and epoch, only about 1/64 of the values need to be synced per out-of-sync bucket. A sketch of this comparison follows the list.
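Here is a self-contained sketch of that verification step: keys are hashed into 64 buckets, a Merkle tree is built over the bucket hashes, and a BFS over the local and remote trees returns only the buckets that differ. The bucket count, hash functions, and encoding are assumptions for illustration.

```go
// Sketch only: bucketed Merkle tree and BFS diff for partition verification.
package main

import (
	"crypto/sha256"
	"fmt"
	"hash/fnv"
	"sort"
)

const numBuckets = 64 // assumed bucket count, matching the 1/64 figure above

// bucketHashes assigns every key to one of the buckets and digests each bucket.
func bucketHashes(data map[string]string) [numBuckets][32]byte {
	buckets := [numBuckets][]string{}
	for k, v := range data {
		h := fnv.New32a()
		h.Write([]byte(k))
		b := h.Sum32() % numBuckets
		buckets[b] = append(buckets[b], k+"="+v)
	}
	var out [numBuckets][32]byte
	for i, entries := range buckets {
		sort.Strings(entries) // order-independent bucket digest
		h := sha256.New()
		for _, e := range entries {
			h.Write([]byte(e))
		}
		copy(out[i][:], h.Sum(nil))
	}
	return out
}

// merkle builds the tree bottom-up; level 0 is the leaves, the last level is the root.
func merkle(leaves [numBuckets][32]byte) [][][32]byte {
	levels := [][][32]byte{leaves[:]}
	for len(levels[len(levels)-1]) > 1 {
		prev := levels[len(levels)-1]
		next := make([][32]byte, len(prev)/2)
		for i := range next {
			next[i] = sha256.Sum256(append(prev[2*i][:], prev[2*i+1][:]...))
		}
		levels = append(levels, next)
	}
	return levels
}

// diffBuckets walks both trees from the root (BFS) and returns the leaf
// buckets whose hashes disagree; only those key ranges need to be transferred.
func diffBuckets(a, b [][][32]byte) []int {
	root := len(a) - 1
	queue := [][2]int{{root, 0}} // (level, index)
	var out []int
	for len(queue) > 0 {
		n := queue[0]
		queue = queue[1:]
		level, idx := n[0], n[1]
		if a[level][idx] == b[level][idx] {
			continue // identical subtree: nothing to sync below it
		}
		if level == 0 {
			out = append(out, idx)
			continue
		}
		queue = append(queue, [2]int{level - 1, 2 * idx}, [2]int{level - 1, 2*idx + 1})
	}
	return out
}

func main() {
	local := map[string]string{"a": "1", "b": "2", "c": "3"}
	remote := map[string]string{"a": "1", "b": "2", "c": "stale"}
	ta := merkle(bucketHashes(local))
	tb := merkle(bucketHashes(remote))
	fmt.Println("out-of-sync buckets:", diffBuckets(ta, tb))
}
```

Only hashes need to cross the network during comparison; the values themselves move only for the buckets that come back out of sync.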
Data Storage Engine and Indexing:
- The system works with Badger and LevelDB as its data storage engines.
- Both Badger and LevelDB are Log-Structured Merge-tree (LSM tree) based data structures, which provide high write throughput.
- Indexes:
  - (epoch, partition, key)
  - (key)
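As an illustration of the two indexes, the sketch below writes one value under an (epoch, partition, key) key and under the bare key in a single Badger transaction. The key encoding and the in-memory instance are assumptions for the example, not MyDataStore's actual format.

```go
// Sketch only: one value stored under both index layouts in Badger
// (hypothetical key encoding).
package main

import (
	"fmt"
	"log"

	badger "github.com/dgraph-io/badger/v3"
)

// epochKey builds the (epoch, partition, key) index entry.
func epochKey(epoch uint64, partition int, key string) []byte {
	return []byte(fmt.Sprintf("e/%d/p/%d/%s", epoch, partition, key))
}

// primaryKey builds the bare-key index entry used for normal reads.
func primaryKey(key string) []byte {
	return []byte("k/" + key)
}

func main() {
	// In-memory instance keeps the example self-contained.
	db, err := badger.Open(badger.DefaultOptions("").WithInMemory(true))
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	// A single write lands under both indexes in one transaction.
	err = db.Update(func(txn *badger.Txn) error {
		if err := txn.Set(epochKey(7, 3, "user:42"), []byte("alice")); err != nil {
			return err
		}
		return txn.Set(primaryKey("user:42"), []byte("alice"))
	})
	if err != nil {
		log.Fatal(err)
	}

	// Normal reads go through the bare-key index; the epoch index serves the
	// per-epoch verification and replication described above.
	err = db.View(func(txn *badger.Txn) error {
		item, err := txn.Get(primaryKey("user:42"))
		if err != nil {
			return err
		}
		val, err := item.ValueCopy(nil)
		if err != nil {
			return err
		}
		fmt.Println("user:42 =", string(val))
		return nil
	})
	if err != nil {
		log.Fatal(err)
	}
}
```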
Kubernetes Operator:
- Assists with autoscaling the store.
Bazel:
- Bazel is a build tool required for building and testing MyDataStore. You can install it by following the instructions on the official Bazel website.
A dev script is provided to help with common commands. The script is written in bash and can be run from the command line. Here is a brief explanation of the commands supported by the script:
- `dc`: Starts a new tmuxinator session with the configuration specified in `tmuxinator-dc.yaml`.
- `dc!`: Kills the `dc_dev` tmux session.
- `k8-init`: Deletes and restarts minikube, enables the metrics-server addon, and runs the `k8-r` command.
- `k8-r`: Deletes the Kubernetes configuration specified in `./operator/config/samples/`, deploys the operator, and deploys Kubernetes.
- `k8-operator`: Deploys the operator.
- `k8`: Starts a new tmuxinator session with the configuration specified in `tmuxinator-k8.yaml`.
- `k8!`: Kills the `k8_dev` tmux session.
- `k8-e2e`: Runs end-to-end tests on Kubernetes.
- `dc-e2e`: Runs end-to-end tests using k6.
- `test`: Executes Bazel tests once. Accepts a test name as an argument for filtering.
- `rtest`: Uses `ibazel` to rerun tests on file change. Also accepts a test name as an argument for filtering.
- `deps`: Updates dependencies using `update_deps.sh`.
- `dc-pprof`: Starts a pprof profiling session.
- `scale`: Scales the number of replicas for the store. The number of replicas must be specified as an argument.
To run the script, use the following command:

```sh
./dev.sh <command> [arguments]
```

Replace `<command>` with one of the commands listed above, and `[arguments]` with any arguments required by the command.
Editor setup for Go with Bazel (rules_go): https://github.com/bazelbuild/rules_go/wiki/Editor-setup