This is the publicly available artifact repository supporting the ASPLOS'20 paper "Hermes: A Fast, Fault-Tolerant and Linearizable Replication Protocol". The repository contains both the code to experimentally evaluate Hermes(KV) and the complete Hermes TLA+ specifications, which can be used to verify Hermes' correctness via model checking.
@inbook{Katsarakis:20,
author = {Katsarakis, Antonios and Gavrielatos, Vasilis and Katebzadeh, M.R. Siavash and Joshi, Arpit and Dragojevic, Aleksandar and Grot, Boris and Nagarajan, Vijay},
title = {Hermes: A Fast, Fault-Tolerant and Linearizable Replication Protocol},
year = {2020},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
booktitle = {Proceedings of the Twenty-Fifth International Conference on Architectural Support for Programming Languages and Operating Systems},
pages = {201–217},
numpages = {17}
}
- Reads: i) Local ii) Load-balanced (served by any replica)
- Updates (Writes and RMWs): i) Inter-key concurrent ii) Decentralized iii) Fast (commit in 1 RTT from any replica)
- Writes: iv) Non-conflicting (i.e., never abort)
Linearizable reads, writes and RMWs with the following properties:
- Writes: a write from a live replica always commits after invalidating (and getting acknowledgments from) the rest of the live replicas.
- RMWs: at most one of any concurrent RMWs to a key can commit, and only once acknowledgments from all live replicas have been gathered.
- Reads: return the local value if the targeted keys are found in the Valid state and the coordinator was considered live at the time of reading. The latter can be ensured locally if the coordinator holds a lease for (and is part of) the membership.
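The write and read rules above can be sketched as a small single-process simulation. This is only an illustration under stated assumptions: the names `Replica`, `write`, `on_invalidate` and `on_validate` are hypothetical, the explicit validation step after the acknowledgments is an assumption of this sketch, and timestamps, RMWs, and failures are left out. The actual HermesKV implementation differs in structure and scale.

```python
from dataclasses import dataclass, field

VALID, INVALID = "Valid", "Invalid"


@dataclass
class Replica:
    node_id: int
    has_lease: bool = True                      # assumed: lease on the membership
    store: dict = field(default_factory=dict)   # key -> (state, value)

    def on_invalidate(self, key, value):
        # The invalidation carries the new value (early value propagation).
        self.store[key] = (INVALID, value)
        return True                             # acknowledgment

    def on_validate(self, key):
        _, value = self.store[key]
        self.store[key] = (VALID, value)

    def read(self, key):
        state, value = self.store.get(key, (INVALID, None))
        # Local read: only if the key is Valid and this replica holds a lease.
        if state == VALID and self.has_lease:
            return value
        return None                             # caller has to wait and retry


def write(coordinator, live_replicas, key, value):
    """Commit only after every other live replica acknowledged the invalidation."""
    coordinator.store[key] = (INVALID, value)
    followers = [r for r in live_replicas if r is not coordinator]
    acks = [r.on_invalidate(key, value) for r in followers]
    if not all(acks):
        return False
    coordinator.store[key] = (VALID, value)
    for r in followers:
        r.on_validate(key)                      # assumed validation step
    return True
```

With three replicas `a`, `b`, `c`, calling `write(a, [a, b, c], "k", 1)` leaves the key Valid everywhere, so any of them can then serve `read("k")` locally.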
By coupling invalidations with per-key logical timestamps (i.e., Lamport clocks) and propagating the value to be updated with the invalidation message (early value propagation), Hermes allows any replica blocked by an update (write or RMW) to safely replay the update and unblock itself and the rest of the followers.
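A minimal sketch of why this combination makes replays possible, assuming a timestamp is a `(version, node id)` tuple compared lexicographically and that a replay re-sends the stored value under its original timestamp. `KeyEntry`, `apply_invalidation` and `replay_message` are hypothetical names for illustration, not part of the HermesKV code.

```python
from dataclasses import dataclass


@dataclass
class KeyEntry:
    state: str = "Valid"
    value: object = None
    ts: tuple = (0, 0)   # assumed layout: (version, coordinator node id)


def apply_invalidation(entry, value, ts):
    """Adopt the update only if its timestamp is higher; acknowledge either way."""
    if ts > entry.ts:    # tuples compare lexicographically
        entry.state, entry.value, entry.ts = "Invalid", value, ts
    return ("ACK", ts)


def replay_message(entry):
    """Invalidation a blocked replica could re-send to finish a stalled update.

    The entry already holds the value and timestamp of the pending update,
    so nothing has to be fetched from the original coordinator.
    """
    if entry.state != "Invalid":
        return None      # nothing is blocking reads on this key
    return ("INV", entry.value, entry.ts)
```

Because the replayed message reuses the stored timestamp, replicas that already applied the update see nothing newer and simply acknowledge, which is what makes the replay safe to issue from any blocked replica in this sketch.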
A homogeneous cluster of x86_64 nodes interconnected via RDMA network cards and switches (tested on "Mellanox ConnectX-4" Infiniband infrastructure).
Linux OS (tested on Ubuntu 18.04 4.15.0-55-generic) with root access.
The software is tested using the following version of Mellanox OFED RDMA drivers: MLNX_OFED_LINUX-4.4-2.0.7.0.
Third-party libraries that you will require to run the experiments include:
- parallel (Cluster management scripts only)
- libmemcached-dev (used to exchange QP information for the setup of RDMA connections)
- libnuma-dev (for mbind)
On every node:
- Install Mellanox OFED ibverbs drivers
./hermes/bin/setup.sh
On the manager (just pick one node in the cluster):
- Fill variables in /hermes/exec/hosts.sh
- Configure setup and default parameters in /hermes/include/hermes/config.h
- From /hermes/exec/ compile hermesKV through make
- scp hermesKV and the configured hosts.sh into the /hermes/exec/ directory of all other nodes in the cluster.
cd hermes/exec; make
Warning: Do not compile through cmake; instead use the Makefile in the exec/ directory.
Run first on the manager:
./run-hermesKV.sh <experiment_parameters>
Then run on all other member nodes:
./run-hermesKV.sh <experiment_parameters>
Note that some members will eagerly terminate if the experiment uses a smaller number of nodes than specified in hosts.sh.
An example experiment with three nodes, 12 worker threads, and a 35% write ratio would be as follows:
./run-hermesKV.sh -W 12 -w 350 -M 3
Supported command-line arguments for the experiments are detailed in the run-hermesKV.sh script.
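Judging only from the example above, where a 35% write ratio is passed as `-w 350`, the `-w` value appears to be expressed in tenths of a percent. Under that assumption, a small hypothetical helper (not part of the repository) can build the argument list; run-hermesKV.sh itself remains the authoritative reference for flag semantics.

```python
def hermes_args(worker_threads, write_percent, machines):
    """Build the run-hermesKV.sh argument list from human-friendly values.

    Assumption: -w takes the write ratio in tenths of a percent (35% -> 350),
    inferred from the single example in this README.
    """
    write_per_mille = round(write_percent * 10)
    return [
        "./run-hermesKV.sh",
        "-W", str(worker_threads),
        "-w", str(write_per_mille),
        "-M", str(machines),
    ]


if __name__ == "__main__":
    # Reproduces the example command from this README.
    print(" ".join(hermes_args(12, 35, 3)))
```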
Hermes is based on the HERD/MICA design as an underlying KVS, the code of which we have adapted to implement HermesKV.
- Odyssey - Hermes is also implemented in the Odyssey framework by Vasilis Gavrielatos
- Olympus - in Rust by Thomas Bracher
Antonios Katsarakis: antonis.io