This repository contains formal models of core Elasticsearch algorithms and is directly related to ongoing implementation efforts around data replication and cluster consensus. We consider this work-in-progress: Models as well as implementations are still evolving and might differ in substantial ways. The formal models mainly serve to illustrate some of the high-level concepts and help to validate resiliency-related aspects.
The data replication model consists of two files:
The cluster consensus TLA+ model consists of two files:
The cluster consensus Isabelle model consists of the following theories:
- Basic definitions
- An implementation in functional style
- An implementation in monadic style, along with a proof it's equivalent to the previous
- The proof that each slot is consistent, based on Lamport's Synod algorithm
- The proof that the implementation ensures consistency
A TLA+ model of how the engine handles replication requests.
An alternative idea for improving the consistency of cluster state updates, effectively by adding the notion of a term to the existing Zen algorithm.
- Install the TLA Toolbox
- If on Mac OS, move the downloaded app to the Applications folder first
- Read some documentation
How to run the model checker in headless mode:
- Download tla2tools.jar
- Run the model checker once in TLA+ Toolbox on desktop (can be aborted once started). This generates the folder
elasticsearch.toolbox/model/
that contains all model files that are required to run the model checker in headless mode. - Copy the above folder and
tla2tools.jar
to the server running in headless mode. cd
to the folder and runjava -Xmx30G -cp ../tla2tools.jar tlc2.TLC MC -deadlock -workers 12
. The setting-Xmx30G
denotes the amount of memory to allocate to the model checker and-workers 12
the number of worker threads (should be equal to the number of cores on machine). The setting-deadlock
ensures that TLC explores the full reachable state space, not searching for deadlocks.