-
Notifications
You must be signed in to change notification settings - Fork 4
Fault tolerance
In this section we explore fault tolerance and recovery scenarios when a subset of an Amza cluster loses interconnectivity, often referred to as the split-brain problem. Given Amza’s consistency guarantees based on leadership and quorum, we depict how network partitioning cannot lead to inconsistent writes or data loss.
Amza nodes determine liveliness (interconnectivity and availability) by virtue of publishing a heartbeat and awaiting acknowledgment by a quorum of neighbors. Because this approach requires round-trip interaction between a node and the rest of the ring, both symmetric (left-to-right and right-to-left) and asymmetric (left-to-right but not right-to-left) network partitioning share a common detection and recovery mechanism.
- Leader
- If leader is in the larger half
- Writes continue as usual to the leader.
- If leader is in the smaller half
- Writes continue for configured liveliness interval (default 60 seconds). After which no more writes will be accepted until connectivity is restored.
- If leader is in the larger half
- Leader_quorum
- If leader is in the larger half
- Writes continue as usual to the leader. There may be an initial delay (configurable) due to epidemic escalation (see above)
- If leader is in the smaller half
- Writes to the leader are immediately turned away because it cannot establish quorum. When the larger half detects a leader timeout, a new leader is nominated and elected by quorum. Writes via the new leader may suffer an initial delay due to epidemic escalation.
- If leader is in the larger half
- Quorum (AP)
- If a member is in the larger half
- Writes continue as usual. There may be an initial delay (configurable) due to epidemic escalation (see above)
- If a member is in the smaller half
- Writes are immediately turned away because it cannot establish quorum.
- If a member is in the larger half
- Write_all_read_one
- Writes will fail until connectivity is restored. However, reads will continue as usual.
- Write_one_read_all
- Writes continue as usual. However, reads will fail until connectivity is restored.
- None
- Reads and writes continue as usual.
- Leader - CP
- Reads and write are to a single node and when the node is lost you cannot read or write until the node is restored or another node is manually elected to be the leader.
- Leader_quorum - CP degrading to AP
- Reads and write are to a single node and replicated to a quorum of other nodes. When the leader node is lost you cannot write until liveliness detects the leader is dead and then a new leader is elected and writes resume. During this time if there is a quorum of remaining nodes then reads continue as usual.
- Quorum - AP
- As long as there is a quorum of nodes then reads and write continue usual. Once there is no longer a quorum, all reads and writes will fail.
- Write_all_read_one - AP
- If any node is lost writes fail. As long as there is at least one node reads will continue as usual.
- Write_one_read_all - CP
- As long as there is at least one node write will continue as usual. If any node is lost reads fail.
- None - AP
- As long as there is at least one node read and write will continue as usual.
- Leader, write_one_read_all, none
- Data is written to only one node before returning a success code to the client. Data will eventually be replicated to N other nodes as determined by ring size.
- Leader_quorum, quorum
- Data is written to a quorum of nodes before return ok to client. Data will eventually be replicated to the remaining as determined by ring size.
- Write_all_read_one
- Data is written to all nodes as determined by ring size before return ok to client.
Amza currently supports write isolation in the form of batched transactions.
Amza does not currently support read isolation. (See roadmap below.)
######Used by:
Upena,
Miru,
######Depends on:
Filer,
Jive-Utils,
MLogger