This repository has been archived by the owner on Jan 22, 2025. It is now read-only.

Update turbine documentation #29906

Merged: 5 commits, Jan 27, 2023
4 changes: 2 additions & 2 deletions docs/art/data-plane-fanout.bob
@@ -3,13 +3,13 @@
| +-----------------+ Neighborhood 0 +-----------------+ |
| | +--------------------->+ | |
| | Validator 1 | | Validator 2 | |
| | +<---------------------+ | |
| | Root | | | |
| +--------+-+------+ +------+-+--------+ |
| | | | | |
| | +-----------------------------+ | | |
| | +------------------------+------+ | |
| | | | | |
+------------------------------------------------------------------+
+------------|------|------------------------|--------|------------+
| | | |
v v v v
+---------+------+---+ +-+--------+---------+
43 changes: 23 additions & 20 deletions docs/art/data-plane-neighborhood.bob
@@ -1,25 +1,28 @@
+---------------------------------------------------------------------------------------------------------+
| Neighborhood Above |
| |
| +----------------+ +----------------+ +----------------+ +----------------+ |
| | +------>+ +------>+ +------>+ | |
| +-----------------+-----------------------+-------------------------+ |
| | | | | |
| | v v v |
| +--------------+-+ +----------------+ +----------------+ +----------------+ |
| | | | | | | | | |
| | Neighbor 1 | | Neighbor 2 | | Neighbor 3 | | Neighbor 4 | |
| | +<------+ +<------+ +<------+ | |
| +--+-------------+ +--+-------------+ +-----+----------+ +--+-------------+ |
| | | | | |
+---------------------------------------------------------------------------------------------------------+
| | | |
| | | |
| | | |
| | | |
| | | |
+---------------------------------------------------------------------------------------------------------+
| | | Neighborhood Below | | |
| v v v v |
| +--+-------------+ +--+-------------+ +-----+----------+ +--+-------------+ |
| | +------>+ +------>+ +------>+ | |
| | Anchor | | | | | | | |
| +--+-------------+ +---+------------+ +------+---------+ +---+------------+ |
| | | | | |
+---------|-------------------------|---------------------------|---------------------|-------------------+
| | | |
| | | |
| | | |
| | | |
+---------|-------------------------|---------------------------|---------------------|-------------------+
| | | Neighborhood Below | | |
| v v v v |
| +--+-------------+ +---+------------+ +------+---------+ +---+------------+ |
| | | | | | | | | |
| | Neighbor 1 | | Neighbor 2 | | Neighbor 3 | | Neighbor 4 | |
| | +<------+ +<------+ +<------+ | |
| +----------------+ +----------------+ +----------------+ +----------------+ |
| |
| | Anchor | | | | | | | |
| +--------------+-+ +----------------+ +----------------+ +----------------+ |
| | ^ ^ ^ |
| | | | | |
| +-----------------+-----------------------+-------------------------+ |
+---------------------------------------------------------------------------------------------------------+
16 changes: 8 additions & 8 deletions docs/art/data-plane-seeding.bob
@@ -1,15 +1,15 @@
+--------------+
| |
+------------+ Leader +------------+
| | | |
| +--------------+ |
v v
+------------+----------------------------------------+------------+
| |
| +-----------------+ Neighborhood 0 +-----------------+ |
+------------+ Leader |
| | |
| +--------------+
|
+------------|-----------------------------------------------------+
| v |
| +--------+--------+ Neighborhood 0 +-----------------+ |
| | +--------------------->+ | |
| | Validator 1 | | Validator 2 | |
| | +<---------------------+ | |
| | Root | | | |
| +-----------------+ +-----------------+ |
| |
+------------------------------------------------------------------+
58 changes: 30 additions & 28 deletions docs/src/cluster/turbine-block-propagation.md
@@ -2,41 +2,49 @@
title: Turbine Block Propagation
---

A Solana cluster uses a multi-layer block propagation mechanism called _Turbine_ to broadcast transaction shreds to all nodes with minimal amount of duplicate messages. The cluster divides itself into small collections of nodes, called _neighborhoods_. Each node is responsible for sharing any data it receives with the other nodes in its neighborhood, as well as propagating the data on to a small set of nodes in other neighborhoods. This way each node only has to communicate with a small number of nodes.

During its slot, the leader node distributes shreds between the validator nodes in the first neighborhood \(layer 0\). Each validator shares its data within its neighborhood, but also retransmits the shreds to one node in some neighborhoods in the next layer \(layer 1\). The layer-1 nodes each share their data with their neighborhood peers, and retransmit to nodes in the next layer, etc, until all nodes in the cluster have received all the shreds.
A Solana cluster uses a multi-layer block propagation mechanism called _Turbine_ to broadcast transaction shreds to all nodes with minimal amount of duplicate messages. The cluster divides itself into small collections of nodes, called _neighborhoods_. Each node is responsible for propagating any data it receives on to a small set of nodes in downstream neighborhoods and possibly sharing data with the other nodes in its neighborhood. This way each node only has to communicate with a small number of nodes.

## Neighborhood Assignment - Weighted Selection

In order for data plane fanout to work, the entire cluster must agree on how the cluster is divided into neighborhoods. To achieve this, all the recognized validator nodes \(the TVU peers\) are sorted by stake and stored in a list. This list is then indexed in different ways to figure out neighborhood boundaries and retransmit peers. For example, the leader will simply select the first nodes to make up layer 0. These will automatically be the highest stake holders, allowing the heaviest votes to come back to the leader first. Layer 0 and lower-layer nodes use the same logic to find their neighbors and next layer peers.
In order for data plane fanout to work, the entire cluster must agree on how the cluster is divided into neighborhoods. To achieve this, all the recognized validator nodes \(the TVU peers\) are sorted by stake and stored in a list. This list is then indexed in different ways to figure out neighborhood boundaries and retransmit peers. For example, the leader will simply select the first `DATA_PLANE_FANOUT` nodes to make up layer 1. These will automatically be the highest stake holders, allowing the heaviest votes to come back to the leader first. Layer 1 and lower-layer nodes use the same logic to find their neighbors and next layer peers.

To reduce the possibility of attack vectors, each shred is transmitted over a random tree of neighborhoods. Each node uses the same set of nodes representing the cluster. A random tree is generated from the set for each shred using a seed derived from the leader id, slot and shred index.
To reduce the possibility of attack vectors, each shred is transmitted over a random tree of neighborhoods. Each node uses the same set of nodes representing the cluster. A random tree is generated from the set for each shred using a seed derived from the slot leader id, slot, shred index, and shred type.
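
As an illustration of the per-shred random tree described above, here is a minimal sketch of how every node could derive the same ordering independently. The hash function, byte layout, and the names `shred_seed` and `weighted_shuffle` are hypothetical choices for this example and are not taken from the Solana codebase; the stake-weighted draw is an assumption suggested by the section title.

```python
import hashlib
import random
from typing import Dict, List


def shred_seed(leader_id: bytes, slot: int, shred_index: int, shred_type: int) -> int:
    """Hash the four inputs named in the text into a deterministic integer seed."""
    h = hashlib.sha256()
    h.update(leader_id)
    h.update(slot.to_bytes(8, "little"))
    h.update(shred_index.to_bytes(4, "little"))
    h.update(shred_type.to_bytes(1, "little"))
    return int.from_bytes(h.digest(), "little")


def weighted_shuffle(stakes: Dict[str, int], seed: int) -> List[str]:
    """Draw nodes without replacement, weighted by stake (all stakes must be > 0)."""
    rng = random.Random(seed)
    remaining = sorted(stakes)  # fixed starting order so every node agrees
    order = []
    while remaining:
        weights = [stakes[name] for name in remaining]
        pick = rng.choices(remaining, weights=weights, k=1)[0]
        order.append(pick)
        remaining.remove(pick)
    return order


# Hypothetical stakes; any node running this with the same inputs gets the same order.
stakes = {"a": 50, "b": 30, "c": 15, "d": 5}
order = weighted_shuffle(stakes, shred_seed(b"leader", 42, 7, 0))
```

Because the seed changes with the shred index and shred type, consecutive shreds travel over different trees, while all nodes still agree on each tree without exchanging any extra messages.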

## Layer and Neighborhood Structure

The current leader makes its initial broadcasts to at most `DATA_PLANE_FANOUT` nodes. If this layer 0 is smaller than the number of nodes in the cluster, then the data plane fanout mechanism adds layers below. Subsequent layers follow these constraints to determine layer-capacity: Each neighborhood contains `DATA_PLANE_FANOUT` nodes. Layer 0 starts with 1 neighborhood with fanout nodes. The number of nodes in each additional layer grows by a factor of fanout.
The leader can be thought of as layer 0 and communicates with layer 1, which is made up of at most `DATA_PLANE_FANOUT` nodes. If this layer 1 is smaller than the number of nodes in the cluster, then the data plane fanout mechanism adds layers below. Subsequent layers follow these constraints to determine layer-capacity: Each neighborhood contains `DATA_PLANE_FANOUT` nodes. Layer 1 starts with 1 neighborhood. The number of nodes in each additional neighborhood/layer grows by a factor of `DATA_PLANE_FANOUT`.

As mentioned above, each node in a layer only has to broadcast its shreds to its neighbors and to exactly 1 node in some next-layer neighborhoods, instead of to every TVU peer in the cluster. A good way to think about this is, layer 0 starts with 1 neighborhood with fanout nodes, layer 1 adds fanout neighborhoods, each with fanout nodes and layer 2 will have `fanout * number of nodes in layer 1` and so on.
A good way to think about this is, layer 1 starts with 1 neighborhood with fanout nodes, layer 2 adds fanout neighborhoods, each with fanout nodes and layer 3 will have `fanout * number of nodes in layer 2` and so on.
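
The layer growth rule can be checked with a short sketch. The function name `layer_sizes` is hypothetical, and `num_nodes` is assumed to count only the nodes below the leader (layer 0 is excluded).

```python
from typing import List


def layer_sizes(num_nodes: int, fanout: int) -> List[int]:
    """Return the number of nodes placed in layer 1, layer 2, ... in order.

    Layer 1 holds up to `fanout` nodes and each later layer holds up to
    `fanout` times the capacity of the previous one. Layers fill to capacity
    before a new one is started, so only the last layer may be partial.
    """
    if fanout < 1:
        raise ValueError("fanout must be at least 1")
    sizes = []
    capacity = fanout
    remaining = num_nodes
    while remaining > 0:
        used = min(capacity, remaining)
        sizes.append(used)
        remaining -= used
        capacity *= fanout
    return sizes


# With a fanout of 2, six nodes fill layer 1 (2 nodes) and layer 2 (4 nodes).
print(layer_sizes(6, 2))
```

With a fanout of 2, a seventh node would open a third layer holding a single node.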

This way each node only has to communicate with a maximum of `2 * DATA_PLANE_FANOUT - 1` nodes.
The following diagram shows a three layer cluster with a fanout of 2.

The following diagram shows how the Leader sends shreds with a fanout of 2 to Neighborhood 0 in Layer 0 and how the nodes in Neighborhood 0 share their data with each other.
![Two layer cluster with a Fanout of 2](/img/data-plane.svg)

![Leader sends shreds to Neighborhood 0 in Layer 0](/img/data-plane-seeding.svg)
### Configuration Values

The following diagram shows how Neighborhood 0 fans out to Neighborhoods 1 and 2.
`DATA_PLANE_FANOUT` - Determines the size of layer 1. Subsequent layers grow by a factor of `DATA_PLANE_FANOUT`. The number of nodes in a neighborhood is equal to the fanout value. Neighborhoods will fill to capacity before new ones are added, i.e if a neighborhood isn't full, it _must_ be the last one.

![Neighborhood 0 Fanout to Neighborhood 1 and 2](/img/data-plane-fanout.svg)
Currently, configuration is set when the cluster is launched. In the future, these parameters may be hosted on-chain, allowing modification on the fly as the cluster sizes change.

Finally, the following diagram shows a two layer cluster with a fanout of 2.
## Shred Propagation Flow

![Two layer cluster with a Fanout of 2](/img/data-plane.svg)
During its slot, the leader node \(layer 0\) makes its initial broadcasts to a special root node sitting atop the turbine tree. This root node is rotated every shred. The root shares data within its neighborhood \(layer 1\). Nodes in this neighborhood then retransmit shreds to one node in some neighborhoods in the next layer \(layer 2\). In general, the layer-1 root/anchor node (first node in the neighborhood, rotated on every shred) shares their data with their neighborhood peers, and every node in layer-1 retransmits to nodes in the next layer, etc, until all nodes in the cluster have received all the shreds.

### Configuration Values
As mentioned above, each node in a layer only has to broadcast its shreds to exactly 1 node in some next-layer neighborhoods (and to its neighbors if it is the anchor node), instead of to every TVU peer in the cluster. In this way, each node only has to communicate with a maximum of `2 * DATA_PLANE_FANOUT - 1` nodes if it is the anchor node and `DATA_PLANE_FANOUT` if it is not the anchor node.
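
The peer counts above can be illustrated with a small sketch that maps a node's position in the shuffled list to the peers it sends to. The indexing scheme and the name `retransmit_peers` are assumptions made for this example; the list is assumed to exclude the leader and to start with the root.

```python
from typing import List, Sequence, Tuple


def retransmit_peers(
    fanout: int, index: int, nodes: Sequence[str]
) -> Tuple[List[str], List[str]]:
    """Return (neighbors to share with, next-layer children) for nodes[index]."""
    offset = index % fanout   # position inside the neighborhood
    anchor = index - offset   # index of the neighborhood's first node
    # Only the anchor shares data with the rest of its neighborhood.
    neighbors = list(nodes[anchor + 1 : anchor + fanout]) if offset == 0 else []
    # One child in each of up to `fanout` next-layer neighborhoods,
    # sitting at the same offset as this node.
    start = (anchor + 1) * fanout + offset
    children = list(nodes[start::fanout][:fanout])
    return neighbors, children


nodes = [f"n{i}" for i in range(14)]  # fanout 2: layers of 2, 4 and 8 nodes
print(retransmit_peers(2, 0, nodes))  # anchor of Neighborhood 0
print(retransmit_peers(2, 1, nodes))  # non-anchor node of Neighborhood 0
```

Under this scheme the anchor sends to `fanout - 1` neighbors plus `fanout` children, giving the `2 * DATA_PLANE_FANOUT - 1` bound, while every other node sends to at most `DATA_PLANE_FANOUT` children.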

`DATA_PLANE_FANOUT` - Determines the size of layer 0. Subsequent layers grow by a factor of `DATA_PLANE_FANOUT`. The number of nodes in a neighborhood is equal to the fanout value. Neighborhoods will fill to capacity before new ones are added, i.e if a neighborhood isn't full, it _must_ be the last one.
The following diagram shows how the leader sends shreds with a fanout of 2 to the root from Neighborhood 0 in Layer 1 and how the root from Neighborhood 0 shares its data with its neighbors.

Currently, configuration is set when the cluster is launched. In the future, these parameters may be hosted on-chain, allowing modification on the fly as the cluster sizes change.
![Leader sends shreds to Neighborhood 0 in Layer 1](/img/data-plane-seeding.svg)

The following diagram shows how Neighborhood 0 fans out to Neighborhoods 1 and 2.

![Neighborhood 0 Fanout to Neighborhood 1 and 2](/img/data-plane-fanout.svg)

### Neighborhood Interaction

The following diagram shows how two neighborhoods in different layers interact. To cripple a neighborhood, enough nodes \(erasure codes +1\) from the neighborhood above need to fail. Since each neighborhood receives shreds from multiple nodes in a neighborhood in the upper layer, we'd need a big network failure in the upper layers to end up with incomplete data.

![Inner workings of a neighborhood](/img/data-plane-neighborhood.svg)

## Calculating the required FEC rate

@@ -56,15 +64,15 @@ on repair to fixup the blocks.
The probability of the shred group failing can be computed using the
binomial distribution. If the FEC rate is `16:4`, then the group size
is 20, and at least 4 of the shreds must fail for the group to fail.
Which is equal to the sum of the probability of 4 or more trails failing
Which is equal to the sum of the probability of 4 or more trials failing
out of 20.

Probability of a block succeeding in turbine:

- Probability of packet failure: `P = 1 - (1 - network_packet_loss_rate)^2`
- FEC rate: `K:M`
- Number of trials: `N = K + M`
- Shred group failure rate: `S = SUM of i=0 -> M for binomial(prob_failure = P, trials = N, failures = i)`
- Shred group failure rate: `S = 1 - (SUM of i=0 -> M for binomial(prob_failure = P, trials = N, failures = i))`
- Shreds per block: `G`
- Block success rate: `B = (1 - S) ^ (G / N)`
- Binomial distribution for exactly `i` results with probability of P in N trials is defined as `(N choose i) * P^i * (1 - P)^(N-i)`
@@ -79,23 +87,17 @@ With a FEC rate: `16:4`

- `G = 8000`
- `P = 1 - 0.85 * 0.85 = 1 - 0.7225 = 0.2775`
- `S = SUM of i=0 -> 4 for binomial(prob_failure = 0.2775, trials = 20, failures = i) = 0.689414`
- `S = 1 - (SUM of i=0 -> 4 for binomial(prob_failure = 0.2775, trials = 20, failures = i)) = 0.689414`
- `B = (1 - 0.689) ^ (8000 / 20) = 10^-203`

With FEC rate of `16:16`

- `G = 12800`
- `S = SUM of i=0 -> 32 for binomial(prob_failure = 0.2775, trials = 64, failures = i) = 0.002132`
- `S = 1 - (SUM of i=0 -> 16 for binomial(prob_failure = 0.2775, trials = 32, failures = i)) = 0.002132`
- `B = (1 - 0.002132) ^ (12800 / 32) = 0.42583`

With FEC rate of `32:32`

- `G = 12800`
- `S = SUM of i=0 -> 32 for binomial(prob_failure = 0.2775, trials = 64, failures = i) = 0.000048`
- `S = 1 - (SUM of i=0 -> 32 for binomial(prob_failure = 0.2775, trials = 64, failures = i)) = 0.000048`
- `B = (1 - 0.000048) ^ (12800 / 64) = 0.99045`
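
The worked examples can be reproduced with a short script. This is a sketch that follows the corrected formula for `S` (a group fails when more than `M` of its `N` shreds are lost); the function names are hypothetical, and the 15% packet loss rate is taken from the `P = 1 - 0.85 * 0.85` line above.

```python
from math import comb


def packet_failure(loss_rate: float) -> float:
    """P: a shred is lost if either of its two network hops drops it."""
    return 1 - (1 - loss_rate) ** 2


def binomial(p: float, n: int, i: int) -> float:
    """Probability of exactly i failures in n trials with failure probability p."""
    return comb(n, i) * p**i * (1 - p) ** (n - i)


def group_failure(p: float, k: int, m: int) -> float:
    """S: probability that more than m of the k + m shreds in a group fail."""
    n = k + m
    return 1 - sum(binomial(p, n, i) for i in range(m + 1))


def block_success(p: float, k: int, m: int, shreds_per_block: int) -> float:
    """B: probability that every shred group in the block can be recovered."""
    n = k + m
    return (1 - group_failure(p, k, m)) ** (shreds_per_block / n)


p = packet_failure(0.15)
for k, m, g in [(16, 4, 8000), (16, 16, 12800), (32, 32, 12800)]:
    print(f"{k}:{m}  S={group_failure(p, k, m):.6f}  B={block_success(p, k, m, g):.5g}")
```

Running it for the three FEC rates makes the trade-off concrete: raising the share of coding shreds lowers the group failure rate `S` sharply, which is what moves the block success rate `B` from effectively zero toward 1.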

## Neighborhoods

The following diagram shows how two neighborhoods in different layers interact. To cripple a neighborhood, enough nodes \(erasure codes +1\) from the neighborhood above need to fail. Since each neighborhood receives shreds from multiple nodes in a neighborhood in the upper layer, we'd need a big network failure in the upper layers to end up with incomplete data.

![Inner workings of a neighborhood](/img/data-plane-neighborhood.svg)