Replication layer docs: add load-based rebalancing
Fixes #2051.

Summary of changes:

- Add a paragraph to *Architecture > Replication Layer* describing that
  as of v2.1, in addition to the rebalancing that occurs when nodes are
  added or removed, we also rebalance leases and replicas based on load.
  Also add links to the relevant cluster settings and zone config docs
  for those who want more info.
rmloveland committed Oct 26, 2018
1 parent f30bfb4 commit 9a7b531
Showing 1 changed file with 4 additions and 6 deletions.
v2.1/architecture/replication-layer.md (10 changes: 4 additions & 6 deletions)
@@ -82,15 +82,13 @@ Your table's meta and system ranges (detailed in the distribution layer) are tre

Whenever there are changes to a cluster's number of nodes, the members of Raft groups change and, to ensure optimal survivability and performance, replicas need to be rebalanced. What that looks like varies depending on whether the membership change is nodes being added or going offline.

-**Nodes added**: The new node communicates information about itself to other nodes, indicating that it has space available. The cluster then rebalances some replicas onto the new node.
+- **Nodes added**: The new node communicates information about itself to other nodes, indicating that it has space available. The cluster then rebalances some replicas onto the new node.

-**Nodes going offline**: If a member of a Raft group ceases to respond, after 5 minutes, the cluster begins to rebalance by replicating the data the downed node held onto other nodes.
+- **Nodes going offline**: If a member of a Raft group ceases to respond, after 5 minutes, the cluster begins to rebalance by replicating the data the downed node held onto other nodes.

#### Rebalancing replicas

+Rebalancing is achieved by using a snapshot of a replica from the leaseholder, and then sending the data to another node over [gRPC](distribution-layer.html#grpc). After the transfer has been completed, the node with the new replica joins that range's Raft group; it then detects that its latest timestamp is behind the most recent entries in the Raft log and it replays all of the actions in the Raft log on itself.

-When CockroachDB detects a membership change, ultimately, replicas are moved between nodes.
-
-This is achieved by using a snapshot of a replica from the leaseholder, and then sending the data to another node over [gRPC](distribution-layer.html#grpc). After the transfer has been completed, the node with the new replica joins that range's Raft group; it then detects that its latest timestamp is behind the most recent entries in the Raft log and it replays all of the actions in the Raft log on itself.

+<span class="version-tag">New in v2.1:</span> In addition to the rebalancing that occurs when nodes join or leave a cluster, leases and replicas are rebalanced automatically based on the relative load across the nodes within a cluster. For more information, see the `kv.allocator.load_based_rebalancing` and `kv.allocator.qps_rebalance_threshold` [cluster settings](../cluster-settings.html). Note that depending on the needs of your deployment, you can exercise additional control over the location of leases and replicas by [configuring replication zones](../configure-replication-zones.html).

## Interactions with other layers

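The 5-minute wait in the "Nodes going offline" bullet above corresponds to the `server.time_until_store_dead` cluster setting, which defaults to 5 minutes. A minimal sketch of checking and adjusting it from a SQL shell, assuming a v2.1 cluster; the 10-minute value is illustrative, not a recommendation:

```sql
-- Show the current wait before a non-responsive store is considered
-- dead and its replicas are rebalanced onto other nodes (default: 5m0s).
SHOW CLUSTER SETTING server.time_until_store_dead;

-- Illustrative value: tolerate longer outages (e.g., slow restarts)
-- before re-replication begins.
SET CLUSTER SETTING server.time_until_store_dead = '10m';
```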
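Similarly, the settings named in the new load-based rebalancing paragraph can be inspected and changed from a SQL shell. A minimal sketch, assuming v2.1 behavior (a mode setting plus a fractional QPS threshold) and a hypothetical `users` table for the zone configuration line; the values shown are illustrative:

```sql
-- Show the current load-based rebalancing mode and QPS threshold.
SHOW CLUSTER SETTING kv.allocator.load_based_rebalancing;
SHOW CLUSTER SETTING kv.allocator.qps_rebalance_threshold;

-- Illustrative values: rebalance leases only, and require a larger
-- relative QPS imbalance across nodes before moving anything.
SET CLUSTER SETTING kv.allocator.load_based_rebalancing = 'leases';
SET CLUSTER SETTING kv.allocator.qps_rebalance_threshold = 0.5;

-- For finer control over replica placement, configure a replication
-- zone (hypothetical table name; v2.1 CONFIGURE ZONE syntax).
ALTER TABLE users CONFIGURE ZONE USING num_replicas = 5;
```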
