From 64e718fa417ecea36c738f0e6850db8585e9973f Mon Sep 17 00:00:00 2001
From: Rich Loveland <rich@cockroachlabs.com>
Date: Wed, 14 Mar 2018 11:11:49 -0400
Subject: [PATCH] First crack at 1.1.{5,6} shutdown updates

---
 v1.1/stop-a-node.md | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/v1.1/stop-a-node.md b/v1.1/stop-a-node.md
index ac9737b95a3..a1746b2bd7f 100644
--- a/v1.1/stop-a-node.md
+++ b/v1.1/stop-a-node.md
@@ -14,6 +14,11 @@ For information about permanently removing nodes to downsize a cluster or react
 
 ### How It Works
 
+- Finishes in-flight requests. Note that this is a best effort that times out after the duration specified by the `???` cluster setting (1.1.5)
+- Transfers all *range leases* and Raft leadership to other nodes. (1.1.6)
+- Gossips its draining state to the cluster so that no leases are transferred to the draining node. Note that this is a best effort that times out after the duration specified by the `???` cluster setting, so other nodes may not receive the gossip info in time. (1.1.6)
+- No new ranges are transferred to the draining node, to avoid a possible loss of quorum after the node shuts down. (1.1.5)
+
 When you stop a node, CockroachDB lets the node finish in-flight requests and transfers all **range leases** off the node before shutting it down. If the node then stays offline for a certain amount of time (5 minutes by default), the cluster considers the node dead and starts to transfer its **range replicas** to other nodes as well.
 
 After that, if the node comes back online, its range replicas will determine whether or not they are still valid members of replica groups. If a range replica is still valid and any data in its range has changed, it will receive updates from another replica in the group. If a range replica is no longer valid, it will be removed from the node.
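
Reviewer note: the four shutdown behaviors added in this patch can be read as a sequence of steps, two of which are best effort with a timeout. The toy model below is only a sketch of that reading, not CockroachDB code. `ToyNode`, `best_effort`, `drain`, and both timeout parameters are hypothetical names, the step order simply follows the order of the bullets, and the real cluster setting names are still the `???` placeholders in the patch.

```python
import time
from dataclasses import dataclass, field


@dataclass
class ToyNode:
    """Hypothetical stand-in for a node; not a CockroachDB type."""
    in_flight: int = 0                      # requests still being served
    leases: list = field(default_factory=list)
    accepting_ranges: bool = True
    draining_gossiped: bool = False


def best_effort(step, timeout_s, poll_s=0.001):
    """Call step() until it returns True or timeout_s elapses.

    Returns True if the step finished, False if it timed out.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        if step():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_s)


def drain(node, peer, request_timeout_s, gossip_timeout_s, gossip_ok=lambda: True):
    """Run the four drain steps in bullet order; report which ones completed."""
    report = {}

    # 1. Finish in-flight requests (best effort, bounded by a timeout).
    def finish_one():
        if node.in_flight > 0:
            node.in_flight -= 1
        return node.in_flight == 0
    report["finish_in_flight"] = best_effort(finish_one, request_timeout_s)

    # 2. Transfer all range leases to another node.
    while node.leases:
        peer.leases.append(node.leases.pop())
    report["transfer_leases"] = True

    # 3. Gossip the draining state (best effort; peers may not hear in time).
    def gossip():
        if gossip_ok():
            node.draining_gossiped = True
        return node.draining_gossiped
    report["gossip_draining"] = best_effort(gossip, gossip_timeout_s)

    # 4. Refuse new ranges so quorum is not put at risk after shutdown.
    node.accepting_ranges = False
    report["refuse_new_ranges"] = True

    return report
```

Calling `drain(ToyNode(in_flight=3, leases=["r1", "r2"]), ToyNode(), 1.0, 1.0)` returns a dict with one boolean per step. A `False` under `gossip_draining` corresponds to the case the third bullet warns about, where other nodes may not receive the gossip info in time.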