Before removing a node from your cluster, you must first decommission the node. This lets a node finish in-flight requests, rejects any new requests, and transfers all range replicas and range leases off the node.
{{site.data.alerts.callout_danger}} If you remove nodes without first telling CockroachDB to decommission them, you may cause data or even cluster unavailability. For more details about how this works and what to consider before removing nodes, see Decommission Nodes. {{site.data.alerts.end}}
-
Use the
cockroach node status
command to get the internal IDs of nodes. For example, if you followed the steps in Deploy CockroachDB with Kubernetes to launch a secure client pod, get a shell into thecockroachdb-client-secure
pod:{% include copy-clipboard.html %}
$ kubectl exec -it cockroachdb-client-secure \ -- ./cockroach node status \ --certs-dir=/cockroach-certs \ --host=cockroachdb-public
id | address | build | started_at | updated_at | is_available | is_live +----+---------------------------------------------------------------------------------+--------+----------------------------------+----------------------------------+--------------+---------+ 1 | cockroachdb-0.cockroachdb.default.svc.cluster.local:26257 | {{page.release_info.version}} | 2018-11-29 16:04:36.486082+00:00 | 2018-11-29 18:24:24.587454+00:00 | true | true 2 | cockroachdb-2.cockroachdb.default.svc.cluster.local:26257 | {{page.release_info.version}} | 2018-11-29 16:55:03.880406+00:00 | 2018-11-29 18:24:23.469302+00:00 | true | true 3 | cockroachdb-1.cockroachdb.default.svc.cluster.local:26257 | {{page.release_info.version}} | 2018-11-29 16:04:41.383588+00:00 | 2018-11-29 18:24:25.030175+00:00 | true | true 4 | cockroachdb-3.cockroachdb.default.svc.cluster.local:26257 | {{page.release_info.version}} | 2018-11-29 17:31:19.990784+00:00 | 2018-11-29 18:24:26.041686+00:00 | true | true (4 rows)
The pod uses the
root
client certificate created earlier to initialize the cluster, so there's no CSR approval required. -
Use the
cockroach node decommission
command to decommission the node with the highest number in its address, specifying its ID (in this example,4
):{{site.data.alerts.callout_info}} It's important to decommission the node with the highest number in its address because, when you reduce the replica count, Kubernetes will remove the pod for that node. {{site.data.alerts.end}}
{% include copy-clipboard.html %}
$ kubectl exec -it cockroachdb-client-secure \ -- ./cockroach node decommission 4 \ --certs-dir=/cockroach-certs \ --host=cockroachdb-public
You'll then see the decommissioning status print to
stderr
as it changes:id | is_live | replicas | is_decommissioning | is_draining +---+---------+----------+--------------------+-------------+ 4 | true | 73 | true | false (1 row)
Once the node has been fully decommissioned and stopped, you'll see a confirmation:
id | is_live | replicas | is_decommissioning | is_draining +---+---------+----------+--------------------+-------------+ 4 | true | 0 | true | false (1 row) No more data reported on target nodes. Please verify cluster health before removing the nodes.
-
Once the node has been decommissioned, scale down your StatefulSet:
{% include copy-clipboard.html %}
$ kubectl scale statefulset cockroachdb --replicas=3
statefulset.apps/cockroachdb scaled
-
Verify that the pod was successfully removed:
{% include copy-clipboard.html %}
$ kubectl get pods
NAME READY STATUS RESTARTS AGE cockroachdb-0 1/1 Running 0 51m cockroachdb-1 1/1 Running 0 47m cockroachdb-2 1/1 Running 0 3m cockroachdb-client-secure 1/1 Running 0 15m ...
-
You should also remove the persistent volume that was mounted to the pod. Get the persistent volume claims for the volumes:
{% include copy-clipboard.html %}
$ kubectl get pvc
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE datadir-cockroachdb-0 Bound pvc-75dadd4c-01a1-11ea-b065-42010a8e00cb 100Gi RWO standard 17m datadir-cockroachdb-1 Bound pvc-75e143ca-01a1-11ea-b065-42010a8e00cb 100Gi RWO standard 17m datadir-cockroachdb-2 Bound pvc-75ef409a-01a1-11ea-b065-42010a8e00cb 100Gi RWO standard 17m datadir-cockroachdb-3 Bound pvc-75e561ba-01a1-11ea-b065-42010a8e00cb 100Gi RWO standard 17m
-
Verify that the PVC with the highest number in its name is no longer mounted to a pod:
{% include copy-clipboard.html %}
$ kubectl describe pvc datadir-cockroachdb-3
Name: datadir-cockroachdb-3 ... Mounted By: <none>
-
Remove the persistent volume by deleting the PVC:
{% include copy-clipboard.html %}
$ kubectl delete pvc datadir-cockroachdb-3
persistentvolumeclaim "datadir-cockroachdb-3" deleted