Skip to content

Files

Latest commit

 

History

History
127 lines (99 loc) · 5.73 KB

kubernetes-remove-nodes-manual.md

File metadata and controls

127 lines (99 loc) · 5.73 KB

Before removing a node from your cluster, you must first decommission the node. This lets a node finish in-flight requests, rejects any new requests, and transfers all range replicas and range leases off the node.

{{site.data.alerts.callout_danger}} If you remove nodes without first telling CockroachDB to decommission them, you may cause data or even cluster unavailability. For more details about how this works and what to consider before removing nodes, see Decommission Nodes. {{site.data.alerts.end}}

  1. Use the cockroach node status command to get the internal IDs of nodes. For example, if you followed the steps in Deploy CockroachDB with Kubernetes to launch a secure client pod, get a shell into the cockroachdb-client-secure pod:

    {% include copy-clipboard.html %}

    $ kubectl exec -it cockroachdb-client-secure \
    -- ./cockroach node status \
    --certs-dir=/cockroach-certs \
    --host=cockroachdb-public
      id |               address                                     | build  |            started_at            |            updated_at            | is_available | is_live
    +----+---------------------------------------------------------------------------------+--------+----------------------------------+----------------------------------+--------------+---------+
       1 | cockroachdb-0.cockroachdb.default.svc.cluster.local:26257 | {{page.release_info.version}} | 2018-11-29 16:04:36.486082+00:00 | 2018-11-29 18:24:24.587454+00:00 | true         | true
       2 | cockroachdb-2.cockroachdb.default.svc.cluster.local:26257 | {{page.release_info.version}} | 2018-11-29 16:55:03.880406+00:00 | 2018-11-29 18:24:23.469302+00:00 | true         | true
       3 | cockroachdb-1.cockroachdb.default.svc.cluster.local:26257 | {{page.release_info.version}} | 2018-11-29 16:04:41.383588+00:00 | 2018-11-29 18:24:25.030175+00:00 | true         | true
       4 | cockroachdb-3.cockroachdb.default.svc.cluster.local:26257 | {{page.release_info.version}} | 2018-11-29 17:31:19.990784+00:00 | 2018-11-29 18:24:26.041686+00:00 | true         | true
    (4 rows)
    

    The pod uses the root client certificate created earlier to initialize the cluster, so there's no CSR approval required.

  2. Use the cockroach node decommission command to decommission the node with the highest number in its address, specifying its ID (in this example, 4):

    {{site.data.alerts.callout_info}} It's important to decommission the node with the highest number in its address because, when you reduce the replica count, Kubernetes will remove the pod for that node. {{site.data.alerts.end}}

    {% include copy-clipboard.html %}

    $ kubectl exec -it cockroachdb-client-secure \
    -- ./cockroach node decommission 4 \
    --certs-dir=/cockroach-certs \
    --host=cockroachdb-public

    You'll then see the decommissioning status print to stderr as it changes:

     id | is_live | replicas | is_decommissioning | is_draining  
    +---+---------+----------+--------------------+-------------+
      4 |  true   |       73 |        true        |    false     
    (1 row)
    

    Once the node has been fully decommissioned and stopped, you'll see a confirmation:

     id | is_live | replicas | is_decommissioning | is_draining  
    +---+---------+----------+--------------------+-------------+
      4 |  true   |        0 |        true        |    false     
    (1 row)
    
    No more data reported on target nodes. Please verify cluster health before removing the nodes.
    
  3. Once the node has been decommissioned, scale down your StatefulSet:

    {% include copy-clipboard.html %}

    $ kubectl scale statefulset cockroachdb --replicas=3
    statefulset.apps/cockroachdb scaled
    
  4. Verify that the pod was successfully removed:

    {% include copy-clipboard.html %}

    $ kubectl get pods
    NAME                        READY     STATUS    RESTARTS   AGE
    cockroachdb-0               1/1       Running   0          51m
    cockroachdb-1               1/1       Running   0          47m
    cockroachdb-2               1/1       Running   0          3m
    cockroachdb-client-secure   1/1       Running   0          15m
    ...
    
  5. You should also remove the persistent volume that was mounted to the pod. Get the persistent volume claims for the volumes:

    {% include copy-clipboard.html %}

    $ kubectl get pvc
    NAME                    STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
    datadir-cockroachdb-0   Bound    pvc-75dadd4c-01a1-11ea-b065-42010a8e00cb   100Gi      RWO            standard       17m
    datadir-cockroachdb-1   Bound    pvc-75e143ca-01a1-11ea-b065-42010a8e00cb   100Gi      RWO            standard       17m
    datadir-cockroachdb-2   Bound    pvc-75ef409a-01a1-11ea-b065-42010a8e00cb   100Gi      RWO            standard       17m
    datadir-cockroachdb-3   Bound    pvc-75e561ba-01a1-11ea-b065-42010a8e00cb   100Gi      RWO            standard       17m
    
  6. Verify that the PVC with the highest number in its name is no longer mounted to a pod:

    {% include copy-clipboard.html %}

    $ kubectl describe pvc datadir-cockroachdb-3
    Name:          datadir-cockroachdb-3
    ...
    Mounted By:    <none>
    
  7. Remove the persistent volume by deleting the PVC:

    {% include copy-clipboard.html %}

    $ kubectl delete pvc datadir-cockroachdb-3
    persistentvolumeclaim "datadir-cockroachdb-3" deleted