
Add validation test for cluster-reset with joining existing nodes back #6060

Closed
Tracked by #6070
rancher-max opened this issue Aug 30, 2022 · 2 comments

@rancher-max (Contributor) commented:

We currently have integration tests for cluster-reset with restore-path. We should also have a validation test for a baseline cluster-reset command. This test should use at least 3 server nodes and, after running the cluster-reset command and initializing the new single-node cluster, join the other two server nodes back to the cluster. The full steps, when done manually, are below; a scripted sketch follows the list.

  1. Create an HA cluster and make sure it's up and ready.
  2. Deploy some workloads.
  3. Stop two of the server nodes by running "sudo k3s-killall.sh" on each.
  4. Make sure the cluster is no longer accessible.
  5. Shut down the k3s server on the remaining node by running "sudo systemctl stop k3s".
  6. Run "sudo k3s server --cluster-reset".
  7. After the command completes, restart the k3s server process: "sudo systemctl start k3s".
  8. Check that the cluster is working again; after a few minutes the other nodes will show as NotReady.
  9. Remove the db directories from the other servers by running "sudo rm -rf /var/lib/rancher/k3s/server/db".
  10. Restart the k3s server process on the other servers by running "sudo systemctl start k3s". It's best to do this one node at a time; otherwise an error about "too many learner members in cluster" may occur. That error usually reconciles on its own, but doing one node at a time should avoid it altogether.
  11. Run kubectl commands and deploy workloads on all nodes to validate that everything is up and ready and all nodes are still part of the same cluster.
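A minimal shell sketch of steps 3–11, driven from a workstation with SSH access to the nodes and a kubeconfig pointing at the surviving server. The hostnames server-1/server-2/server-3 are hypothetical, server-1 is the node kept for the reset, and k3s is assumed to be installed as a systemd service:

```bash
#!/usr/bin/env bash
# Sketch of the manual cluster-reset procedure above, under the
# assumptions in the lead-in (3 servers, server-1 survives, SSH access,
# kubeconfig on this machine pointing at server-1).
set -euo pipefail

# Steps 3-4: stop k3s on the two other servers and confirm the cluster
# has lost quorum (kubectl should fail once two etcd members are down).
for node in server-2 server-3; do
  ssh "$node" 'sudo k3s-killall.sh'
done
! kubectl --request-timeout=10s get nodes || echo "cluster unexpectedly reachable"

# Steps 5-7: on the surviving node, stop the service, reset etcd to a
# single-member cluster, then start k3s again.
ssh server-1 'sudo systemctl stop k3s'
ssh server-1 'sudo k3s server --cluster-reset'
ssh server-1 'sudo systemctl start k3s'

# Step 8: wait for the surviving node to report Ready; the other
# server nodes will eventually show NotReady.
kubectl wait --for=condition=Ready node/server-1 --timeout=300s

# Steps 9-10: wipe the old etcd data on the other servers and rejoin
# them one at a time, avoiding "too many learner members in cluster".
for node in server-2 server-3; do
  ssh "$node" 'sudo rm -rf /var/lib/rancher/k3s/server/db'
  ssh "$node" 'sudo systemctl start k3s'
  kubectl wait --for=condition=Ready "node/$node" --timeout=300s
done

# Step 11: validate scheduling works across all nodes.
kubectl get nodes
kubectl create deployment reset-check --image=nginx --replicas=3
kubectl rollout status deployment/reset-check --timeout=120s
```

Rejoining one node at a time mirrors the note in step 10: each new member joins etcd as a learner and must be promoted before the next one joins.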
rancher-max added the kind/task and kind/test labels on Aug 30, 2022
cwayne18 added this to the v1.24.5+k3s1 milestone on Aug 30, 2022
brooksn self-assigned this on Sep 1, 2022
brooksn removed their assignment on Sep 20, 2022
ShylajaDevadiga self-assigned this on Sep 20, 2022
@VestigeJ commented:

We need to append the process for performing the cluster-reset from secondary server nodes (i.e., any node that wasn't the target node at cluster creation); see the sketch below.
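A hedged sketch of that variant, run on the secondary server itself (server-2, a hypothetical hostname). It assumes the node was joined with a server: URL in /etc/rancher/k3s/config.yaml; both the file location and the need to drop the join URL before restarting are assumptions, not something this issue confirms:

```bash
# Run on server-2 after k3s has been stopped on all other servers.
sudo systemctl stop k3s

# Assumption: drop the join URL (e.g. "server: https://server-1:6443")
# so the reset node starts standalone instead of trying to rejoin.
sudo sed -i '/^server:/d' /etc/rancher/k3s/config.yaml || true

# Reset etcd to a single-member cluster on this node, then restart.
sudo k3s server --cluster-reset
sudo systemctl start k3s

# The remaining servers (including the original init node) then wipe
# /var/lib/rancher/k3s/server/db and rejoin one at a time, pointing
# their server: URL at server-2, as in steps 9-10 above.
```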

@rancher-max (Contributor, Author) commented:

I'm going to close this, since the test has been added and there's nothing left to officially test here. See the linked issue above, which will be worked on to address the comments here.
