We currently have integration tests for cluster-reset with restore-path. We should also have a validation test for just a baseline cluster-reset command. This test should have at least 3 server nodes, and join the other two server nodes back to the cluster after running the cluster-reset command and initializing the new single-node cluster. The full steps, when done manually, are:
Create an HA cluster, make sure it's up and ready
Deploy some workloads.
Stop two server nodes by using "sudo k3s-killall.sh"
Make sure the cluster is not accessible now
Shut down the k3s server on the one remaining node by running "sudo systemctl stop k3s"
Run "sudo k3s server --cluster-reset"
After the command completes, restart the k3s server process: "sudo systemctl start k3s"
Verify that the cluster is working again on this node. After a while, the other server nodes will show a NotReady state; this usually takes a few minutes.
Remove the db directories from other servers by running "sudo rm -rf /var/lib/rancher/k3s/server/db"
Restart the k3s server process on the other servers by running "sudo systemctl start k3s". It's best to do this one node at a time, otherwise an error will occur about "too many learner members in cluster". This is usually OK as it will reconcile itself, but doing one node at a time should avoid that altogether.
Run kubectl commands and deploy workloads on all nodes to validate everything is up and ready and all nodes are still part of the same cluster.
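The manual steps above could be scripted roughly as follows. This is only a sketch under several assumptions: the host names, the use of ssh, and the helper names `run_on` and `cluster_reset` are hypothetical, and each Kubernetes node name is assumed to match its ssh host name. Workload deployment and the accessibility checks are left out. By default the script only prints the commands it would run.

```shell
#!/usr/bin/env bash
# Sketch of the baseline cluster-reset flow described in the steps above.
# Assumptions (not from the issue): nodes are reachable over ssh, and each
# Kubernetes node name equals its ssh host name.
set -euo pipefail

# Commands are only printed unless DRY_RUN=0 is set explicitly.
DRY_RUN="${DRY_RUN:-1}"

# Run a command on a node over ssh, or print it in dry-run mode.
run_on() {
  local node="$1"; shift
  if [ "$DRY_RUN" = "1" ]; then
    echo "[$node] $*"
  else
    ssh "$node" "$@"
  fi
}

# Usage: cluster_reset SURVIVOR OTHER_SERVER...
cluster_reset() {
  local survivor="$1"; shift
  local node

  # Stop the other server nodes.
  for node in "$@"; do
    run_on "$node" sudo k3s-killall.sh
  done

  # Reset the cluster on the surviving node, then bring k3s back up.
  run_on "$survivor" sudo systemctl stop k3s
  run_on "$survivor" sudo k3s server --cluster-reset
  run_on "$survivor" sudo systemctl start k3s

  # Rejoin the other servers one at a time, waiting for each to become
  # Ready before touching the next one.
  for node in "$@"; do
    run_on "$node" sudo rm -rf /var/lib/rancher/k3s/server/db
    run_on "$node" sudo systemctl start k3s
    run_on "$survivor" sudo k3s kubectl wait --for=condition=Ready "node/$node" --timeout=300s
  done
}
```

Calling `cluster_reset server-1 server-2 server-3` in dry-run mode prints the planned commands in order, which makes it easy to review the sequence before running it against real nodes with `DRY_RUN=0`. Waiting for each node to report Ready is one way to rejoin servers one at a time; the 300-second timeout is an arbitrary choice.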
I'm going to close this since the test has been added, so there's nothing left to officially test here. See the linked issue above, which will be used to address the comments here.