DQLite with a 2 node cluster #1407

Closed
samirsss opened this issue Feb 12, 2020 · 12 comments
Labels
kind/question No code change, just asking/answering a question
Milestone

Comments

@samirsss

Version:

v1.17.2+k3s1

Describe the bug
We were trying to set up a 2-node cluster with DQLite. It seems that if even 1 node goes down, the k3s kubectl commands stop working.

To Reproduce
Bring up a 2-node cluster and shut one of the nodes down. All k3s kubectl commands stop working. In our configuration both nodes are masters, and going forward all nodes will be configured the same way (all are master + worker nodes).

Expected behavior
The expectation is that even if 1 of the 2 nodes is down, the k3s cluster should keep working and all kubectl commands should succeed.

Actual behavior
k3s kubectl commands don't work.

Additional context

@brandond
Member

Dqlite uses the Raft consensus algorithm, under which a single node in a 2-node cluster does not have quorum. Consul also uses Raft and their docs explain this well, so I'll link them:
https://www.consul.io/docs/internals/consensus.html

Consensus is fault-tolerant up to the point where quorum is available. If a quorum of nodes is unavailable, it is impossible to process log entries or reason about peer membership. For example, suppose there are only 2 peers: A and B. The quorum size is also 2, meaning both nodes must agree to commit a log entry. If either A or B fails, it is now impossible to reach quorum. This means the cluster is unable to add or remove a node or to commit any additional log entries. This results in unavailability

You'll find similar guidance for any distributed multi-master system. You need to have an odd number of nodes, and a majority of them need to be online and participating in the cluster, in order to function.
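To put numbers on it: Raft quorum is floor(n/2) + 1, a strict majority. A minimal shell sketch of that arithmetic (not k3s-specific, purely illustrative):

for n in 1 2 3 4 5; do
  # quorum = floor(n/2) + 1; a cluster tolerates n - quorum failed nodes
  echo "nodes=$n quorum=$(( n / 2 + 1 )) tolerated_failures=$(( n - (n / 2 + 1) ))"
done

For n=2 this prints quorum=2 and tolerated_failures=0, which is exactly the situation described above: lose either node and the cluster cannot make progress.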

@samirsss
Copy link
Author

@brandond thanks for the response.

K3s HA documentation mentions that we can have an HA setup with 2 or more nodes. What would be your recommendation on how to get HA with 2 nodes (both being master + worker nodes)?

To provide some more context, the application that I'm working on can be a single-node deployment (which is easy and done) or a 2-node deployment (which needs HA to work).

We are open to any option, such as DQLite, etcd, or anything else that is not too heavy on resources.

Any help/recommendations here would really be appreciated.

@brandond
Member

The k3s docs say:

An HA K3s cluster is comprised of:
Two or more server nodes that will serve the Kubernetes API and run other control plane services
Zero or more agent nodes that are designated to run your apps and services
An external datastore (as opposed to the embedded SQLite datastore used in single-server setups)

Embedded dqlite for HA is still experimental, but given that it runs Raft, it will always need an odd number of nodes for quorum.

If you want to run exactly 2 k3s nodes, using an external database (with its own HA mechanism) is probably your best bet.
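As a rough sketch of that approach (the hostname, credentials, and database name below are placeholders, not anything from this thread), each of the two servers would be pointed at the external datastore with --datastore-endpoint:

# run on each of the two k3s servers; db.example.com is a hypothetical HA Postgres endpoint
curl -sfL https://get.k3s.io | sh -s - server \
  --datastore-endpoint="postgres://k3s:changeme@db.example.com:5432/kubernetes"

With the state stored externally, either k3s server can go down and the survivor keeps serving the API, since quorum becomes the external database's problem.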

@brandond
Member

I will also note that you could use a lightweight 3rd node without an agent to act as nothing but a 3rd voting member in the dqlite cluster. Just deploy k3s server without the agent, or add NoSchedule taints to the node. Just because your app only wants 2 nodes doesn't mean your k3s cluster can't have more.
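A hedged sketch of that layout, with hypothetical node names and a placeholder token (flags as they existed around the k3s version in this thread; double-check against your release):

# on the lightweight third node: join as a server, keep normal workloads off it
k3s server --server https://server-1.example.com:6443 --token "${K3S_TOKEN}" \
  --node-taint CriticalAddonsOnly=true:NoExecute

# alternatively, taint it after the fact from any machine with kubectl access
kubectl taint nodes server-3 node-role.kubernetes.io/master=true:NoSchedule

The third node then only needs to be large enough to run the control plane and vote in dqlite elections.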

@samirsss
Author

Thanks @brandond - I think we're heading to the same conclusion of using an external DB like Postgres, which has its own HA mechanism. Currently we deploy Postgres as a pod in our app, but it looks like we'll have to externalize it and set it up in HA mode.

@davidnuzik
Contributor

Hello. What @brandond mentioned in #1407 (comment) is correct: dqlite uses the Raft consensus algorithm.
The documentation states the need for an odd number of nodes: https://rancher.com/docs/k3s/latest/en/installation/ha-embedded/
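For reference, the embedded-HA bootstrap that page describes follows this general shape (hostnames and the token are placeholders, and exact flags may vary between k3s releases):

# first server initializes the embedded datastore cluster
k3s server --cluster-init

# each additional server (ideally bringing the total to an odd number) joins it
k3s server --server https://server-1.example.com:6443 --token "${K3S_TOKEN}"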

May we close this issue or are there any remaining questions? Thanks!

@davidnuzik added the [zube]: To Triage and kind/question (No code change, just asking/answering a question) labels on Feb 18, 2020
@davidnuzik added this to the Backlog milestone on Feb 18, 2020
@samirsss
Author

@brandond and @davidnuzik feel free to close this issue, since my question has been answered. Thanks again for the quick responses.

@Kampe

Kampe commented Mar 20, 2020

So, an interesting question then: can a three-node cluster recover after being degraded to a two-node cluster for some period of time? Or would there be issues with quorum?

@brandond
Member

My understanding is that a 3-node cluster can function with only 2 nodes. It might be worth reading through the dqlite docs to better understand failover behavior. K3s doesn't expose any of the dqlite logs or metrics either, which doesn't help.

@yajo

yajo commented Mar 26, 2020

If a 3-master cluster can't lose a master, how can that be considered HA? 🤔 It would be worse than having just 1 master, because now you have 3x the chance of your cluster going down...

@Kampe

Kampe commented Mar 30, 2020

I'm of the same opinion, and am seeing this exact behavior wreak havoc on my cluster when attempting to use an HA 3-node multi-master setup, especially if you tear one down.

@brontide

I started with a functional 4-node, 3-master cluster. I rebuilt the nodes one at a time, being careful to cordon and drain them as I went, and boom when I took down the initial node.

root@red-2:~# k3s -version
k3s version v1.17.4+k3s1 (3eee8ac3)

It still has the multi-master issue. As soon as the node that started the cluster goes down, you end up with an error saying that a leader cannot be found.

Error from server: rpc error: code = Unknown desc = failed to create dqlite connection: no available dqlite leader server found

It appears as though a mesh is not created when attaching new master nodes; master+n is always attempting to connect to the original master, even if it goes away. So you get all the risks of single-master with more CPU usage. I'm really struggling to see the benefit. Possibly it's just a documentation issue, since you should not use the IP address of master1 when setting up new nodes, but a load-balanced address (see the sketch at the end of this comment).

root@red-2:~# systemctl cat k3s
# /etc/systemd/system/k3s.service
[Unit]
Description=Lightweight Kubernetes
Documentation=https://k3s.io
Wants=network-online.target

[Install]
WantedBy=multi-user.target

[Service]
Type=notify
EnvironmentFile=/etc/systemd/system/k3s.service.env
KillMode=process
Delegate=yes
LimitNOFILE=infinity
LimitNPROC=infinity
LimitCORE=infinity
TasksMax=infinity
TimeoutStartSec=0
Restart=always
RestartSec=5s
ExecStartPre=-/sbin/modprobe br_netfilter
ExecStartPre=-/sbin/modprobe overlay
ExecStart=/usr/local/bin/k3s \
    server \
        '--server' \
        'https://yellow-1.lan:6443' \

Trying to decide between single master or etcd at this point.
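If the registration-address theory above holds, a hedged variant of the unit file would register against a stable, load-balanced name instead of yellow-1 itself (k3s-api.lan is a hypothetical DNS name or VIP fronting all of the servers; whether this actually changes dqlite leader discovery is not verified in this thread):

# hypothetical: register every server against a load-balanced endpoint
k3s server --server https://k3s-api.lan:6443 --token "${K3S_TOKEN}"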
