DQLite with a 2 node cluster #1407

Closed
samirsss opened this issue Feb 12, 2020 · 12 comments
Labels
kind/question No code change, just asking/answering a question
Milestone

Comments

@samirsss

Version:

v1.17.2+k3s1

Describe the bug
We were trying to set up a 2-node cluster with DQLite. It seems that if even 1 node goes down, the k3s kubectl commands stop working.

To Reproduce
Bring up a 2-node cluster and shut one of the nodes down. All k3s kubectl commands stop working. In our configuration both nodes are masters, and going forward all nodes will be configured the same way (all are master + worker nodes).

Expected behavior
The expectation is that even if 1 of the 2 nodes is down, the k3s cluster should keep working and all kubectl commands should succeed.

Actual behavior
k3s kubectl commands don't work.

Additional context

@brandond
Member

Dqlite uses the Raft consensus algorithm, under which a single node in a 2-node cluster does not have quorum. Consul also uses Raft and their docs explain this well, so I'll link them:
https://www.consul.io/docs/internals/consensus.html

Consensus is fault-tolerant up to the point where quorum is available. If a quorum of nodes is unavailable, it is impossible to process log entries or reason about peer membership. For example, suppose there are only 2 peers: A and B. The quorum size is also 2, meaning both nodes must agree to commit a log entry. If either A or B fails, it is now impossible to reach quorum. This means the cluster is unable to add or remove a node or to commit any additional log entries. This results in unavailability

You'll find similar guidance for any distributed multi-master system. You need to have an odd number of nodes, and a majority of them need to be online and participating in the cluster, in order to function.
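To put numbers on it: Raft quorum is floor(n/2) + 1, a strict majority. A minimal shell sketch of that arithmetic (not k3s-specific, purely illustrative):

for n in 1 2 3 4 5; do
  # quorum = floor(n/2) + 1; a cluster tolerates n - quorum failed nodes
  echo "nodes=$n quorum=$(( n / 2 + 1 )) tolerated_failures=$(( n - (n / 2 + 1) ))"
done

For n=2 this prints quorum=2 and tolerated_failures=0, which is exactly the situation described above: lose either node and the cluster cannot make progress.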

@samirsss
Copy link
Author

@brandond thanks for the response.

K3s HA documentation mentions that we can have an HA setup with 2 or more nodes. What would be your recommendation on how to get HA with 2 nodes (both being master + worker nodes)?

To provide some more context, the application that I'm working on can be a single-node deployment (which is easy and done) or a 2-node deployment (which needs HA to work).

We are open to any option, such as DQLite, etcd, or anything else that is not too heavy on resources.

Any help/recommendations here would really be appreciated.

@brandond
Member

The k3s docs say:

An HA K3s cluster is comprised of:
Two or more server nodes that will serve the Kubernetes API and run other control plane services
Zero or more agent nodes that are designated to run your apps and services
An external datastore (as opposed to the embedded SQLite datastore used in single-server setups)

Embedded dqlite for HA is still experimental, but given that it runs Raft, it will always need an odd number of nodes for quorum.

If you want to run exactly 2 k3s nodes, using an external database (with its own HA mechanism) is probably your best bet.
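As a rough sketch of that approach (the hostname, credentials, and database name below are placeholders, not anything from this thread), each of the two servers would be pointed at the external datastore with --datastore-endpoint:

# run on each of the two k3s servers; db.example.com is a hypothetical HA Postgres endpoint
curl -sfL https://get.k3s.io | sh -s - server \
  --datastore-endpoint="postgres://k3s:changeme@db.example.com:5432/kubernetes"

With the state stored externally, either k3s server can go down and the survivor keeps serving the API, since quorum becomes the external database's problem.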

@brandond
Member

I will also note that you could use a lightweight 3rd node without an agent to act as nothing but a 3rd voting member in the dqlite cluster. Just deploy k3s server without the agent, or add NoSchedule taints to the node. Just because your app only wants 2 nodes doesn't mean your k3s cluster can't have more.
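A hedged sketch of that layout, with hypothetical node names and a placeholder token (flags as they existed around the k3s version in this thread; double-check against your release):

# on the lightweight third node: join as a server, keep normal workloads off it
k3s server --server https://server-1.example.com:6443 --token "${K3S_TOKEN}" \
  --node-taint CriticalAddonsOnly=true:NoExecute

# alternatively, taint it after the fact from any machine with kubectl access
kubectl taint nodes server-3 node-role.kubernetes.io/master=true:NoSchedule

The third node then only needs to be large enough to run the control plane and vote in dqlite elections.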

@samirsss
Author

Thanks @brandond - I think we're heading to the same conclusion of using an external DB like Postgres, which has its own HA mechanism. Currently we deploy Postgres as a pod in our app, but it looks like we'll have to externalize it and set it up in HA mode.

@davidnuzik
Contributor

Hello. What @brandond mentioned in #1407 (comment) is correct: dqlite uses the Raft consensus algorithm.
The documentation states the need for an odd number of nodes: https://rancher.com/docs/k3s/latest/en/installation/ha-embedded/
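For reference, the embedded-HA bootstrap that page describes follows this general shape (hostnames and the token are placeholders, and exact flags may vary between k3s releases):

# first server initializes the embedded datastore cluster
k3s server --cluster-init

# each additional server (ideally bringing the total to an odd number) joins it
k3s server --server https://server-1.example.com:6443 --token "${K3S_TOKEN}"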

May we close this issue or are there any remaining questions? Thanks!

@davidnuzik added the [zube]: To Triage and kind/question (No code change, just asking/answering a question) labels on Feb 18, 2020
@davidnuzik added this to the Backlog milestone on Feb 18, 2020
@samirsss
Author

@brandond and @davidnuzik feel free to close this issue, since my question has been answered. Thanks again for the quick responses.

@Kampe

Kampe commented Mar 20, 2020

So, an interesting question then: can a three-node cluster recover after being degraded to a two-node cluster for some period of time? Or would there be issues with quorum?

@brandond
Member

My understanding is that a 3-node cluster can function with only 2 nodes. It might be worth reading through the dqlite docs to better understand failover behavior. K3s doesn't expose any of the dqlite logs or metrics either, which doesn't help.

@yajo

yajo commented Mar 26, 2020

If a 3-master cluster can't lose a master, how can that be considered HA? 🤔 It would be worse than having just 1 master, because now you have 3x the chance of your cluster going down...

@Kampe

Kampe commented Mar 30, 2020

I'm of the same opinion, and am seeing this exact behavior wreak havoc on my cluster when attempting to use an HA 3-node multi-master setup, especially if you tear one down.

@brontide

I started with a functional 4-node, 3-master cluster. I rebuilt the nodes one at a time, being careful to cordon and drain them as I went, and boom when I took down the initial node.

root@red-2:~# k3s -version
k3s version v1.17.4+k3s1 (3eee8ac3)

It still has the multi-master issue. As soon as the node that started the cluster goes down, you end up with an error saying that a leader cannot be found.

Error from server: rpc error: code = Unknown desc = failed to create dqlite connection: no available dqlite leader server found

It appears as though a mesh is not created when attaching new master nodes; master+n is always attempting to connect to the original master, even if it goes away. So you get all the risks of single-master with more CPU usage. I'm really struggling to see the benefit. Possibly it's just a documentation issue, since you should not use the IP address of master1 when setting up new nodes, but a load-balanced address (see the sketch at the end of this comment).

root@red-2:~# systemctl cat k3s
# /etc/systemd/system/k3s.service
[Unit]
Description=Lightweight Kubernetes
Documentation=https://k3s.io
Wants=network-online.target

[Install]
WantedBy=multi-user.target

[Service]
Type=notify
EnvironmentFile=/etc/systemd/system/k3s.service.env
KillMode=process
Delegate=yes
LimitNOFILE=infinity
LimitNPROC=infinity
LimitCORE=infinity
TasksMax=infinity
TimeoutStartSec=0
Restart=always
RestartSec=5s
ExecStartPre=-/sbin/modprobe br_netfilter
ExecStartPre=-/sbin/modprobe overlay
ExecStart=/usr/local/bin/k3s \
    server \
        '--server' \
        'https://yellow-1.lan:6443' \

Trying to decide between single master or etcd at this point.
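If the registration-address theory above holds, a hedged variant of the unit file would register against a stable, load-balanced name instead of yellow-1 itself (k3s-api.lan is a hypothetical DNS name or VIP fronting all of the servers; whether this actually changes dqlite leader discovery is not verified in this thread):

# hypothetical: register every server against a load-balanced endpoint
k3s server --server https://k3s-api.lan:6443 --token "${K3S_TOKEN}"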
