Add embedded etcd support #1770
This looks good. I feel like this is a more dependable solution in the context of Kubernetes, because the interaction between etcd and Kubernetes is also tested upstream, unlike dqlite :-)
What's the migration path to convert a dqlite cluster to an etcd cluster?
@aarononeal there are not currently any tools to perform datastore migration. You're basically rebuilding the cluster and redeploying your manifests.
@aarononeal I don't think anyone was running a dqlite cluster for anything serious anyway; it didn't work so well and didn't give high availability, due to the implementation being experimental and having some shortcomings :)
Well, I was about to... I know, can't wait forever... and that along with the missing migration part... so just in time. Is there a (rough) timeline available for the embedded etcd support? If I can assist in testing, count me in.
Dqlite aside, is there a k3s migration path from single master to HA cluster and back, or does that kind of move also require rebuilding the cluster? I could imagine folks starting with a single master and wanting to move to HA later without such pains. It could have also made for a good migration path here, because then I could have dumped dqlite back to sqlite before moving forward to etcd again.
@aarononeal you can go from single master to multi-master and back again with any external datastore. Using the built-in sqlite datastore limits you to only a single master. The biggest limitation is just that there's no good way to migrate between datastores.
Since I have to move from embedded to external, I was considering using lxd and juju as a fast way to deploy etcd. Is that a bad idea given that lxd clustering relies on dqlite? I'm not up to speed on the problems. Eventually I would prefer to stick with the embedded k3s option to avoid the added lxd and operational dependencies.
dqlite comes from lxd, but it appears that they don't have the same issues with master transitions as kine. I haven't played with juju; I just run etcd as a systemd unit on each of my nodes alongside k3s.
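For anyone curious what that setup looks like, a minimal unit along these lines might work. This is only a sketch: the node names, IPs, and data dir are placeholders, and the TLS flags you'd want in production are omitted for brevity.

    [Unit]
    Description=etcd key-value store
    After=network-online.target

    [Service]
    # Illustrative three-node cluster; replace names and IPs with your own.
    ExecStart=/usr/local/bin/etcd \
      --name node1 \
      --data-dir /var/lib/etcd \
      --listen-client-urls http://10.0.0.1:2379 \
      --advertise-client-urls http://10.0.0.1:2379 \
      --listen-peer-urls http://10.0.0.1:2380 \
      --initial-advertise-peer-urls http://10.0.0.1:2380 \
      --initial-cluster node1=http://10.0.0.1:2380,node2=http://10.0.0.2:2380,node3=http://10.0.0.3:2380 \
      --initial-cluster-state new
    Restart=always

    [Install]
    WantedBy=multi-user.target

An external cluster like this is what you'd point k3s at via its --datastore-endpoint flag.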
Since #1760 is closed, I am including @brandond's concerns here:
This could be a major turnoff for people running SD cards or eMMC flash, since these are notorious for failing due to excess write operations. A parameter to run etcd in memory might be an option for devices with 4 GB of RAM?
Yeah, etcd is crazy demanding of low I/O latency, and issues about 14 fsyncs/second on my 3-node cluster with basically nothing going on. Definitely not as easy to run on low-end hardware as kine with sqlite. I ran etcd for a while on tmpfs while waiting for some more SSDs to come in. It worked OK but took a lot of memory, since it doesn't purge the WAL very aggressively.
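For the record, the tmpfs approach is just a mount over the data directory (the path here is a hypothetical default; note that etcd state on tmpfs is lost on reboot or power loss):

    # Back etcd's data dir with RAM; all etcd state is lost on power loss.
    mount -t tmpfs -o size=2g tmpfs /var/lib/etcd
    # Or via /etc/fstab so the mount (not the data) survives reboots:
    # tmpfs  /var/lib/etcd  tmpfs  size=2g  0  0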
Possibly relevant recent change to etcd:
This replaces dqlite with etcd. The same UX as dqlite is followed, so there is no change to the CLI args.
We are going to go ahead with merging this PR, but there is still a lot of testing to do: handling upgrades and improving the cluster bootstrap procedure. We are very concerned about the I/O demands of etcd and will continue to look at that. Right now dqlite is broken (due to our integration, not dqlite itself) and not really usable, so moving to etcd at least gets us to a functioning embedded option. sqlite or mysql is still a much better option for lower resource usage. This work is targeted to be included in k3s 1.19 (which should be released in the first half of August, according to the k8s 1.19 release schedule).
@@ -389,6 +389,8 @@ func (e *ETCD) cluster(ctx context.Context, forceNew bool, options executor.Init
 		ClientCertAuth: true,
 		TrustedCAFile:  e.config.Runtime.ETCDPeerCA,
 	},
+	ElectionTimeout: 5000,
Should this be adjusted to a customizable parameter?
I'll address the customization of etcd in a follow-up PR. We will probably need to take an approach where you can specify the etcd conf file and we merge it, similar to the containerd config. There are too many params to add to the CLI to address everything.
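For illustration, the kind of fragment that could be merged might look like this. The k3s merge mechanism is hypothetical (it's deferred to a follow-up PR), but heartbeat-interval and election-timeout are standard etcd --config-file keys, in milliseconds, and etcd's tuning docs suggest the election timeout be roughly 10x the heartbeat interval:

    # Illustrative etcd config-file fragment (values in ms).
    heartbeat-interval: 500
    election-timeout: 5000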
Just wondering... was rqlite ever considered?
@srdjan rqlite was considered, but we preferred dqlite at the time; the exact reasoning I don't remember. At this point it doesn't really matter, because the key takeaway is that it's infeasible for the core k3s team to maintain alternative Raft-based systems. We are switching to etcd primarily because it's well tested and known, not really because it's technically superior or even liked. 😀
Thank you for the timeline! Quick question (don't know if this is an obvious one or even a rhetorical one, but): for the time being, why can't we use an etcd docker/container - one that forms a dedicated etcd cluster - on each of the k3s master nodes, which would then kind of mimic/resemble "embedded" etcd support? In my reasoning, we wouldn't have to buy/administer/etc. dedicated separate additional hardware to run an etcd cluster.
@remkolems I think having a separate etcd cluster, even if running on the same nodes as the K8s master, would introduce considerable operational complexity, which goes against the goals of k3s. I think embedding etcd in k3s is a great idea, as long as etcd can be tuned to work on lower-powered devices that don't have the fastest storage.
This is going to be a kubeadm killer... yet you will also have to pay the price of a bigger binary...
We'll have to wait and see what the binary impact is, but I imagine it won't be a deal breaker in reality. Many container images are fairly sizeable in their own right. At a pinch, I imagine there might be a way to tweak the build scripts to exclude it. (Go build tags maybe?)
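To sketch the build-tag idea (purely hypothetical — no such tag or file exists in the k3s tree today), the embedded-etcd code path could live behind a constraint, with a stub compiled otherwise:

    // etcd_on.go — hypothetical file, built by default.
    // +build !no_embedded_etcd

    package datastore

    const embeddedEtcdSupported = true

    // etcd_off.go — hypothetical stub, selected with `go build -tags no_embedded_etcd`,
    // which would keep the etcd dependency (and its size) out of the binary.
    // +build no_embedded_etcd

    package datastore

    const embeddedEtcdSupported = false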
Why was dqlite dropped in k3s? microk8s is still "happily" using it and claiming HA.
dqlite and microk8s are both made and maintained by Canonical, so it makes sense that they want to dogfood it. Besides that, when k3s was using dqlite it was very unstable for me. What's wrong with using plain ol' reliable etcd?
@cawoodm dqlite support - or actually a simple wrapper of dqlite for the Kine sqlite backend - is pretty shoddy, to say the least. It does not handle most of the etcd operations: no user support, no proper MVCC support, and only essential sync-based CRUD operations. The database support is really bad for a real cluster. No wonder k3s switched away from dqlite to embedded etcd instead. Considering this, would you still use microk8s as your main cluster? Spoiler alert: I worked on a PR to switch Kine to a GORM backend, and I found out the egregious details. By the way, one can actually implement the etcd API fairly easily because it is gRPC-based; I have a simple Rust implementation of Kine that I hope will replace what we have today.
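To illustrate why the etcd surface is approachable, here is a toy sketch in Go (Go rather than the commenter's Rust, to match the rest of this thread, and using etcd 3.4 import paths). An in-memory map stands in for a real backend and most operations are stubbed — nothing like a complete Kine replacement, just the shape of the idea:

    // Toy shim serving a sliver of the etcd v3 gRPC KV API, in the spirit of kine.
    package main

    import (
        "context"
        "net"
        "sync"

        pb "go.etcd.io/etcd/etcdserver/etcdserverpb"
        "go.etcd.io/etcd/mvcc/mvccpb"
        "google.golang.org/grpc"
        "google.golang.org/grpc/codes"
        "google.golang.org/grpc/status"
    )

    // mapKV serves Range/Put from an in-memory map; everything else is stubbed.
    type mapKV struct {
        mu   sync.RWMutex
        data map[string][]byte
    }

    func (m *mapKV) Range(ctx context.Context, r *pb.RangeRequest) (*pb.RangeResponse, error) {
        m.mu.RLock()
        defer m.mu.RUnlock()
        resp := &pb.RangeResponse{Header: &pb.ResponseHeader{}}
        if v, ok := m.data[string(r.Key)]; ok {
            resp.Kvs = append(resp.Kvs, &mvccpb.KeyValue{Key: r.Key, Value: v})
            resp.Count = 1
        }
        return resp, nil
    }

    func (m *mapKV) Put(ctx context.Context, r *pb.PutRequest) (*pb.PutResponse, error) {
        m.mu.Lock()
        defer m.mu.Unlock()
        m.data[string(r.Key)] = r.Value
        return &pb.PutResponse{Header: &pb.ResponseHeader{}}, nil
    }

    func (m *mapKV) DeleteRange(ctx context.Context, r *pb.DeleteRangeRequest) (*pb.DeleteRangeResponse, error) {
        return nil, status.Error(codes.Unimplemented, "not implemented in this sketch")
    }

    func (m *mapKV) Txn(ctx context.Context, r *pb.TxnRequest) (*pb.TxnResponse, error) {
        return nil, status.Error(codes.Unimplemented, "not implemented in this sketch")
    }

    func (m *mapKV) Compact(ctx context.Context, r *pb.CompactionRequest) (*pb.CompactionResponse, error) {
        return nil, status.Error(codes.Unimplemented, "not implemented in this sketch")
    }

    func main() {
        lis, err := net.Listen("tcp", ":2379")
        if err != nil {
            panic(err)
        }
        s := grpc.NewServer()
        pb.RegisterKVServer(s, &mapKV{data: map[string][]byte{}})
        s.Serve(lis)
    }

The real work in a serious implementation is of course in what's stubbed here: revisions, watches, leases, and transactions.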
Adrian Goins mentioned in his HA video that dqlite was not reliable, but gave no details. Then he proceeds to create an HA k3s cluster, but it's a very complex process. We're trying to decide whether k3s is the way to go for HA on-prem Kubernetes. Seemingly better support by Canonical had us leaning towards microk8s, until we heard vague doubts about dqlite.
The process of using k3s with etcd is really not hard; in fact, etcd is the default now in k3s, so there's no special configuration needed unless you need to tweak etcd settings.
I too found
Start one node with --cluster-init. Start more nodes (either server or agent) with --server pointed at the first node. Not sure how that's complex?
I'm trying to set up a 2-node HA cluster with external postgres. Several days of trying have not been successful.
That doesn't have anything to do with etcd or dqlite.
Yeah, I'm not sure what your problem has to do with etcd, since you're not using etcd. I'm going to lock this conversation; anyone who is having problems (with etcd or otherwise) should open an issue instead of commenting on this PR.
This PR swaps dqlite for etcd for the embedded HA option; sqlite remains the default storage option. The UX for using etcd is exactly the same as dqlite: to enable etcd you must run one server with server --cluster-init and then join other servers with server -s URL -t token. This PR still needs a lot more testing and some bumpy edges rounded out.
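Concretely, under that UX a multi-server cluster might be brought up like this (a sketch; the hostname and token value are placeholders):

    # First server initializes the embedded etcd cluster:
    k3s server --cluster-init

    # Additional servers join it (-s/--server, -t/--token):
    k3s server -s https://server1:6443 -t <token>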