Use memory storage for etcd #845

aojea · 2019-09-07T09:19:56Z

What would you like to be added:

Configure etcd storage in memory to improve the performance

Why is this needed:

etcd causes very high disk I/O, and this can cause performance issues, especially if there are several kind clusters running on the same system, because you end up with a lot of processes writing to disk, causing latency and affecting the other applications using the same disk.

Since #779, the /var filesystem no longer runs on the container filesystem, which improved performance. However, the etcd storage is still on disk, as we can see in the pod manifest:

  etcd-data:
    Type:          HostPath (bare host directory volume)
    Path:          /var/lib/etcd
    HostPathType:  DirectoryOrCreate

Ideally, we should have /var/lib/etcd/ in memory, since the clusters are meant to be created and destroyed and the information doesn't need to be persistent.

I have doubts about the best approach:

Should this be modified in kind by creating a new tmpfs volume for etcd?
Can this be modified in kubeadm so we can mount the etcd-data in memory or in another location of the node that's in memory?
...

** NOTES **

Accumulated etcd I/O as reported by iotop -a:

26206 be/4 root          0.00 B    192.00 K  0.00 %  1.04 % etcd --advertise-client-urls=htt~e=/etc/kubernetes/pki/etcd/ca.crt
26196 be/4 root          0.00 B    224.00 K  0.00 %  0.98 % etcd --advertise-client-urls=htt~e=/etc/kubernetes/pki/etcd/ca.crt
26288 be/4 root          0.00 B    216.00 K  0.00 %  0.94 % etcd --advertise-client-urls=htt~e=/etc/kubernetes/pki/etcd/ca.crt
26249 be/4 root          0.00 B    180.00 K  0.00 %  0.88 % etcd --advertise-client-urls=htt~e=/etc/kubernetes/pki/etcd/ca.crt
26266 be/4 root          0.00 B     52.00 K  0.00 %  0.47 % etcd --advertise-client-urls=htt~e=/etc/kubernetes/pki/etcd/ca.crt
26187 be/4 root          0.00 B     52.00 K  0.00 %  0.42 % etcd --advertise-client-urls=htt~e=/etc/kubernetes/pki/etcd/ca.crt
26267 be/4 root          0.00 B     48.00 K  0.00 %  0.37 % etcd --advertise-client-urls=htt~e=/etc/kubernetes/pki/etcd/ca.crt
26192 be/4 root          0.00 B     60.00 K  0.00 %  0.36 % etcd --advertise-client-urls=htt~e=/etc/kubernetes/pki/etcd/ca.crt
26263 be/4 root          0.00 B     52.00 K  0.00 %  0.31 % etcd --advertise-client-urls=htt~e=/etc/kubernetes/pki/etcd/ca.crt
26261 be/4 root          0.00 B     64.00 K  0.00 %  0.28 % etcd --advertise-client-urls=htt~e=/etc/kubernetes/pki/etcd/ca.crt
19155 be/4 root          0.00 B      0.00 B  0.00 %  0.19 % [kworker/1:2]
26286 be/4 root          0.00 B     28.00 K  0.00 %  0.18 % etcd --advertise-client-urls=htt~e=/etc/kubernetes/pki/etcd/ca.crt
26289 be/4 root          0.00 B     32.00 K  0.00 %  0.16 % etcd --advertise-client-urls=htt~e=/etc/kubernetes/pki/etcd/ca.crt
  578 be/4 root          0.00 B      2.00 M  0.00 %  0.16 % [btrfs-transacti]
26268 be/4 root          0.00 B     28.00 K  0.00 %  0.11 % etcd --advertise-client-urls=htt~e=/etc/kubernetes/pki/etcd/ca.crt

aojea · 2019-09-07T09:20:19Z

/cc @BenTheElder @neolit123

neolit123 · 2019-09-07T11:39:56Z

Can this be modified in kubeadm so we can mount the etcd-data in memory or in another location of the node that's in memory?

kubeadm passes --data-dir=/var/lib/etcd to etcd and mounts this directory using hostPath.
we can just try:

        emptyDir:
          medium: Memory

but this means kubeadm init / join commands need to:

use phases to skip / customize the "manifests" phase
or
deploy etcd, patch manifest, restart static pod

etcd causes very high disk I/O, and this can cause performance issues, especially if there are several kind clusters running on the same system, because you end up with a lot of processes writing to disk, causing latency and affecting the other applications using the same disk.

k/k master just moved to 3.3.15, while 1.15 uses an older version.
is this a regression? an IDLE cluster should not have high disk i/o.

if this disk i/o suddenly became a problem this should be in a k/k issue.

BenTheElder · 2019-09-07T16:40:35Z

Etcd is going to be writing all the constantly updated objects, no? (Eg node status)

It would be trivial to test kind with memory backed etcd by adjusting node creation, but I don't think you'd ever run a real cluster not on disk... 🤔

aojea · 2019-09-07T16:48:42Z

Etcd is going to be writing all the constantly updated objects, no? (Eg node status)

yeah, data needs to persist to disk to provide consistency

It would be trivial to test kind with memory backed etcd by adjusting node creation, but I don't think you'd ever run a real cluster not on disk... 🤔

Absolutely, real clusters must use disks. This is only meant to be used for testing. My rationale is that these k8s clusters are ephemeral, so the etcd clusters don't need to "persist" data on disk.

Can this be patched with the kind config? It would be enough to pass a different folder than --data-dir=/var/lib/etcd

BenTheElder · 2019-09-07T16:55:32Z

You can test this more or less with no changes by making a tmpfs on the host and configuring it to mount there on a control plane.

You could also edit the kind control plane creation process to put a tmpfs here on the node

We should experiment, but I think we do eventually want durable etcd for certain classes of testing..

BenTheElder · 2019-09-07T22:08:07Z

Also worth pointing out:

our CI is backed by SSD
I'm not aware of any other cluster implementation not backing etcd with disk, including eg hack/local-up-cluster

aojea · 2019-09-07T22:19:32Z

yeah, for k8s CI it is not a big problem, but for users that run kind locally, it is. It took me a while to understand what was slowing down my system, until I found that my kind clusters were causing high latency on one of my disks.
I just want to test and document the differences :)

aojea · 2019-09-07T23:13:36Z

ok, here is how to run etcd using memory storage for reference

Create the memory storage

sudo mkdir /tmp/etcd
sudo mount -t tmpfs tmpfs /tmp/etcd

Mount it on the control-plane nodes

kind: Cluster
apiVersion: kind.sigs.k8s.io/v1alpha3
nodes:
- role: control-plane
  extraMounts:
  - containerPath: /var/lib/etcd
    hostPath: /tmp/etcd
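
The two steps above could be scripted roughly as follows. This is a minimal sketch and not part of kind: the function names, the 512m size cap, and the output file name are assumptions chosen for illustration, and the config uses the v1alpha4 apiVersion that appears later in this thread. A tmpfs without an explicit size may grow to half of physical RAM by default, so the sketch passes size= to bound how much memory etcd can take.

```shell
#!/bin/sh
# Sketch: back etcd with a host tmpfs via kind extraMounts.
# ETCD_TMPFS, TMPFS_SIZE and the function names are illustrative assumptions.
ETCD_TMPFS="${ETCD_TMPFS:-/tmp/etcd}"
TMPFS_SIZE="${TMPFS_SIZE:-512m}"

# Create and mount the tmpfs on the host (requires root, Linux only).
mount_etcd_tmpfs() {
  sudo mkdir -p "$ETCD_TMPFS"
  sudo mount -t tmpfs -o "size=$TMPFS_SIZE" tmpfs "$ETCD_TMPFS"
}

# Write a kind config that mounts the host tmpfs at etcd's data dir.
write_kind_config() {
  cat > "$1" <<EOF
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
- role: control-plane
  extraMounts:
  - containerPath: /var/lib/etcd
    hostPath: $ETCD_TMPFS
EOF
}

# Typical use (commented out because it needs root and a container runtime):
#   mount_etcd_tmpfs
#   write_kind_config kind-etcd-tmpfs.yaml
#   kind create cluster --config kind-etcd-tmpfs.yaml
#   ... run tests ...
#   kind delete cluster && sudo umount "$ETCD_TMPFS"
```

Unmounting the tmpfs after deleting the cluster releases the memory and discards the etcd data.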

aojea · 2019-10-04T15:27:29Z

/reopen per conversation in slack https://kubernetes.slack.com/archives/CEKK1KTN2/p1570202642295000?thread_ts=1570196798.288800&cid=CEKK1KTN2

I'd like to find a way to make this easier to configure, mainly for people that want to use kind on their laptops and not in CI. etcd writing constantly to disk is not adding any benefit in this particular scenario.

aojea · 2019-10-07T13:34:57Z

You could also edit the kind control plane creation process to put a tmpfs here on the node

I think this will work

We should experiment, but I think we do eventually want durable etcd for certain classes of testing..

I was thinking more about this, and can't see the "durability" difference between using a folder inside the container or using a tmpfs volume for the etcd data dir, the data will be available as long as the container is alive, no?

However, etcd writing to a tmpfs volume will be a big performance improvement, at a cost of less memory available, of course

home/aojeagarcia/docker/volumes/5d2d2cab7dcb7c93b9a8a5f8591462caf4fbca5c332e663aa4628702b3d2dc50/_data/lib/etcd/member # du -sh *
1.5M    snap
245M    wal
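
To gauge the memory cost mentioned above, one could compare the size of the etcd data directory with the memory currently available on the host. The following is a rough sketch, assuming a Linux host with /proc/meminfo; the helper names and the safety factor of 2 are hypothetical and not derived from any measurement in this thread.

```shell
#!/bin/sh
# Sketch: estimate whether an etcd data dir would fit in RAM.
# Helper names are hypothetical; avail_mib assumes Linux (/proc/meminfo).

# Size of a directory tree in MiB, as reported by du.
dir_mib() {
  du -sm "$1" | cut -f1
}

# Memory the kernel estimates is available for new workloads, in MiB.
avail_mib() {
  awk '/^MemAvailable:/ { print int($2 / 1024) }' /proc/meminfo
}

# Succeeds when twice the directory size still fits in available memory.
# The factor of 2 is an arbitrary safety margin, not a measured value.
fits_in_ram() {
  need=$(( $(dir_mib "$1") * 2 ))
  [ "$need" -le "$(avail_mib)" ]
}

# Example:
#   fits_in_ram /var/lib/etcd && echo "tmpfs looks affordable"
```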

neolit123 · 2019-10-07T14:11:36Z

However, etcd writing to a tmpfs volume will be a big performance improvement, at a cost of less memory available, of course

i'd be interested if this will prevent me from testing 3 CP setups with kind on my setup.
it doesn't have RAM for 4 CPs :)

BenTheElder · 2019-10-07T14:25:31Z

I was thinking more about this, and can't see the "durability" difference between using a folder inside the container or using a tmpfs volume for the etcd data dir, the data will be available as long as the container is alive, no?

It's NOT a folder inside the container, it's on a volume.

When we fix kind to survive host reboots (and we will) then this will break it again.

It also will consume more RAM of course.

aojea · 2019-10-07T14:47:27Z

It's NOT a folder inside the container, it's on a volume.

https://github.com/kubernetes-sigs/kind/blob/master/pkg/internal/cluster/providers/docker/provision.go#L164-L169

I see it now 🤦‍♂️

aojea · 2019-10-15T10:44:00Z

can this be causing timeouts in the CI with slow disks?

BenTheElder · 2019-10-15T14:51:34Z

#928 (comment)

^^ possibly for istio, doesn't look like Kubernetes CI is seeing timeouts at this point. That's not the pattern with the broken pipe.

Even for istio, I doubt it's "because they aren't doing this" but it could be "because they are otherwise using too much IOPs for the allocated disks" IIRC they are also on GCP PD-SSD which is quite fast.

BenTheElder · 2019-11-06T00:41:34Z

for CI I think the better pattern I want to try is to use a pool of PDs from some storage class to replace the emptyDir.

I've been mulling how we could do this and persist some of the images in a clean and sane way, but imo this is well out of scope for the kind project.

aojea · 2019-11-06T02:22:36Z

for CI I think the better pattern I want to try is to use a pool of PDs from some storage class to replace the emptyDir.

I've been mulling how we could do this and persist some of the images in a clean and sane way, but imo this is well out of scope for the kind project.

I think that this is only an issue for people using kind on their laptops or workstations, totally agree with you on the CI use case

fejta-bot · 2020-02-04T02:34:09Z

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

fejta-bot · 2020-03-05T03:16:57Z

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten

BenTheElder · 2020-03-05T04:31:09Z

did we wind up testing this in CI?

aojea · 2020-05-29T20:22:36Z

The bootstrapping process with kubeadm is suspiciously long

If you are not concerned about security, and if it is possible in kubeadm (I really don't know), avoid the certificate generation ... Maybe it is possible to include some well-known certificate

warmchang · 2020-06-05T01:58:06Z

There is an unsafe "--unsafe-no-fsync" flag added in etcd that disables fsync.

FYI: etcd-io/etcd#11946

BenTheElder · 2020-06-06T12:59:50Z

Yeah, we're very interested in that once it's available in kubeadm's etcd.
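
Once the etcd version used by kubeadm ships the flag, one way to pass it through kind would presumably be an etcd extraArgs entry in a kubeadm config patch. This is an untested sketch: it assumes the map-style extraArgs of kubeadm's v1beta2/v1beta3 ClusterConfiguration (newer kubeadm API versions use a list of name/value pairs instead), and the function name and file name are made up for illustration. As the flag name says, it trades data safety for speed.

```shell
#!/bin/sh
# Sketch: kind config passing --unsafe-no-fsync to etcd via kubeadm extraArgs.
# Assumes map-style extraArgs (kubeadm v1beta2/v1beta3); the function name is illustrative.
write_no_fsync_config() {
  cat > "$1" <<'EOF'
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
kubeadmConfigPatches:
- |
  kind: ClusterConfiguration
  etcd:
    local:
      extraArgs:
        unsafe-no-fsync: "true"
EOF
}

# Usage (needs kind and a container runtime):
#   write_no_fsync_config kind-no-fsync.yaml
#   kind create cluster --config kind-no-fsync.yaml
```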

Currently we encounter bad performance of KIND clusters on a DinD setup: we get 'etcdserver: timeout errors' that often cause jobs to fail. In such cases it is recommended [1] to use in-memory etcd. Running etcd in memory should improve performance and make the sriov provider more stable. [1] kubernetes-sigs/kind#845 Signed-off-by: Or Mergi <[email protected]>

BenTheElder · 2021-01-19T23:40:04Z

Circling back because this came up again today: I experimented with tmpfs + the unsafe no-fsync flag late last year and didn't see measurable improvements on my hardware (a couple of different dev machines). YMMV. This still doesn't seem to be a clear win even when persistence is not interesting; it depends on the usage and hardware.

aojea · 2021-01-20T08:03:13Z

for CIs like github actions there is a measurable difference when running the e2e test suite :)

dprotaso · 2022-01-17T18:11:18Z

for CIs like github actions there is a measurable difference when running the e2e test suite :)

Yeah - it's just another potential failure mode that would be nice to avoid

cnfatal · 2022-03-08T02:53:30Z

The config file below works well for running etcd in memory:

kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
kubeadmConfigPatches:
- |
  apiVersion: kubeadm.k8s.io/v1beta2
  kind: ClusterConfiguration
  etcd:
    local:
      dataDir: /tmp/etcd

The /tmp and /run directories in the kind node are mounted as tmpfs.

on podman:

kind/pkg/cluster/internal/providers/podman/provision.go

Lines 195 to 196 in 36f229f

    "--tmpfs", "/tmp", // various things depend on working /tmp
    "--tmpfs", "/run", // systemd wants a writable /run

on docker:

kind/pkg/cluster/internal/providers/docker/provision.go

Lines 236 to 237 in 5657682

    "--tmpfs", "/tmp", // various things depend on working /tmp
    "--tmpfs", "/run", // systemd wants a writable /run

aojea · 2022-09-28T08:49:12Z

as pointed out by Ben, we are going to have a performance hit because of etcd:

All single node v3.x clusters are affected. Fix is expected to come with a 4-10% performance degradation, making single node cluster performance more in line with multi-node clusters. No performance change is expected for multi-node clusters.

kubernetes/kubernetes#112690

BenTheElder · 2022-09-28T17:45:01Z

This should work for all currently supported Kubernetes versions and is slightly terser:

kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
kubeadmConfigPatches:
- |
  kind: ClusterConfiguration
  etcd:
    local:
      dataDir: /tmp/etcd
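
A quick way to check that the patch took effect is to ask the node container which filesystem backs the data directory. The sketch below assumes the docker provider, the default cluster name (so the node container is called kind-control-plane), and GNU stat inside the node image; the helper names and file name are illustrative.

```shell
#!/bin/sh
# Sketch: write the config above and check that etcd's data dir is on tmpfs.
# NODE and the function names are assumptions for illustration.
NODE="${NODE:-kind-control-plane}"

# Write the kind config that points etcd's dataDir at /tmp/etcd.
write_tmp_etcd_config() {
  cat > "$1" <<'EOF'
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
kubeadmConfigPatches:
- |
  kind: ClusterConfiguration
  etcd:
    local:
      dataDir: /tmp/etcd
EOF
}

# Print the filesystem type backing a path inside the node container.
node_fs_type() {
  docker exec "$NODE" stat -f -c %T "$1"
}

# Usage (needs kind and docker):
#   write_tmp_etcd_config kind-etcd-tmp.yaml
#   kind create cluster --config kind-etcd-tmp.yaml
#   node_fs_type /tmp/etcd    # should report tmpfs if /tmp is a tmpfs mount
```

With podman, the same check would need podman exec in place of docker exec.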

BenTheElder · 2022-09-28T22:11:38Z

Maybe let's make a page to cover performance @aojea?

We have other related commonly discovered issues that are only in "known issues" currently. We could leave stub entries but move performance considerations to a new docs page that covers this technique + inotify limits etc.

I think the config to enable this is small enough to just document and it's too breaking to e.g. enable by default.

We can also suggest other tunable flags and host configs some of which kind shouldn't touch itself.

aojea · 2022-09-29T09:55:36Z

agree, these are recurrent questions, better to aggregate this information

aojea added the kind/feature label Sep 7, 2019

aojea closed this as completed Sep 7, 2019

aojea reopened this Oct 4, 2019

BenTheElder mentioned this issue Oct 14, 2019

Infrequent failed to init node with kubeadm: exit status 1 on create cluster #928

Closed

BenTheElder added the kind/design and priority/backlog labels Nov 6, 2019

BenTheElder mentioned this issue Jan 8, 2020

RFE: Constrained deployment scenarios (Edge) kubernetes/kubeadm#2000

Closed

k8s-ci-robot added the lifecycle/stale label Feb 4, 2020

k8s-ci-robot added the lifecycle/rotten label and removed the lifecycle/stale label Mar 5, 2020

cofyc mentioned this issue Jun 3, 2020

e2e: use memory storage for etcd pingcap/tidb-operator#2619

Closed

shonge mentioned this issue Jun 5, 2020

E2e CI job use memory storage for etcd pingcap/tidb-operator#2633

Merged

sre-bot mentioned this issue Jun 9, 2020

E2e CI job use memory storage for etcd (#2633) pingcap/tidb-operator#2663

Merged

aojea mentioned this issue Jun 14, 2020

kind: run etcd in standalone mode no need for RAFT ovn-kubernetes/ovn-kubernetes#1424

Closed

BenTheElder mentioned this issue Nov 12, 2020

etcd timeout errors on DinD setup using Prow #1922

Closed

This was referenced Nov 15, 2020

SRIOV lane: Mount in memory directory for backing in-memory etcd data kubevirt/project-infra#709

Closed

KIND infra, SRIOV proivder, Run in-memory etcd kubevirt/kubevirtci#478

Merged

dprotaso mentioned this issue Jun 12, 2021

use tmpfs for etcd storage in GitHub Actions Kind-e2e knative/serving#11511

Closed

llhuii mentioned this issue Jul 15, 2021

Use memory storage for etcd in e2e github action to speed up kubeedge/sedna#122

Open

aojea mentioned this issue Aug 17, 2021

kind create cluster fails with "ERROR: failed to create cluster" (slow disk operations?) #2416

Closed

kimwnasptd mentioned this issue Feb 7, 2022

tests: Scripts for e2e tests kubeflow/manifests#2128

Merged

erichorwath mentioned this issue Mar 24, 2022

etcdserver: read-only range request ... took too long (...) to execute #2692

Open

swiatekm mentioned this issue Oct 7, 2022

test(integration): run etcd in memory for kind SumoLogic/sumologic-kubernetes-collection#2548

Merged

pablochacin mentioned this issue Jun 3, 2023

Use memory storage for etcd in e2d2 test clusters grafana/xk6-disruptor#179

Closed

dims mentioned this issue Dec 26, 2023

Run etcd using tmpfs storage kubernetes-sigs/hydrophone#61

Merged

liangyuanpeng mentioned this issue Apr 15, 2024

timed out waiting for the condition #3569

Closed

sdowell mentioned this issue Jul 17, 2024

test: update kind to use in-memory etcd GoogleContainerTools/kpt-config-sync#1341

Merged
