etcd is a strongly consistent key-value store and the most prevalent choice for the Kubernetes persistence layer. All API cluster objects like `Pod`s, `Deployment`s, `Secret`s, etc. are stored in `etcd`, which makes it an essential part of a Kubernetes control plane.
Each shoot cluster gets its very own persistence for the control plane. It runs in the shoot namespace on the respective seed cluster. Concretely, there are two etcd instances per shoot cluster, which the `kube-apiserver` is configured to use in the following way:
- `etcd-main`: A store that contains all "cluster critical" or "long-term" objects. These object kinds are typically considered for a backup to prevent any data loss.
- `etcd-events`: A store that contains all `Event` objects (`events.k8s.io`) of a cluster. `Events` usually have a short retention period and occur frequently, but are not essential for disaster recovery.
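For illustration, such a split can be expressed via the `kube-apiserver`'s `--etcd-servers` and `--etcd-servers-overrides` flags. The following is a minimal sketch; the service names and ports are assumptions, not the values Gardener actually renders:

```yaml
# Sketch of the relevant kube-apiserver flags (values are illustrative only).
command:
  - kube-apiserver
  - --etcd-servers=https://etcd-main-client:2379                       # default store for all resources
  - --etcd-servers-overrides=/events#https://etcd-events-client:2379   # route Event objects to etcd-events
  # ... all other flags omitted
```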
This setup ensures that the critical `etcd-main` is not flooded by Kubernetes `Events` and that backup space is not occupied by non-critical data. This segmentation saves time and resources.
Configuring, maintaining, and health-checking etcd is outsourced to a dedicated operator called `etcd-druid`.
When Gardenlet reconciles a `Shoot` resource, it creates or updates an `Etcd` resource in the seed cluster, containing the necessary information (backup information, defragmentation schedule, resources, etc.) that `etcd-druid` needs to manage the lifecycle of the desired etcd instance (today `main` or `events`). Likewise, when the shoot is deleted, Gardenlet deletes the `Etcd` resource and `etcd-druid` takes care of cleaning up all related objects, e.g. the backing `StatefulSet`.
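A heavily trimmed sketch of what such an `Etcd` resource might look like is shown below. Field names and values are illustrative and may not match the exact `druid.gardener.cloud` CRD schema:

```yaml
apiVersion: druid.gardener.cloud/v1alpha1
kind: Etcd
metadata:
  name: etcd-main
  namespace: shoot--foo--bar               # the shoot's control plane namespace on the seed (illustrative)
spec:
  replicas: 1
  etcd:
    resources:                              # requests/limits for the etcd container
      requests:
        cpu: 500m
        memory: 1Gi
    defragmentationSchedule: "0 3 * * *"    # cron-like schedule, illustrative value
  backup:
    fullSnapshotSchedule: "0 */24 * * *"    # illustrative value
    deltaSnapshotPeriod: 5m
    store:                                  # where the backup-restore sidecar uploads snapshots
      provider: S3                          # illustrative provider
      container: my-backup-bucket           # bucket created via the BackupBucket resource
      prefix: shoot--foo--bar--etcd-main    # illustrative prefix within the bucket
      secretRef:
        name: etcd-backup
```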
Gardenlet maintains `HVPA` objects for etcd `StatefulSet`s if the corresponding feature gate is enabled. This enables vertical scaling for `etcd`. Downscaling is handled more pessimistically to prevent many subsequent etcd restarts. Thus, downscaling is deactivated for `production` and `infrastructure` clusters, and for all other clusters lower advertised requests/limits are only applied during a shoot's maintenance time window.
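As a rough illustration only, such an `HVPA` object could express the pessimistic downscaling along the following lines; the API group/version and all field names here are assumptions about the hvpa-controller CRD and may not be exact:

```yaml
apiVersion: autoscaling.k8s.io/v1alpha1   # assumption; check the hvpa-controller project for the actual group/version
kind: Hvpa
metadata:
  name: etcd-main
spec:
  targetRef:
    apiVersion: apps/v1
    kind: StatefulSet
    name: etcd-main
  vpa:
    scaleUp:
      updatePolicy:
        updateMode: "Auto"                # scale up whenever recommended
    scaleDown:
      updatePolicy:
        updateMode: "MaintenanceWindow"   # only apply lower requests/limits during the maintenance window
  maintenanceTimeWindow:                  # illustrative values taken from a shoot's maintenance window
    begin: "220000+0000"
    end: "230000+0000"
```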
If `Seed`s specify backups for etcd (example), then Gardener and the respective provider extensions are responsible for creating a bucket on the cloud provider's side (modelled through a `BackupBucket` resource). The bucket stores backups of shoots scheduled on that seed. Furthermore, Gardener creates a `BackupEntry`, which subdivides the bucket and thus makes it possible to store backups of multiple shoot clusters.
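Sketched below is roughly how these two resources relate. The manifests are simplified and some field names are assumptions rather than the exact `core.gardener.cloud` schema:

```yaml
apiVersion: core.gardener.cloud/v1beta1
kind: BackupBucket
metadata:
  name: my-seed-backup-bucket          # one bucket per seed (name is illustrative)
spec:
  provider:
    type: aws                          # cloud provider type, illustrative
    region: eu-west-1
  secretRef:                           # credentials used to create/manage the bucket
    name: backup-credentials
    namespace: garden
---
apiVersion: core.gardener.cloud/v1beta1
kind: BackupEntry
metadata:
  name: shoot--foo--bar--backup        # subdivides the bucket per shoot (name is illustrative)
  namespace: garden
spec:
  bucketName: my-seed-backup-bucket    # reference to the BackupBucket above
```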
The `etcd-main` instance itself is configured to run with a special backup-restore sidecar. It takes care of regularly backing up etcd data and restoring it in case of data loss. More information can be found on the component's GitHub page: https://github.com/gardener/etcd-backup-restore.
How long backups are stored in the bucket after a shoot has been deleted depends on the configured retention period in the `Seed` resource. Please see this example configuration for more information.
etcd maintenance tasks must be performed from time to time in order to regain database storage and to ensure the system's reliability. The backup-restore sidecar takes care of this job as well. Gardener chooses a random time within the shoot's maintenance time window to schedule these tasks.