kubeadm: adapt docs for 1.24 and dockershim removal
Touch the following files:
- Implementation details: remove Docker specifics, which are changing
in 1.24
- Create cluster: small language cleanup, remove note about 1.24
- Install kubeadm: Include two up-to-date tables for Linux / Windows
with known endpoints. Include cri-dockerd.
- Kubelet integration: (side cleanup) use "container runtime" instead of
"CRI runtime" (which is incorrect). Mention that only updating
"--container-runtime-endpoint=.." is required if the user wishes
to override the CR on a certain host. Dockershim->CR-foo migration
guides would make the "--container-runtime=remote" flag explicit
and we want to remove it at some point.
- Troubleshooting kubeadm: Remove some instances of Docker troubleshooting
that imply docker as default CR, or talk about old Docker versions.
Be more generic about container runtimes. Include more
cleanups on the side - kube-dns, outdated entries, fixed bugs.
- Adding Windows nodes: move the containerd tab before the Docker
tab, as containerd is now the default. Remove note about being explicit
about --cri-socket. Add note that crictl is required for both
Docker and containerd. Add note that cri-dockerd is required if
the user wants to use Docker EE on Windows.
neolit123 committed Jan 12, 2022
1 parent c8c474d commit 06e6d2c
Showing 6 changed files with 92 additions and 179 deletions.
@@ -81,13 +81,8 @@ The user can skip specific preflight checks or all of them with the `--ignore-pr
- Kubernetes system requirements:
- if running on linux:
- [error] if Kernel is older than the minimum required version
- [error] if required cgroups subsystem aren't in set up
- if using docker:
- [warning/error] if Docker service does not exist, if it is disabled, if it is not active.
- [error] if Docker endpoint does not exist or does not work
- [warning] if docker version is not in the list of validated docker versions
- If using other cri engine:
- [error] if crictl socket does not answer
- [error] if required cgroups subsystem aren't set up
- [error] if the CRI endpoint does not answer
- [error] if user is not root
- [error] if the machine hostname is not a valid DNS subdomain
- [warning] if the host name cannot be reached via network lookup
@@ -434,8 +429,7 @@ cluster startup problems.
Please note that:

1. `kubeadm join` preflight checks are basically a subset of the `kubeadm init` preflight checks
1. Starting from 1.9, kubeadm provides better support for CRI-generic functionality; in that case, docker specific controls
are skipped or replaced by similar controls for crictl.
1. Starting from 1.24, kubeadm uses crictl to communicate with all known CRI endpoints.
1. Starting from 1.9, kubeadm provides support for joining nodes running on Windows; in that case, linux specific controls are skipped.
1. In any case the user can skip specific preflight checks (or eventually all preflight checks) with the `--ignore-preflight-errors` option.

@@ -111,10 +111,9 @@ for all control-plane nodes. Such an endpoint can be either a DNS name or an IP
be passed to `kubeadm init`. Depending on which
third-party provider you choose, you might need to set the `--pod-network-cidr` to
a provider-specific value. See [Installing a Pod network add-on](#pod-network).
1. (Optional) Since version 1.14, `kubeadm` tries to detect the container runtime on Linux
by using a list of well known domain socket paths. To use different container runtime or
if there are more than one installed on the provisioned node, specify the `--cri-socket`
argument to `kubeadm init`. See
1. (Optional) `kubeadm` tries to detect the container runtime by using a list of well
known endpoints. To use a different container runtime, or if more than one is installed
on the provisioned node, specify the `--cri-socket` argument to `kubeadm`. See
[Installing a runtime](/docs/setup/production-environment/tools/kubeadm/install-kubeadm/#installing-runtime).
1. (Optional) Unless otherwise specified, `kubeadm` uses the network interface associated
with the default gateway to set the advertise address for this particular control-plane node's API server.
@@ -384,8 +383,8 @@ A few seconds later, you should notice this node in the output from `kubectl get
nodes` when run on the control-plane node.

{{< note >}}
As the cluster nodes are usually initialized sequentially, the CoreDNS Pods are likely to all run
on the first control-plane node. To provide higher availability, please rebalance the CoreDNS Pods
with `kubectl -n kube-system rollout restart deployment coredns` after at least one new node is joined.
{{< /note >}}

@@ -79,51 +79,51 @@ The pod network plugin you use (see below) may also require certain ports to be
open. Since this differs with each pod network plugin, please see the
documentation for the plugins about what port(s) those need.

## Installing runtime {#installing-runtime}
## Installing a container runtime {#installing-runtime}

To run containers in Pods, Kubernetes uses a
{{< glossary_tooltip term_id="container-runtime" text="container runtime" >}}.

{{< tabs name="container_runtime" >}}
{{% tab name="Linux nodes" %}}

By default, Kubernetes uses the
{{< glossary_tooltip term_id="cri" text="Container Runtime Interface">}} (CRI)
to interface with your chosen container runtime.

If you don't specify a runtime, kubeadm automatically tries to detect an installed
container runtime by scanning through a list of well known Unix domain sockets.
The following table lists container runtimes and their associated socket paths:

{{< table caption = "Container runtimes and their socket paths" >}}
| Runtime | Path to Unix domain socket |
|------------|-----------------------------------|
| Docker | `/var/run/dockershim.sock` |
| containerd | `/run/containerd/containerd.sock` |
| CRI-O | `/var/run/crio/crio.sock` |
{{< /table >}}

<br />
If both Docker and containerd are detected, Docker takes precedence. This is
needed because Docker 18.09 ships with containerd and both are detectable even if you only
installed Docker.
If any other two or more runtimes are detected, kubeadm exits with an error.
container runtime by scanning through a list of known endpoints.

The kubelet integrates with Docker through the built-in `dockershim` CRI implementation.
If multiple or no container runtimes are detected, kubeadm will throw an error
and will request that you specify which one you want to use.

See [container runtimes](/docs/setup/production-environment/container-runtimes/)
for more information.

The tables below include the known endpoints for supported operating systems:

{{< tabs name="container_runtime" >}}
{{% tab name="Linux" %}}

{{< table >}}
| Runtime | Path to Unix domain socket |
|--------------|----------------------------------------------|
| containerd | `unix:///var/run/containerd/containerd.sock` |
| cri-dockerd | `unix:///var/run/cri-dockerd.sock` |
| cri-o | `unix:///var/run/crio/crio.sock` |
{{< /table >}}

{{% /tab %}}
{{% tab name="other operating systems" %}}
By default, kubeadm uses {{< glossary_tooltip term_id="docker" >}} as the container runtime.
The kubelet integrates with Docker through the built-in `dockershim` CRI implementation.

See [container runtimes](/docs/setup/production-environment/container-runtimes/)
for more information.
{{% tab name="Windows" %}}

{{< table >}}
| Runtime | Path to Windows named pipe |
|--------------|----------------------------------------------|
| containerd | `npipe:////./pipe/containerd-containerd` |
| cri-dockerd | `npipe:////./pipe/cri-dockerd` |
{{< /table >}}

{{% /tab %}}
{{< /tabs >}}
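
The detection behavior described above can be approximated with a short shell sketch. This is only an illustration, not kubeadm's actual implementation: the function names `detect_endpoints` and `pick_endpoint` and the optional root-prefix argument are hypothetical, and the endpoint list is copied from the Linux table.

```shell
# Sketch only: mimics the scan described above; this is not kubeadm's code.
# Endpoints copied from the Linux table.
KNOWN_ENDPOINTS="unix:///var/run/containerd/containerd.sock
unix:///var/run/cri-dockerd.sock
unix:///var/run/crio/crio.sock"

# Print every known endpoint whose Unix domain socket exists.
# The optional first argument is a directory prefix (hypothetical helper
# feature, useful for trying the function against a scratch directory).
detect_endpoints() {
  local root="${1:-}"
  local endpoint path
  for endpoint in ${KNOWN_ENDPOINTS}; do
    path="${endpoint#unix://}"          # strip the URL scheme, keep the path
    if [ -S "${root}${path}" ]; then    # -S is true only for socket files
      echo "${endpoint}"
    fi
  done
}

# Succeed only when exactly one endpoint is found; otherwise report how many
# were found and ask for an explicit --cri-socket value.
pick_endpoint() {
  local found count
  found="$(detect_endpoints "${1:-}")"
  count="$(printf '%s' "${found}" | grep -c . || true)"
  if [ "${count}" -eq 1 ]; then
    echo "${found}"
  else
    echo "found ${count} container runtime endpoints; pass --cri-socket explicitly" >&2
    return 1
  fi
}

# Possible usage on a node (not executed here):
#   sudo kubeadm init --cri-socket "$(pick_endpoint)"
```

When zero or several sockets are present, the sketch fails on purpose, mirroring the documented behavior of asking the user to choose a runtime.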


## Installing kubeadm, kubelet and kubectl

You will install these packages on all of your machines:
@@ -36,7 +36,7 @@ using kubeadm, rather than managing the kubelet configuration for each Node manu
### Propagating cluster-level configuration to each kubelet

You can provide the kubelet with default values to be used by `kubeadm init` and `kubeadm join`
commands. Interesting examples include using a different CRI runtime or setting the default subnet
commands. Interesting examples include using a different container runtime or setting the default subnet
used by services.

If you want your services to use the subnet `10.96.0.0/12` as the default for services, you can pass
@@ -51,7 +51,7 @@ by the kubelet, using the `--cluster-dns` flag. This setting needs to be the sam
on every manager and Node in the cluster. The kubelet provides a versioned, structured API object
that can configure most parameters in the kubelet and push out this configuration to each running
kubelet in the cluster. This object is called
[`KubeletConfiguration`](/docs/reference/config-api/kubelet-config.v1beta1/).
The `KubeletConfiguration` allows the user to specify flags such as the cluster DNS IP addresses expressed as
a list of values to a camelCased key, illustrated by the following example:

@@ -78,14 +78,12 @@ networking, or other host-specific parameters. The following list provides a few
unless you are using a cloud provider. You can use the `--hostname-override` flag to override the
default behavior if you need to specify a Node name different from the machine's hostname.

- Currently, the kubelet cannot automatically detect the cgroup driver used by the CRI runtime,
but the value of `--cgroup-driver` must match the cgroup driver used by the CRI runtime to ensure
- Currently, the kubelet cannot automatically detect the cgroup driver used by the container runtime,
but the value of `--cgroup-driver` must match the cgroup driver used by the container runtime to ensure
the health of the kubelet.

- Depending on the CRI runtime your cluster uses, you may need to specify different flags to the kubelet.
For instance, when using Docker, you need to specify flags such as `--network-plugin=cni`, but if you
are using an external runtime, you need to specify `--container-runtime=remote` and specify the CRI
endpoint using the `--container-runtime-endpoint=<path>`.
- To specify the container runtime you must set its endpoint with the
`--container-runtime-endpoint=<path>` flag.

You can specify these flags by configuring an individual kubelet's configuration in your service manager,
such as systemd.
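
As a rough sketch of the per-host override, the helper below prints an environment line that passes `--container-runtime-endpoint` to the kubelet. The function name `kubelet_extra_args_line` is hypothetical, and the file locations in the comments are assumptions about a typical kubeadm package install (`/etc/default/kubelet` on DEB-based hosts, `/etc/sysconfig/kubelet` on RPM-based hosts); check your own systemd drop-in before relying on them.

```shell
# Print a KUBELET_EXTRA_ARGS line that points the kubelet at the given
# container runtime endpoint. Helper name is illustrative only.
kubelet_extra_args_line() {
  printf 'KUBELET_EXTRA_ARGS=--container-runtime-endpoint=%s\n' "$1"
}

# Preview the line for containerd's Linux endpoint.
kubelet_extra_args_line "unix:///var/run/containerd/containerd.sock"

# To apply it (assumed path; note that tee overwrites the file), then restart
# the kubelet so the new flag takes effect:
#   kubelet_extra_args_line "unix:///var/run/containerd/containerd.sock" | sudo tee /etc/default/kubelet
#   sudo systemctl daemon-reload
#   sudo systemctl restart kubelet
```

Only the endpoint flag is set here, in line with the statement above that the endpoint is the one value needed to choose the container runtime on a host.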
@@ -123,7 +121,7 @@ KUBELET_KUBEADM_ARGS="--flag1=value1 --flag2=value2 ..."
```

In addition to the flags used when starting the kubelet, the file also contains dynamic
parameters such as the cgroup driver and whether to use a different CRI runtime socket
parameters such as the cgroup driver and whether to use a different container runtime socket
(`--cri-socket`).

After marshalling these two files to disk, kubeadm attempts to run the following two
@@ -97,7 +97,8 @@ and investigating each container by running `docker logs`. For other container r

## kubeadm blocks when removing managed containers

The following could happen if Docker halts and does not remove any Kubernetes-managed containers:
The following could happen if the container runtime halts and does not remove
any Kubernetes-managed containers:

```shell
sudo kubeadm reset
@@ -111,35 +112,22 @@ sudo kubeadm reset
(block)
```

A possible solution is to restart the Docker service and then re-run `kubeadm reset`:

```shell
sudo systemctl restart docker.service
sudo kubeadm reset
```

Inspecting the logs for docker may also be useful:

```shell
journalctl -u docker
```
A possible solution is to restart the container runtime and then re-run `kubeadm reset`.
You can also use `crictl` to debug the state of the container runtime. See
[Debugging Kubernetes nodes with crictl](/docs/tasks/debug-application-cluster/crictl/).
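
A minimal sketch of that recovery sequence follows. It assumes the container runtime is managed by systemd and that you know its unit name (for example `containerd` or `crio`); the function name `recover_and_reset` and the `DRY_RUN` switch are hypothetical conveniences, not part of kubeadm.

```shell
# Print the recovery steps (default) or run them when DRY_RUN=0 is set.
# The unit name is an assumption: pass the one that matches your runtime.
recover_and_reset() {
  local unit="$1"
  local run="echo"
  if [ "${DRY_RUN:-1}" = "0" ]; then
    run=""
  fi
  ${run} sudo systemctl restart "${unit}"   # restart the container runtime
  ${run} sudo crictl ps -a                  # inspect containers the runtime still tracks
  ${run} sudo kubeadm reset                 # re-run the reset that blocked earlier
}

# Dry run: only prints the three commands for a containerd-based node.
recover_and_reset containerd
```

Running it without `DRY_RUN=0` lets you review the commands before executing anything on the node.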

## Pods in `RunContainerError`, `CrashLoopBackOff` or `Error` state

Right after `kubeadm init` there should not be any pods in these states.

- If there are pods in one of these states _right after_ `kubeadm init`, please open an
issue in the kubeadm repo. `coredns` (or `kube-dns`) should be in the `Pending` state
issue in the kubeadm repo. `coredns` should be in the `Pending` state
until you have deployed the network add-on.
- If you see Pods in the `RunContainerError`, `CrashLoopBackOff` or `Error` state
after deploying the network add-on and nothing happens to `coredns` (or `kube-dns`),
after deploying the network add-on and nothing happens to `coredns`,
it's very likely that the Pod Network add-on that you installed is somehow broken.
You might have to grant it more RBAC privileges or use a newer version. Please file
an issue in the Pod Network providers' issue tracker and get the issue triaged there.
- If you install a version of Docker older than 1.12.1, remove the `MountFlags=slave` option
when booting `dockerd` with `systemd` and restart `docker`. You can see the MountFlags in `/usr/lib/systemd/system/docker.service`.
MountFlags can interfere with volumes mounted by Kubernetes, and put the Pods in `CrashLoopBackOff` state.
The error happens when Kubernetes does not find `var/run/secrets/kubernetes.io/serviceaccount` files.

## `coredns` is stuck in the `Pending` state

@@ -148,6 +136,10 @@ should [install the pod network add-on](/docs/concepts/cluster-administration/ad
of choice. You have to install a Pod Network
before CoreDNS may be deployed fully. Hence the `Pending` state before the network is set up.

This is also known to happen if the kubelet version is newer than the version of
the kube-apiserver. Make sure to follow the supported version skew between the two components
as outlined in the [Version Skew Policy](https://kubernetes.io/releases/version-skew-policy/#kubelet).
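
The skew condition mentioned above (the kubelet must not be newer than the kube-apiserver) can be checked with a small sketch like the one below. The helper names `minor_of` and `kubelet_not_newer` are hypothetical, the version strings are supplied by hand, and the comparison only looks at minor versions under the assumption that both components share the same major version.

```shell
# Extract the minor version from a string such as "v1.24.3" or "1.24.3".
minor_of() {
  local v="${1#v}"   # drop an optional leading "v"
  v="${v#*.}"        # drop the major version and the first dot
  echo "${v%%.*}"    # keep everything before the next dot
}

# Succeed when the kubelet minor version is not newer than the
# kube-apiserver minor version (same major version assumed).
kubelet_not_newer() {
  [ "$(minor_of "$1")" -le "$(minor_of "$2")" ]
}

# Example with hand-written versions: kubelet v1.24.0 against kube-apiserver v1.23.5.
kubelet_not_newer "v1.24.0" "v1.23.5" || echo "kubelet is newer than kube-apiserver"
```

This only covers the "not newer" half of the policy; the maximum number of minor versions the kubelet may lag behind is defined in the linked Version Skew Policy and is not encoded here.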

## `HostPort` services do not work

The `HostPort` and `HostIP` functionality is available depending on your Pod Network
@@ -282,70 +274,9 @@ Error from server: Get https://10.19.0.41:10250/containerLogs/default/mysql-ddc6

## `coredns` pods have `CrashLoopBackOff` or `Error` state

If you have nodes that are running SELinux with an older version of Docker you might experience a scenario
where the `coredns` pods are not starting. To solve that you can try one of the following options:

- Upgrade to a [newer version of Docker](/docs/setup/production-environment/container-runtimes/#docker).

- [Disable SELinux](https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/6/html/security-enhanced_linux/sect-security-enhanced_linux-enabling_and_disabling_selinux-disabling_selinux).
- Modify the `coredns` deployment to set `allowPrivilegeEscalation` to `true`:

```bash
kubectl -n kube-system get deployment coredns -o yaml | \
sed 's/allowPrivilegeEscalation: false/allowPrivilegeEscalation: true/g' | \
kubectl apply -f -
```

Another cause for CoreDNS to have `CrashLoopBackOff` is when a CoreDNS Pod deployed in Kubernetes detects a loop. [A number of workarounds](https://github.com/coredns/coredns/tree/master/plugin/loop#troubleshooting-loops-in-kubernetes-clusters)
One cause for CoreDNS to have `CrashLoopBackOff` is when a CoreDNS Pod deployed in Kubernetes detects a loop. [A number of workarounds](https://github.com/coredns/coredns/tree/master/plugin/loop#troubleshooting-loops-in-kubernetes-clusters)
are available to avoid Kubernetes trying to restart the CoreDNS Pod every time CoreDNS detects the loop and exits.

{{< warning >}}
Disabling SELinux or setting `allowPrivilegeEscalation` to `true` can compromise
the security of your cluster.
{{< /warning >}}

## etcd pods restart continually

If you encounter the following error:

```
rpc error: code = 2 desc = oci runtime error: exec failed: container_linux.go:247: starting container process caused "process_linux.go:110: decoding init error from pipe caused \"read parent: connection reset by peer\""
```

this issue appears if you run CentOS 7 with Docker 1.13.1.84.
This version of Docker can prevent the kubelet from executing into the etcd container.

To work around the issue, choose one of these options:

- Roll back to an earlier version of Docker, such as 1.13.1-75
```
yum downgrade docker-1.13.1-75.git8633870.el7.centos.x86_64 docker-client-1.13.1-75.git8633870.el7.centos.x86_64 docker-common-1.13.1-75.git8633870.el7.centos.x86_64
```

- Install one of the more recent recommended versions, such as 18.06:
```bash
sudo yum-config-manager --add-repo https://download.docker.com/linux/centos/docker-ce.repo
yum install docker-ce-18.06.1.ce-3.el7.x86_64
```

## Not possible to pass a comma separated list of values to arguments inside a `--component-extra-args` flag

`kubeadm init` flags such as `--component-extra-args` allow you to pass custom arguments to a control-plane
component like the kube-apiserver. However, this mechanism is limited due to the underlying type used for parsing
the values (`mapStringString`).

If you decide to pass an argument that supports multiple, comma-separated values such as
`--apiserver-extra-args "enable-admission-plugins=LimitRanger,NamespaceExists"` this flag will fail with
`flag: malformed pair, expect string=string`. This happens because the list of arguments for
`--apiserver-extra-args` expects `key=value` pairs and in this case `NamespacesExists` is considered
as a key that is missing a value.

Alternatively, you can try separating the `key=value` pairs like so:
`--apiserver-extra-args "enable-admission-plugins=LimitRanger,enable-admission-plugins=NamespaceExists"`
but this will result in the key `enable-admission-plugins` only having the value of `NamespaceExists`.

A known workaround is to use the kubeadm [configuration file](/docs/reference/config-api/kubeadm-config.v1beta3/).

## kube-proxy scheduled before node is initialized by cloud-controller-manager

In cloud provider scenarios, kube-proxy can end up being scheduled on new worker nodes before
@@ -410,20 +341,6 @@ nodeRegistration:
Alternatively, you can modify `/etc/fstab` to make the `/usr` mount writeable, but please
be advised that this is modifying a design principle of the Linux distribution.

## `kubeadm upgrade plan` prints out `context deadline exceeded` error message

This error message is shown when upgrading a Kubernetes cluster with `kubeadm` in the case of running an external etcd. This is not a critical bug and happens because older versions of kubeadm perform a version check on the external etcd cluster. You can proceed with `kubeadm upgrade apply ...`.

This issue is fixed as of version 1.19.

## `kubeadm reset` unmounts `/var/lib/kubelet`

If `/var/lib/kubelet` is being mounted, performing a `kubeadm reset` will effectively unmount it.

To workaround the issue, re-mount the `/var/lib/kubelet` directory after performing the `kubeadm reset` operation.

This is a regression introduced in kubeadm 1.15. The issue is fixed in 1.20.

## Cannot use the metrics-server securely in a kubeadm cluster

In a kubeadm cluster, the [metrics-server](https://github.com/kubernetes-sigs/metrics-server)