
Kubernetes


Overview

This document describes two methods of configuring a Kubernetes cluster:

  • Installing from source code
  • Installing pre-built packages

Both methods require that you install dependencies and apply settings on the host.

The scenario described here uses a single node for data plane and workloads.

The data plane and deployed workloads will be scheduled and run on the same node. This is not a production setup as seen in the field, where the data plane and workloads run on separate nodes.

This single-node setup can be helpful for development and experimentation with Kubernetes. After the cluster initialization, you can set up other nodes and join them to your cluster as you wish.

This guide leverages Kubeadm as the installer.

There are, as of this writing, 19 (nineteen!) Kubernetes installers certified by the CNCF.

Bear in mind that not all of the installers work on ppc64le.

In this setup, Kubernetes uses CRI-O as the container runtime, which is similar to the OpenShift Container Platform (OCP). For other runtimes, check here.

Environment

Tested on Fedora 32 and 33 on ppc64le.

Host settings

Disable swap as described here.

Before Fedora 33:

# comment out any swap entry in /etc/fstab, then turn swap off
sudo sed -i -r -e 's,^([^#].*\s+swap\s+.*)$,#\1,g' /etc/fstab
sudo swapoff -a
cat /proc/swaps  # verify the list is empty

As of Fedora 33:

sudo systemctl stop swap-create@zram0        # stop the zram swap device
sudo touch /etc/systemd/zram-generator.conf  # an empty config disables the zram generator
sudo dnf remove -y zram-generator-defaults
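
In either case, you can confirm that no swap device remains active (swapon prints nothing when swap is off):

swapon --show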

Set cgroups v1 and disable SELinux on boot:

sudo grubby --update-kernel=ALL --args="systemd.unified_cgroup_hierarchy=0 selinux=0"

Disable SELinux right now:

sudo setenforce 0
sudo umount /sys/fs/selinux
sudo sed -i -r -e 's/^SELINUX=(.*)$/SELINUX=disabled/g' /etc/selinux/config

Load br_netfilter module on boot and adjust system settings to allow bridged traffic:

sudo modprobe br_netfilter
echo br_netfilter | sudo tee /etc/modules-load.d/br_netfilter.conf
cat << EOF | sudo tee /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-ip6tables=1
net.bridge.bridge-nf-call-iptables=1
net.ipv4.ip_forward=1
EOF
sudo sysctl --system
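
You can verify that the module is loaded and the settings are in effect:

lsmod | grep br_netfilter
sysctl net.bridge.bridge-nf-call-iptables net.ipv4.ip_forward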

Allow required ports in firewalld:

sudo systemctl enable --now firewalld

for tcp_port in 6443 10250 10251 10252 2379-2380 30000-32767; do
  echo "Allowing port ${tcp_port}/tcp"
  sudo firewall-cmd --add-port=${tcp_port}/tcp
  sudo firewall-cmd --add-port=${tcp_port}/tcp --permanent
done
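
List the open ports to confirm the rules were applied:

sudo firewall-cmd --list-ports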

A reboot is only required to disable SELinux and boot with cgroups v1. If SELinux is already disabled and your host was booted with cgroups v1, you can skip this reboot:

sudo reboot
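
Not sure whether a reboot is needed? A quick way to check the current state:

getenforce                    # should print Disabled
stat -fc %T /sys/fs/cgroup/   # tmpfs indicates cgroups v1; cgroup2fs indicates v2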

Installing from source code

To install Kubernetes from source, you will need the following components:

  • go 1.15 (required by kubernetes)
  • kubernetes
    • kubeadm
    • kubectl
    • kubelet
  • cri-o
  • cri-tools
  • conmon
  • runc
  • containernetworking/plugins

Dependencies:

sudo dnf upgrade -y --refresh
sudo dnf install -y bash-completion btrfs-progs-devel containers-common conntrack-tools device-mapper-devel ethtool git glibc-static glib2-devel golang gpgme-devel grubby libseccomp-devel make net-tools rsync socat tar

Build and install go1.15:

git clone --depth 1 --single-branch --branch release-branch.go1.15 https://go.googlesource.com/go goroot
pushd goroot/src/
export GOOS=linux GOARCH=ppc64le GOPPC64=power9
(bash make.bash)  # without tests
sudo install -o root -g root -m 0755 ../bin/go /usr/local/bin/
sudo install -o root -g root -m 0755 ../bin/gofmt /usr/local/bin/
unset GOOS GOARCH GOPPC64
popd
hash -r  # forget remembered command locations so bash picks up the new go at /usr/local/bin/go
go version

Build and install kubernetes dependencies:

go get github.com/cloudflare/cfssl/cmd/cfssl
go get github.com/cloudflare/cfssl/cmd/cfssljson
sudo install -o root -g root -m 0755 $HOME/go/bin/cfssl /usr/local/bin/
sudo install -o root -g root -m 0755 $HOME/go/bin/cfssljson /usr/local/bin/
hash -r

Build and install kubernetes:

git clone --depth 1 --single-branch --branch master https://github.com/kubernetes/kubernetes.git
pushd kubernetes/
make all ARCH=ppc64le GOLDFLAGS=""  # build everything
sudo install -o root -g root -m 0755 _output/bin/kubeadm /usr/bin/
sudo install -o root -g root -m 0755 _output/bin/kube-aggregator /usr/bin/
sudo install -o root -g root -m 0755 _output/bin/kube-apiserver /usr/bin/
sudo install -o root -g root -m 0755 _output/bin/kube-controller-manager /usr/bin/
sudo install -o root -g root -m 0755 _output/bin/kubectl /usr/bin/
sudo install -o root -g root -m 0755 _output/bin/kubelet /usr/bin/
sudo install -o root -g root -m 0755 _output/bin/kubemark /usr/bin/
sudo install -o root -g root -m 0755 _output/bin/kube-proxy /usr/bin/
sudo install -o root -g root -m 0755 _output/bin/kube-scheduler /usr/bin/
popd

Build and install containernetworking/plugins:

git clone --depth 1 --single-branch --branch master https://github.com/containernetworking/plugins.git containernetworking/plugins
pushd containernetworking/plugins/
./build_linux.sh
sudo mkdir -p /opt/cni/bin
sudo cp ./bin/* /opt/cni/bin/
popd

Build and install conmon:

git clone --depth 1 --single-branch --branch master https://github.com/containers/conmon.git
pushd conmon/
make
sudo make install
popd

Build and install CRI-O and cri-tools:

git clone --depth 1 --single-branch --branch master https://github.com/cri-o/cri-o.git
pushd cri-o/
make BUILDTAGS="seccomp selinux"
sudo make install
sudo mkdir -p /etc/cni/net.d/
sudo install -o root -g root -m 0644 contrib/cni/10-crio-bridge.conf /etc/cni/net.d/
sudo install -o root -g root -m 0644 contrib/cni/99-loopback.conf /etc/cni/net.d/
popd
sudo systemctl daemon-reload
sudo systemctl enable crio
git clone --depth 1 --single-branch --branch master https://github.com/kubernetes-sigs/cri-tools.git
pushd cri-tools/
make
sudo make install
popd

Build and install runc:

git clone --depth 1 --single-branch --branch master https://github.com/opencontainers/runc.git
pushd runc/
make
sudo make install
popd

Configure kubelet service startup parameters:

sudo mkdir -p /usr/lib/systemd/system/kubelet.service.d

cat << "EOF" | sudo tee /usr/lib/systemd/system/kubelet.service.d/10-kubeadm.conf
[Service]
Environment="KUBELET_KUBECONFIG_ARGS=--bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf"
Environment="KUBELET_CONFIG_ARGS=--config=/var/lib/kubelet/config.yaml"
EnvironmentFile=-/var/lib/kubelet/kubeadm-flags.env
EnvironmentFile=-/etc/sysconfig/kubelet
ExecStart=
ExecStart=/usr/bin/kubelet $KUBELET_KUBECONFIG_ARGS $KUBELET_CONFIG_ARGS $KUBELET_KUBEADM_ARGS $KUBELET_EXTRA_ARGS
EOF
cat << "EOF" | sudo tee /usr/lib/systemd/system/kubelet.service
[Unit]
Description=kubelet: The Kubernetes Node Agent
Documentation=https://kubernetes.io/docs/
Wants=network-online.target
After=network-online.target
[Service]
ExecStart=/usr/bin/kubelet
Restart=always
StartLimitInterval=0
RestartSec=10
[Install]
WantedBy=multi-user.target
EOF
sudo systemctl daemon-reload
sudo systemctl enable kubelet

Installing pre-built packages

These steps are derived from here.

Setup Kubernetes repository:

cat << "EOF" | sudo tee /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://packages.cloud.google.com/yum/repos/kubernetes-el7-$basearch
enabled=1
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://packages.cloud.google.com/yum/doc/yum-key.gpg https://packages.cloud.google.com/yum/doc/rpm-package-key.gpg
exclude=kubelet kubeadm kubectl
EOF

Import GPG keys from Kubernetes repositories:

sudo rpmkeys --import https://packages.cloud.google.com/yum/doc/yum-key.gpg
sudo rpmkeys --import https://packages.cloud.google.com/yum/doc/rpm-package-key.gpg

Upgrade your system first:

sudo dnf upgrade -y --refresh

Install CRI-O:

sudo dnf module list cri-o
sudo dnf module enable -y cri-o:1.19

Install Kubelet, Kubeadm, and Kubectl:

sudo dnf install -y crio cri-tools kubelet kubeadm kubectl --disableexcludes=kubernetes

Enable services:

sudo systemctl enable crio
sudo systemctl enable kubelet
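
Check the installed versions before proceeding:

kubeadm version -o short
kubectl version --client
crio --version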

Cluster initialization

This step consists of generating the kubeadm.yaml file with your cluster configuration. This file will be passed to the kubeadm command.

Some choices can be made at this point. You can, for example, initialize your cluster with standard settings, use the stable or latest Kubernetes images, or tweak the network settings to support Open vSwitch.

Choose one below or create your own kubeadm.yaml with your customized settings.

kubeadm.yaml for stable images - standard

If you are not sure which settings to apply to your cluster, this is a good starting point.

cat << "EOF" | tee kubeadm.yaml
apiVersion: kubeadm.k8s.io/v1beta2
kind: InitConfiguration
nodeRegistration:
  criSocket: "unix:///var/run/crio/crio.sock"
  kubeletExtraArgs:
    cgroup-driver: "systemd"
  ignorePreflightErrors:
  - IsPrivilegedUser
  - KubeletVersion
  - Swap
  - Port-10250
---
apiVersion: kubeadm.k8s.io/v1beta2
kind: ClusterConfiguration
networking:
  serviceSubnet: "10.96.0.0/12"
  podSubnet: "10.100.0.0/24"
  dnsDomain: "cluster.local"
kubernetesVersion: "stable"
---
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
cgroupDriver: systemd
EOF

kubeadm.yaml for latest images

This is suitable for development environments or when you want the latest and greatest Kubernetes images.

cat << "EOF" | tee kubeadm.yaml
apiVersion: kubeadm.k8s.io/v1beta2
kind: InitConfiguration
nodeRegistration:
  criSocket: "unix:///var/run/crio/crio.sock"
  kubeletExtraArgs:
    cgroup-driver: "systemd"
  ignorePreflightErrors:
  - IsPrivilegedUser
  - KubeletVersion
  - Swap
  - Port-10250
---
apiVersion: kubeadm.k8s.io/v1beta2
kind: ClusterConfiguration
networking:
  serviceSubnet: "10.96.0.0/12"
  podSubnet: "10.100.0.0/24"
  dnsDomain: "cluster.local"
kubernetesVersion: "latest"
---
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
cgroupDriver: systemd
EOF

kubeadm.yaml for Open vSwitch

This kubeadm.yaml is similar to the standard one but does not contain the network settings and uses the stable set of Kubernetes images:

cat << "EOF" | tee kubeadm.yaml
apiVersion: kubeadm.k8s.io/v1beta2
kind: InitConfiguration
nodeRegistration:
  criSocket: "unix:///var/run/crio/crio.sock"
  kubeletExtraArgs:
    cgroup-driver: "systemd"
  ignorePreflightErrors:
  - IsPrivilegedUser
  - KubeletVersion
  - Swap
  - Port-10250
---
apiVersion: kubeadm.k8s.io/v1beta2
kind: ClusterConfiguration
kubernetesVersion: "stable"
---
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
cgroupDriver: systemd
EOF

Pull Kubernetes images

Pull the latest images from the Google Container Registry:

sudo kubeadm --v=5 --cri-socket /var/run/crio/crio.sock config images pull --kubernetes-version latest

You can also pull stable images:

sudo kubeadm --v=5 --cri-socket /var/run/crio/crio.sock config images pull --kubernetes-version stable
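
To preview which images kubeadm will pull for a given version, you can list them first:

sudo kubeadm config images list --kubernetes-version stable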

Initialize cluster

Now that you have your cluster configuration in kubeadm.yaml and have pulled the images, initialize the cluster:

sudo kubeadm --v=5 init --config kubeadm.yaml

Configure ~/.kube

Copy cluster configuration to your $HOME/.kube/ directory:

mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

Verify

Verify that your cluster is up and running:

kubectl cluster-info

Remove NoSchedule taint so you can deploy containers on your master node:

kubectl taint nodes --all node-role.kubernetes.io/master:NoSchedule-
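
Check that the node reports Ready and that the taint is gone:

kubectl get nodes -o wide
kubectl describe nodes | grep Taints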

Test

You can test your cluster with a simple nginx deployment, for example:

cat << "EOF" > nginx-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
  labels:
    app: nginx
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx
        imagePullPolicy: IfNotPresent
        ports:
        - containerPort: 80
EOF

kubectl apply -f nginx-deployment.yaml
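
Watch the rollout and confirm that all three replicas are running:

kubectl rollout status deployment/nginx-deployment
kubectl get pods -l app=nginx -o wide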

You're done! Enjoy your Kubernetes environment! ;)


Customizations

You can also apply customizations to your Kubernetes cluster, e.g. use Open vSwitch as the CNI provider instead of the Linux bridge (CRI-O's default), or tune node settings to scale up past the default limits on the number of pods per node and per CPU and on the number of queries per second allowed by the Kubernetes API server.

Open vSwitch as CNI provider

In this customization, we swap the default Linux bridge configuration in CRI-O in favor of using Open vSwitch bridges.

We still need host-local and loopback CNI plugins:

git clone --depth 1 --single-branch --branch master https://github.com/containernetworking/plugins.git containernetworking/plugins
pushd containernetworking/plugins/
./build_linux.sh
sudo mkdir -p /opt/cni/bin
sudo install -o root -g root -m 0755 ./bin/host-local ./bin/loopback /opt/cni/bin/
popd

Build and install ovs-cni plugin from Kubevirt:

git clone --depth 1 --single-branch --branch master https://github.com/kubevirt/ovs-cni.git
pushd ovs-cni/
sed -i.orig 's/amd64/ppc64le/g' Makefile
sed -i.orig 's/amd64/ppc64le/g' hack/install-go.sh
make build-plugin
sudo mkdir -p /opt/cni/bin
sudo install -o root -g root -m 0755 cmd/plugin/plugin /opt/cni/bin/ovs
popd

Remove CRI-O bridge settings:

sudo rm -f /etc/cni/net.d/10-crio-bridge.conf /etc/cni/net.d/100-crio-bridge.conf  # name depends on how CRI-O was installed

Set ovs as the default CNI provider:

cat << EOF | sudo tee /etc/cni/net.d/000-ovs.conf
{
    "cniVersion": "0.3.1",
    "name": "ovs",
    "type": "ovs",
    "bridge": "ovscnibr0",
    "isGateway": true,
    "ipMasq": true,
    "hairpinMode": true,
    "ipam": {
        "type": "host-local",
        "routes": [
            { "dst": "0.0.0.0/0" },
            { "dst": "1100:200::1/24" }
        ],
        "ranges": [
            [{ "subnet": "10.96.0.0/16" }],
            [{ "subnet": "1100:200::/24" }]
        ]
    }
}
EOF

Install and enable Open vSwitch:

sudo dnf install -y openvswitch
sudo systemctl enable --now openvswitch

Setup OVS bridge interface:

sudo ovs-vsctl add-br ovscnibr0
sudo ifconfig ovscnibr0 10.96.0.1 netmask 255.255.0.0 up
ovs-vsctl show

Restart the services so the new configuration takes effect:

sudo systemctl restart crio
sudo systemctl restart kubelet

Scale up tuning

Tune the system to allow more inotify watchers per user and instruct the kernel not to swap:

cat << EOF | sudo tee /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-ip6tables=1
net.bridge.bridge-nf-call-iptables=1
net.ipv4.ip_forward=1
fs.inotify.max_user_watches=999999
vm.swappiness=0
EOF

sudo sysctl --system
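
Read the values back to confirm they are in effect:

sysctl fs.inotify.max_user_watches vm.swappiness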

Tune the kubelet settings to increase the number of allowed pods per node and per CPU, and disable the limit on events per second:

cat << EOF | sudo tee /etc/sysconfig/kubelet
KUBELET_EXTRA_ARGS="--fail-swap-on=false --event-qps=0 --kube-api-qps=999 --max-pods=9999 --pods-per-core=99"
EOF

sudo systemctl restart kubelet
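
You can inspect the running kubelet command line to confirm the extra flags were picked up:

pgrep -a kubelet | tr ' ' '\n' | grep -E 'max-pods|pods-per-core|event-qps'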

Single node

This customization allows pods to get scheduled on the same node where control plane pods are running:

kubectl taint nodes --all node-role.kubernetes.io/master:NoSchedule-

Increase the number of ports in a Linux bridge

Large deployments can exhaust the number of ports in a bridge. journalctl -u crio might give you an indication that the bridge cannot take more ports:

Error while adding pod to CNI network "crio": failed to connect "veth1315c848" to bridge cni0: exchange full

To increase the number of ports in a bridge to 16384, set BR_PORT_BITS to 14 in the Linux source net/bridge/br_private.h:

diff --git a/net/bridge/br_private.h b/net/bridge/br_private.h
index d7d167e10b70..1a7f67f31f52 100644
--- a/net/bridge/br_private.h
+++ b/net/bridge/br_private.h
@@ -24,7 +24,7 @@

 #define BR_HOLD_TIME (1*HZ)

-#define BR_PORT_BITS   10
+#define BR_PORT_BITS   14
 #define BR_MAX_PORTS   (1<<BR_PORT_BITS)

 #define BR_MULTICAST_DEFAULT_HASH_MAX 4096
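
After changing the constant, the bridge module has to be rebuilt and reloaded. A minimal sketch, assuming an already-configured kernel source tree that matches your running kernel (rebooting into a rebuilt kernel is often simpler than swapping out a bridge module that is in use):

make -j"$(nproc)" M=net/bridge modules
sudo cp net/bridge/bridge.ko /lib/modules/"$(uname -r)"/kernel/net/bridge/
sudo depmod -a
# reloading fails while bridges still exist; reboot if in doubt
sudo modprobe -r bridge && sudo modprobe bridge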

Increase the number of entries in the ARP table

Large deployments can overflow the ARP table. In the console, you might see a message like this one:

neighbour: ndisc_cache: neighbor table overflow!

To increase the number of entries in the ARP table, tweak its garbage collector thresholds for IPv4 and IPv6:

cat << EOF | sudo tee /etc/sysctl.d/k8s-arp.conf
net.ipv4.neigh.default.gc_thresh2=1024
net.ipv4.neigh.default.gc_thresh3=2048
net.ipv6.neigh.default.gc_thresh2=1024
net.ipv6.neigh.default.gc_thresh3=2048
EOF

Apply changes:

sudo sysctl --system
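
Verify the new neighbor table thresholds:

sysctl net.ipv4.neigh.default.gc_thresh3 net.ipv6.neigh.default.gc_thresh3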