Where we left off in the previous episode, we had learned about pods, multiple containers and volumes. We'll now cover some slightly more advanced topics in Kubernetes, related to application productionization, deployment and scaling.
Having already learned about Pods and how to create them, you may be struck by an urge to create many, many pods. Please do! But eventually you will need a system to organize these pods into groups. The system for achieving this in Kubernetes is Labels. Labels are key-value pairs that are attached to each API object in Kubernetes. Label selectors can be passed along with a RESTful list request to the apiserver to retrieve a list of objects which match that label selector. For example:
cluster/kubecfg.sh -l name=nginx list pods
This lists all pods whose name label matches 'nginx'. Labels are discussed in detail elsewhere, but they are a core concept for two additional building blocks of Kubernetes: Replication Controllers and Services.
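To make the matching rule concrete, here is a minimal Go sketch of how a label selector could filter a list of objects. The `Labels` and `Pod` types and the `matches` and `selectPods` helpers are hypothetical names invented for this illustration; they are not the apiserver's actual implementation, and the sketch assumes only simple equality selectors like `name=nginx`.

```go
package main

import "fmt"

// Labels is a hypothetical type for this sketch: plain key-value pairs.
type Labels map[string]string

// Pod is a stand-in for any API object that carries labels.
type Pod struct {
	ID     string
	Labels Labels
}

// matches reports whether every key-value pair in the selector
// is also present in the object's labels.
func matches(selector, labels Labels) bool {
	for k, v := range selector {
		if got, ok := labels[k]; !ok || got != v {
			return false
		}
	}
	return true
}

// selectPods returns the pods whose labels satisfy the selector.
func selectPods(pods []Pod, selector Labels) []Pod {
	var out []Pod
	for _, p := range pods {
		if matches(selector, p.Labels) {
			out = append(out, p)
		}
	}
	return out
}

func main() {
	pods := []Pod{
		{ID: "web-1", Labels: Labels{"name": "nginx"}},
		{ID: "db-1", Labels: Labels{"name": "redis"}},
		{ID: "web-2", Labels: Labels{"name": "nginx", "env": "test"}},
	}
	// Equivalent in spirit to listing pods with -l name=nginx.
	for _, p := range selectPods(pods, Labels{"name": "nginx"}) {
		fmt.Println(p.ID)
	}
}
```

Note that an object may carry extra labels (such as `env` above) and still match; the selector only constrains the keys it names.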
OK, now you have an awesome, multi-container, labelled pod and you want to use it to build an application. You might be tempted to just start building a whole bunch of individual pods, but if you do that, a whole host of operational concerns pops up. For example: how will you scale the number of pods up or down, and how will you ensure that all pods are homogeneous?
Replication controllers are the objects to answer these questions. A replication controller combines a template for pod creation (a "cookie-cutter" if you will) and a number of desired replicas into a single API object. The replication controller also contains a label selector that identifies the set of objects managed by the replication controller. The replication controller constantly measures the size of this set relative to the desired size, and takes action by creating or deleting pods. The design of replication controllers is discussed in detail elsewhere.
An example replication controller that instantiates two pods running nginx looks like:
id: nginxController
apiVersion: v1beta1
kind: ReplicationController
desiredState:
  replicas: 2
  # replicaSelector identifies the set of Pods that this
  # replicaController is responsible for managing
  replicaSelector:
    name: nginx
  # podTemplate defines the 'cookie cutter' used for creating
  # new pods when necessary
  podTemplate:
    desiredState:
      manifest:
        version: v1beta1
        id: nginx
        containers:
          - name: nginx
            image: dockerfile/nginx
            ports:
              - containerPort: 80
    # Important: these labels need to match the selector above
    # The api server enforces this constraint.
    # Labels are key-value pairs, so they are written as a map.
    labels:
      name: nginx
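The "measure the set, then create or delete pods" behaviour of a replication controller can be sketched in a few lines of Go. This is an illustration under stated assumptions, not the real controller: the `Controller` and `Pod` types, the `Reconcile` method and the ID-naming scheme are all hypothetical, and a real controller would talk to the apiserver rather than edit a slice in memory.

```go
package main

import "fmt"

// Pod is a stand-in for a pod object (hypothetical type for this sketch).
type Pod struct {
	ID     string
	Labels map[string]string
}

// Controller mirrors the three parts described above: a desired replica
// count, a label selector, and a pod template.
type Controller struct {
	Replicas int
	Selector map[string]string
	Template Pod // its labels are expected to match Selector
	next     int // counter used to invent unique pod IDs
}

// matches reports whether labels satisfy every pair in the selector.
func matches(selector, labels map[string]string) bool {
	for k, v := range selector {
		if labels[k] != v {
			return false
		}
	}
	return true
}

// Reconcile compares the managed set with the desired size and returns
// the pod list after creating or deleting pods to close the gap.
func (c *Controller) Reconcile(pods []Pod) []Pod {
	var managed, others []Pod
	for _, p := range pods {
		if matches(c.Selector, p.Labels) {
			managed = append(managed, p)
		} else {
			others = append(others, p)
		}
	}
	// Too few: stamp out new pods from the template.
	for len(managed) < c.Replicas {
		c.next++
		managed = append(managed, Pod{
			ID:     fmt.Sprintf("%s-%d", c.Template.ID, c.next),
			Labels: c.Template.Labels, // shared map; fine for a sketch
		})
	}
	// Too many: drop the surplus.
	if len(managed) > c.Replicas {
		managed = managed[:c.Replicas]
	}
	return append(others, managed...)
}

func main() {
	c := &Controller{
		Replicas: 2,
		Selector: map[string]string{"name": "nginx"},
		Template: Pod{ID: "nginx", Labels: map[string]string{"name": "nginx"}},
	}
	pods := c.Reconcile(nil) // scale up from zero
	fmt.Println(len(pods))
	c.Replicas = 1
	pods = c.Reconcile(pods) // scale down
	fmt.Println(len(pods))
}
```

Pods that do not match the selector are left alone, which is why the template's labels must match the selector: otherwise the controller would never count the pods it creates.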
Once you have a replicated set of pods, you need an abstraction that enables connectivity between the layers of your application. For example, if you have a replication controller managing your backend jobs, you don't want to have to reconfigure your front-ends whenever you re-scale your backends. Likewise, if the pods in your backends are scheduled (or rescheduled) onto different machines, you shouldn't have to reconfigure your front-ends. In Kubernetes the Service API object achieves these goals. A Service combines an IP address and a label selector to form a simple, static rallying point for connecting to a micro-service in your application.
For example, here is a service that balances across the pods created in the previous nginx replication controller example:
kind: Service
apiVersion: v1beta1
# must be a DNS compatible name
id: nginx-example
# the port that this service should serve on
port: 8000
# just like the selector in the replication controller,
# but this time it identifies the set of pods to load balance
# traffic to.
selector:
  name: nginx
# the container port on each pod to connect to, can be a name
# (e.g. 'www') or a number (e.g. 80)
containerPort: 80
When created, each service is assigned a unique IP address. This address is tied to the lifespan of the Service, and will not change while the Service is alive. Pods can be configured to talk to the service, and know that communication to the service will be automatically load-balanced out to some pod that is a member of the set identified by the label selector in the Service. Services are described in detail elsewhere.
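The idea that a Service is a fixed entry point whose traffic is spread over whichever pods currently match its selector can be sketched as follows. The `Service` and `Pod` types and the `Pick` method are hypothetical, and round-robin is an assumption made for the example: the text above only says traffic is load-balanced to "some pod" in the set.

```go
package main

import "fmt"

// Pod is a stand-in for a running pod (hypothetical type for this sketch).
type Pod struct {
	ID     string
	Labels map[string]string
}

// Service pairs a label selector with a cursor for round-robin balancing.
type Service struct {
	Selector map[string]string
	next     int
}

// matches reports whether labels satisfy every pair in the selector.
func matches(selector, labels map[string]string) bool {
	for k, v := range selector {
		if labels[k] != v {
			return false
		}
	}
	return true
}

// Pick chooses the backend for the next connection. The member set is
// recomputed on every call, so pods may come and go between calls
// without the caller being reconfigured.
func (s *Service) Pick(pods []Pod) (Pod, bool) {
	var backends []Pod
	for _, p := range pods {
		if matches(s.Selector, p.Labels) {
			backends = append(backends, p)
		}
	}
	if len(backends) == 0 {
		return Pod{}, false
	}
	p := backends[s.next%len(backends)]
	s.next++
	return p, true
}

func main() {
	pods := []Pod{
		{ID: "nginx-1", Labels: map[string]string{"name": "nginx"}},
		{ID: "redis-1", Labels: map[string]string{"name": "redis"}},
		{ID: "nginx-2", Labels: map[string]string{"name": "nginx"}},
	}
	svc := &Service{Selector: map[string]string{"name": "nginx"}}
	for i := 0; i < 3; i++ {
		if p, ok := svc.Pick(pods); ok {
			fmt.Println(p.ID)
		}
	}
}
```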
When I write code it never crashes, right? Sadly the Kubernetes issues list indicates otherwise...
Rather than trying to write bug-free code, a better approach is to use a management system to perform periodic health checking and repair of your application. That way, a system outside of your application itself is responsible for monitoring the application and taking action to fix it. It's important that the system be outside of the application: if the health checking agent is part of your application and your application fails, the agent may fail as well, and you'll never know. In Kubernetes, the health check monitor is the Kubelet agent.
The simplest form of health-checking is just process level health checking. The Kubelet constantly asks the Docker daemon if the container process is still running, and if not, the container process is restarted. In all of the Kubernetes examples you have run so far, this health checking was actually already enabled. It's on for every single container that runs in Kubernetes.
However, in many cases, this low-level health checking is insufficient. Consider for example, the following code:
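var lockOne, lockTwo sync.Mutex

// The goroutine takes lockOne first, then lockTwo...
go func() {
	lockOne.Lock()
	lockTwo.Lock()
	// ...
}()

// ...while this code takes them in the opposite order, so each side
// can end up holding one lock and waiting forever for the other.
lockTwo.Lock()
lockOne.Lock()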
This is a classic example of a problem in computer science known as "Deadlock". From Docker's perspective your application is still operating, the process is still running, but from your application's perspective, your code is locked up, and will never respond correctly.
To address this problem, Kubernetes supports user-implemented application health checks. These checks are performed by the Kubelet to ensure that your application is operating correctly for a definition of "correctly" that you provide.
Currently, there are three types of application health checks that you can choose from:
- HTTP Health Checks - The Kubelet will call a web hook. If it returns a status code between 200 and 399, it is considered a success; otherwise it is considered a failure.
- Container Exec - The Kubelet will execute a command inside your container. If it returns "ok" it will be considered a success.
- TCP Socket - The Kubelet will attempt to open a socket to your container. If it can establish a connection, the container is considered healthy; if it can't, it is considered a failure.
In all cases, if the Kubelet discovers a failure, the container is restarted.
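The HTTP and TCP checks described in the list above can be approximated in Go as follows. This is a sketch of the success rules only, not the Kubelet's code: `httpStatusHealthy`, `probeHTTP` and `probeTCP` are hypothetical helper names, the timeouts are arbitrary, and the exec check is left out because it needs a container to run in. Redirects are deliberately not followed so that a 3xx response is judged by its own status code.

```go
package main

import (
	"fmt"
	"net"
	"net/http"
	"time"
)

// httpStatusHealthy applies the rule from the list above:
// any status code from 200 through 399 counts as success.
func httpStatusHealthy(code int) bool {
	return code >= 200 && code <= 399
}

// probeHTTP performs one GET and reports whether the response is healthy.
func probeHTTP(url string, timeout time.Duration) bool {
	client := &http.Client{
		Timeout: timeout,
		// Return the first response as-is instead of following redirects.
		CheckRedirect: func(req *http.Request, via []*http.Request) error {
			return http.ErrUseLastResponse
		},
	}
	resp, err := client.Get(url)
	if err != nil {
		return false
	}
	defer resp.Body.Close()
	return httpStatusHealthy(resp.StatusCode)
}

// probeTCP reports whether a TCP connection can be established.
func probeTCP(addr string, timeout time.Duration) bool {
	conn, err := net.DialTimeout("tcp", addr, timeout)
	if err != nil {
		return false
	}
	conn.Close()
	return true
}

func main() {
	// Start a throwaway local server so the sketch runs without a cluster.
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		panic(err)
	}
	defer ln.Close()

	mux := http.NewServeMux()
	mux.HandleFunc("/healthz", func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(http.StatusOK)
	})
	go http.Serve(ln, mux)

	addr := ln.Addr().String()
	fmt.Println("http healthy:", probeHTTP("http://"+addr+"/healthz", time.Second))
	fmt.Println("tcp healthy:", probeTCP(addr, time.Second))
}
```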
The container health checks are configured in the "livenessProbe" section of your container config. There you can also specify an "initialDelaySeconds" value, a grace period between container start and the first health check, to give your container time to perform any necessary initialization.
Here is an example config for a pod with an HTTP health check:
kind: Pod
apiVersion: v1beta1
desiredState:
  manifest:
    version: v1beta1
    id: php
    containers:
      - name: nginx
        image: dockerfile/nginx
        ports:
          - containerPort: 80
        # defines the health checking
        livenessProbe:
          # turn on application health checking
          enabled: true
          type: http
          # length of time to wait for a pod to initialize
          # after pod startup, before applying health checking
          initialDelaySeconds: 30
          # an http probe
          httpGet:
            path: /_status/healthz
            port: 8080
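On the application side, something has to answer that HTTP probe. One possible approach, sketched here in Go, is a heartbeat: the worker records a timestamp whenever it makes progress, and the health endpoint fails once the timestamp grows stale, which is what would happen if the worker deadlocked. The names `beat`, `statusFor` and `healthz` and the 10-second threshold are assumptions made for this example; only the `/_status/healthz` path and port 8080 come from the config above.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync/atomic"
	"time"
)

// lastBeat is the Unix-nanosecond time of the worker's latest heartbeat.
var lastBeat int64

// beat records that the worker made progress just now.
func beat() {
	atomic.StoreInt64(&lastBeat, time.Now().UnixNano())
}

// statusFor maps the age of the last heartbeat to an HTTP status code:
// 200 while the heartbeat is fresh, 503 once it is older than maxAge.
func statusFor(age, maxAge time.Duration) int {
	if age > maxAge {
		return http.StatusServiceUnavailable
	}
	return http.StatusOK
}

// healthz is the handler the liveness probe would call.
func healthz(w http.ResponseWriter, r *http.Request) {
	age := time.Since(time.Unix(0, atomic.LoadInt64(&lastBeat)))
	w.WriteHeader(statusFor(age, 10*time.Second))
}

func main() {
	beat() // a real worker would call beat() on every loop iteration

	mux := http.NewServeMux()
	mux.HandleFunc("/_status/healthz", healthz)

	// A real container would call http.ListenAndServe(":8080", mux);
	// a throwaway test server keeps this sketch self-terminating.
	srv := httptest.NewServer(mux)
	defer srv.Close()

	resp, err := http.Get(srv.URL + "/_status/healthz")
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()
	fmt.Println(resp.StatusCode)
}
```

Because the handler runs in its own goroutine, it keeps answering even when the worker is stuck, so the probe sees the failure that process-level checking would miss.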
For a complete application see the guestbook example.