The goal of this toolkit is to introduce a more โcloud-nativeโ approach for pre- and post-deployment, as well as the concept of application health checks. It is an incubating project, under the umbrella of the Keptn Application Lifecycle working group.
Status of the different features:
- Observability: expose OTel metrics and traces of your deployment.
- K8s Custom Metrics: expose your Observability platform via the Custom Metric API.
- Release lifecycle: handle pre- and post-checks of your Application deployment.
- Certificate Manager: automatically configure TLS certificates for secure communication with the Kube API.
The status follows the Kubernetes API versioning schema.
For more info about the features, please refer to our documentation.
Click to watch it on YouTube:
- Kubernetes >=1.24 is needed to deploy the Lifecycle Toolkit
- The Lifecycle Toolkit is currently not compatible with vcluster
Use the following command sequence to install the latest release of the Keptn Lifecycle Toolkit:
helm repo add klt https://charts.lifecycle.keptn.sh
helm repo update
helm upgrade --install keptn klt/klt -n keptn-lifecycle-toolkit-system --create-namespace --wait
For installing the Lifecycle Toolkit via manifests use:
kubectl apply -f https://github.com/keptn/lifecycle-toolkit/releases/download/v0.8.1/manifest.yaml
The Lifecycle Toolkit uses the OpenTelemetry collector to provide a vendor-agnostic implementation of how to receive, process and export telemetry data. To install it, follow their installation instructions. We provide some information about this in our observability example.
The Lifecycle Toolkit includes a Mutating Webhook which requires TLS certificates to be mounted as a volume in its pod. The certificate creation is handled automatically by klt-cert-manager. Versions 0.6.0 and earlier have a hard dependency on the cert-manager. See installation guideline for more info.
The Keptn Lifecycle Toolkit aims to support Cloud Native teams with:
- Pre-requisite evaluation before deploying workloads and applications
- Finding out when an application (not workload) is ready and working
- Checking the Application Health in a declarative (cloud-native) way
- Standardized way for pre- and post-deployment tasks
- Provide out-of-the-box Observability of the deployment cycle
The Keptn Lifecycle Toolkit could be seen as a general purpose and declarative Level 3 operator for your Application. For this reason, the Keptn Lifecycle Toolkit is agnostic to deployment tools that are used and works with any GitOps solution.
The Keptn Lifecycle Toolkit monitors manifests that have been applied against the Kubernetes API and reacts if it finds a workload with special annotations/labels. For this, you should annotate your Workload with (at least) the following annotations:
keptn.sh/app: myAwesomeAppName
keptn.sh/workload: myAwesomeWorkload
keptn.sh/version: myAwesomeWorkloadVersion
Alternatively, you can use Kubernetes Recommended Labels to annotate your workload:
app.kubernetes.io/part-of: myAwesomeAppName
app.kubernetes.io/name: myAwesomeWorkload
app.kubernetes.io/version: myAwesomeWorkloadVersion
In general, the Keptn Annotations/Labels take precedence over the Kubernetes recommended labels. If there is no version annotation/label and there is only one container in the pod, the Lifecycle Toolkit will take the image tag as version ( if it is not "latest").
In case you want to run pre- and post-deployment checks, further annotations are necessary:
keptn.sh/pre-deployment-tasks: verify-infrastructure-problems
keptn.sh/post-deployment-tasks: slack-notification,performance-test
The value of these annotations are Keptn CRDs called KeptnTaskDefinitions. These CRDs contains re-usable "functions" that can executed before and after the deployment. In this example, before the deployment starts, a check for open problems in your infrastructure is performed. If everything is fine, the deployment continues and afterward, a slack notification is sent with the result of the deployment and a pipeline to run performance tests is invoked. Otherwise, the deployment is kept in a pending state until the infrastructure is capable to accept deployments again.
A more comprehensive example can be found in our examples folder where we use Podtato-Head to run some simple pre-deployment checks.
To run the example, use the following commands:
cd ./examples/podtatohead-deployment/
kubectl apply -f .
Afterward, you can monitor the status of the deployment using
kubectl get keptnworkloadinstance -n podtato-kubectl -w
The deployment for a Workload will stay in a Pending
state until the respective pre-deployment check is completed.
Afterward, the deployment will start and when it is Succeeded
, the post-deployment checks will start.
The Keptn Lifecycle Toolkit is composed of the following components:
- Keptn Lifecycle Operator
- Keptn Scheduler
The Keptn Lifecycle Operator contains several controllers for Keptn CRDs and a Mutating Webhook. The Keptn Scheduler ensures that Pods are started only after the pre-deployment checks have finished.
A Kubernetes Manifest, which is annotated with Keptn specific annotations, gets applied to the Kubernetes Cluster. Afterward, the Keptn Scheduler gets injected (via Mutating Webhook), and Kubernetes Events for Pre-Deployment are sent to the event stream. The Event Controller watches for events and triggers a Kubernetes Job to fullfil the Pre-Deployment. After the Pre-Deployment has finished, the Keptn Scheduler schedules the Pod to be deployed. The KeptnApp and KeptnWorkload Controllers watch for the workload resources to finish and then generate a Post-Deployment Event. After the Post-Deployment checks, SLOs can be validated using an interface for retrieving SLI data from a provider, e.g, Prometheus. Finally, Keptn Lifecycle Toolkit exposes Metrics and Traces of the whole Deployment cycle with OpenTelemetry.
The following sections will provide insights on each component of the Keptn Lifecycle Toolkit in terms of their purpose, responsibility, and communication with other components. Furthermore, there will be a description on what CRD they monitor and a general overview of their fields.
Annotating a namespace subjects it to the effects of the mutating webhook:
apiVersion: v1
kind: Namespace
metadata:
name: podtato-kubectl
annotations:
keptn.sh/lifecycle-toolkit: "enabled" # this lines tells the webhook to handle the namespace
However, the mutating webhook will modify only resources in the annotated namespace that have Keptn annotations. When the webhook receives a request for a new pod, it will look for the workload annotations:
keptn.sh/workload: "some-workload-name"
The mutation consists in changing the scheduler used for the deployment with the Keptn Scheduler. Webhook then creates a workload and app resource per annotated resource. You can also specify a custom app definition with the annotation:
keptn.sh/app: "your-app-name"
In this case the webhook will not generate an app, but it will expect that the user will provide one.
The webhook should be as fast as possible and should not create/change any resource.
Additionally, it will compute a version string, using a hash function that takes certain properties of the pod as
parameters
(e.g. the images of its containers).
Next, it will look for an existing instance of a Workload CRD
for the given workload name:
- If it finds the
Workload
, it will update its version according to the previously computed version string. In addition, it will include a reference to the ReplicaSet UID of the pod (i.e. the Pods owner), or the pod itself, if it does not have an owner. - If it does not find a workload instance, it will create one containing the previously computed version string. In addition, it will include a reference to the ReplicaSet UID of the pod (i.e. the Pods owner), or the pod itself, if it does not have an owner.
It will use the following annotations for
the specification of the pre/post deployment checks that should be executed for the Workload
:
keptn.sh/pre-deployment-tasks: task1,task2
keptn.sh/post-deployment-tasks: task1,task2
and for the Evaluations:
keptn.sh/pre-deployment-evaluations: my-evaluation-definition
keptn.sh/post-deployment-evaluations: my-eval-definition
After either one of those actions has been taken, the webhook will set the scheduler of the pod and allow the pod to be scheduled.
After the Webhook mutation, the Keptn-Scheduler will handle the annotated resources. The scheduling flow follows the default scheduler behavior, since it implements a scheduler plugin based on the scheduling framework. For each pod, at the very end of the scheduling cycle, the plugin verifies whether the pre deployment checks have terminated, by retrieving the current status of the WorkloadInstance. Only if that is successful, the pod is bound to a node.
An App contains information about all workloads and checks associated with an application. It will use the following structure for the specification of the pre/post deployment and pre/post evaluations checks that should be executed at app level:
apiVersion: lifecycle.keptn.sh/v1alpha3
kind: KeptnApp
metadata:
name: podtato-head
namespace: podtato-kubectl
spec:
version: "1.3"
revision: 1
workloads:
- name: podtato-head-left-arm
version: 0.1.0
- name: podtato-head-left-leg
version: 1.2.3
preDeploymentTasks:
- pre-deployment-hello
postDeploymentTasks:
- post-deployment-hello
preDeploymentEvaluations:
- my-prometheus-definition
postDeploymentEvaluations:
- my-prometheus-definition
While changes in the workload version will affect only workload checks, a change in the app version will also cause a new execution of app level checks.
A Workload contains information about which tasks should be performed during the preDeployment
as well as
the postDeployment
phase of a deployment.
In its state it keeps track of the currently active Workload Instances
, which are responsible
for doing those checks for
a particular instance of a Deployment/StatefulSet/ReplicaSet (e.g. a Deployment of a certain version).
A Workload Instance is responsible for executing the pre- and post deployment checks of a workload.
In its state, it
keeps track of the current status of all checks, as well as the overall state of
the Pre Deployment phase, which can be used by the scheduler to tell that a pod can be allowed to be placed on a node.
Workload Instances have a reference to the respective Deployment/StatefulSet/ReplicaSet, to check if it has reached the
desired state.
If it detects that the referenced object has reached
its desired state (e.g. all pods of a deployment are up and running), it will be able to tell that
a PostDeploymentCheck
can be triggered.
A KeptnTaskDefinition
is a CRD used to define tasks that can be run by the Keptn Lifecycle Toolkit
as part of pre- and post-deployment phases of a deployment.
KeptnTaskDefinition
resource can be created in the namespace where the application is running, or
in the default KLT namespace, which will be the fallback option for the system to search.
The task definition is a Deno script
Please, refer to the function runtime folder for more information about the runtime.
In the future, we also intend to support other runtimes, especially running a container image directly.
A task definition can be configured in three different ways:
- inline
- referring to an HTTP script
- referring to another
KeptnTaskDefinition
An inline KeptnTaskDefinition looks like the following:
apiVersion: lifecycle.keptn.sh/v1alpha3
kind: KeptnTaskDefinition
metadata:
name: deployment-hello
spec:
function:
inline:
code: |
console.log("Deployment Task has been executed");
In the code section, it is possible to define a full-fletched Deno script. A further example, is available here.
To runtime can also fetch the script on the fly from a remote webserver. For this, the CRD should look like the following:
apiVersion: lifecycle.keptn.sh/v1alpha3
kind: KeptnTaskDefinition
metadata:
name: hello-keptn-http
spec:
function:
httpRef:
url: <url>
An example is available here.
Finally, KeptnTaskDefinition
can build on top of other KeptnTaskDefinition
s.
This is a common use case where a general function can be re-used in multiple places with different parameters.
apiVersion: lifecycle.keptn.sh/v1alpha3
kind: KeptnTaskDefinition
metadata:
name: slack-notification-dev
spec:
function:
functionRef:
name: slack-notification
parameters:
map:
textMessage: "This is my configuration"
secureParameters:
secret: slack-token
As you might have noticed, Task Definitions also have the possibility to use input parameters.
The Lifecycle Toolkit passes the values defined inside the map
field as a JSON object.
At the moment, multi-level maps are not supported.
The JSON object can be read through the environment variable DATA
using Deno.env.get("DATA");
.
K8s secrets can also be passed to the function using the secureParameters
field.
Here, the secret
value is the K8s secret name that will be mounted into the runtime and made available to the function
via the environment variable SECURE_DATA
.
A Task is responsible for executing the TaskDefinition of a workload. The execution is done spawning a K8s Job to handle a single Task. In its state, it keeps track of the current status of the K8s Job created.
A KeptnEvaluationDefinition
is a CRD used to define evaluation tasks that can be run by the Keptn Lifecycle Toolkit
as part of pre- and post-analysis phases of a workload or application.
KeptnEvaluationDefinition
resource can be created in the namespace where the application is running, or
in the default KLT namespace, which will be the fallback option for the system to search.
A KeptnEvaluationDefinition looks like the following:
apiVersion: lifecycle.keptn.sh/v1alpha3
kind: KeptnEvaluationDefinition
metadata:
name: my-prometheus-evaluation
spec:
objectives:
- keptnMetricRef:
name: cpu
namespace: ns
evaluationTarget: <20
- keptnMetricRef:
name: memory
namespace: ns
evaluationTarget: >4
A KeptnMetricsProvider
CRD defines the evaluation provider that provides data for the
pre- and post-analysis phases of a Keptn workload or Keptn application.
A KeptnMetricsProvider definition looks like the following:
apiVersion: metrics.keptn.sh/v1alpha2
kind: KeptnMetricsProvider
metadata:
name: prometheus
spec:
targetServer: "http://prometheus-k8s.monitoring.svc.cluster.local:9090"
secretKeyRef:
key: prometheusLoginCredentials
Note The KeptnMetricsProvider is a new resource in KLT 0.7.0. The migration documentation provides information about how to upgrade from 0.6.0 and earlier versions to 0.7.0.
A KeptnMetric
CRD defines the SLI provider with a query and stores metric data fetched from the provider.
Providing the metrics as CRD into a Kubernetes cluster makes this data reusable across multiple projects or workloads.
Furthermore, this allows using multiple observability platforms for different metrics.
A KeptnMetric definition looks like the following:
apiVersion: metrics.keptn.sh/v1alpha2
kind: KeptnMetric
metadata:
name: keptnmetric-sample
namespace: keptn-lifecycle-toolkit-system
spec:
provider:
name: "prometheus"
query: "sum(kube_pod_container_resource_limits{resource='cpu'})"
fetchIntervalSeconds: 5
To use KeptnMetric
as part of your evaluation, you must set the
.spec.objectives[i].keptnMetricRef.name
and .spec.objectives[i].keptnMetricRef.namespace
of
KeptnEvaluationDefiniton
resource to the same value that is stored in .metadata.name
and metadata.namespace
of the KeptnMetric
resource.
Specifying the .spec.objectives[i].keptnMetricRef.namespace
is optional.
If it's not specified, KLT searches for the KeptnMetric
resource in the namespace where KeptnEvaluationDefinition
resource is stored.
If the KeptnMetric
resource cannot be found there, it searches in the default KLT namespace (keptn-lifecycle-toolkit-system
).
Note Please be aware that, if the
.spec.objectives[i].keptnMetricRef.namespace
ofKeptnEvaluationDefinition
resource is specified and theKeptnMetric
resource does not exist in this namespace, the evaluation fails.
The KeptnConfig
is a CRD defines configuration values for the Keptn Lifecycle Toolkit.
Currently, it can be used to configure the URL of the OpenTelemetry collector.
Additionally, it can be used to set the time interval in which automatic app discovery
searches for workloads to put into the same auto-generated KeptnApp
.
When the parameter is not set, the default value is 30 seconds.
A KeptnConfig
looks like the following:
apiVersion: options.keptn.sh/v1alpha1
kind: KeptnConfig
metadata:
name: keptnconfig-sample
spec:
OTelCollectorUrl: 'otel-collector:4317'
keptnAppCreationRequestTimeoutSeconds: 30
The GitHub CLI can be used to download the manifests of the latest CI build.
gh run list --repo keptn/lifecycle-toolkit # find the id of a run
gh run download 3152895000 --repo keptn/lifecycle-toolkit # download the artifacts
kubectl apply -f ./keptn-lifecycle-operator-manifest/release.yaml # install the lifecycle-operator
kubectl apply -f ./scheduler-manifest/release.yaml # install the scheduler
Instead, if you want to build and deploy the operator into your cluster directly from the code, you can type:
RELEASE_REGISTRY=<YOUR_DOCKER_REGISTRY>
# (optional)ARCH=<amd64(default)|arm64v8>
# (optional)CHART_APPVERSION=<YOUR_PREFERRED_TAG (defaulting to current time)>
# Build and deploy the dev images to the current kubernetes cluster
make build-deploy-dev-environment
Please find more information in the LICENSE file.
Made with contrib.rocks.