diff --git a/docs/content/en/docs/implementing/assets/dynatrace_dora_dashboard.png b/docs/content/en/docs/implementing/assets/dynatrace_dora_dashboard.png
new file mode 100644
index 0000000000..3318a4303f
Binary files /dev/null and b/docs/content/en/docs/implementing/assets/dynatrace_dora_dashboard.png differ
diff --git a/docs/content/en/docs/implementing/otel.md b/docs/content/en/docs/implementing/otel.md
index 88037c198e..2a03de325f 100644
--- a/docs/content/en/docs/implementing/otel.md
+++ b/docs/content/en/docs/implementing/otel.md
@@ -4,26 +4,177 @@ description: How to standardize access to OpenTelemetry observability data
 weight: 140
 ---
 
+The Keptn Lifecycle Toolkit (KLT) makes any Kubernetes deployment observable.
+In other words, it creates a distributed, end-to-end trace
+of what Kubernetes does in the context of a Deployment.
+To do this,
+Keptn introduces the concept of an `application`,
+which is an abstraction that connects multiple
+Workloads that logically belong together,
+even if they use different deployment strategies.
+
+This means that:
+
+- You can readily see why a deployment takes so long
+  or why it fails, even when using multiple deployment strategies.
+- KLT can capture DORA metrics and expose them as OpenTelemetry metrics.
+
+The observability data is an amalgamation of the following:
+
+- DORA metrics, which are collected out of the box
+  when the Lifecycle Toolkit is enabled
+- OpenTelemetry traces that show
+  everything that happens in the Kubernetes cluster
+- Custom Keptn metrics that you can use to monitor
+  information from all the data providers configured in your cluster
+
+All this information can be displayed with dashboard tools
+such as Grafana.
+
+For an introduction to using OpenTelemetry with Keptn metrics, see the
+[Standardize observability](../getting-started/observability)
+getting started guide.
+
+## DORA metrics
+
+DORA metrics are an industry-standard set of measurements;
+see the following for a description:
+
+- [What are DORA Metrics and Why Do They Matter?](https://codeclimate.com/blog/dora-metrics)
+- [Are you an Elite DevOps Performer?
+  Find out with the Four Keys Project](https://cloud.google.com/blog/products/devops-sre/using-the-four-keys-to-measure-your-devops-performance)
+
+DORA metrics provide information such as:
+
+- How many deployments happened in the last six hours?
+- Time between deployments
+- Deployment time between versions
+- Average time between versions
+
+The Keptn Lifecycle Toolkit starts collecting these metrics
+as soon as you apply
+[basic annotations](integrate/#basic-annotations)
+to the Workload resource.
+Metrics are collected only for the resources
+that are annotated.
+
+To view DORA metrics, run the following command:
+
+```shell
+kubectl port-forward -n keptn-lifecycle-toolkit-system \
+    svc/lifecycle-operator-metrics-service 2222
+```
+
+Then view the metrics at:
+
+```shell
+http://localhost:2222/metrics
+```
+
+DORA metrics can be displayed on Grafana
+or any dashboard application you choose.
+
+## OpenTelemetry
+
+### Requirements for OpenTelemetry
+
 To access OpenTelemetry metrics with the Keptn Lifecycle Toolkit,
-you must:
+you must have the following on your cluster:
 
-- Install an OpenTelemetry collector on your cluster.
+- An OpenTelemetry collector.
   See
   [OpenTelemetry Collector](https://opentelemetry.io/docs/collector/)
   for more information.
+- Prometheus Operator.
+  See [Prometheus Operator Setup](https://github.com/prometheus-operator/kube-prometheus/blob/main/docs/customizing.md).
+- The Prometheus Operator must have the required permissions
+  to watch resources of the `keptn-lifecycle-toolkit-system` namespace (see
+  [Setup for Monitoring other Namespaces](https://prometheus-operator.dev/docs/kube/monitoring-other-namespaces/)).
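+
+As a quick sanity check, you can confirm that these components are
+present before continuing.
+The commands below are a sketch under assumptions: the namespaces
+match the default setups linked above, and the collector carries the
+`app=opentelemetry` label used elsewhere in this guide; adjust both
+to your installation:
+
+```shell
+# Prometheus Operator (assumed to be installed in `monitoring`)
+kubectl get deployments -n monitoring
+
+# OpenTelemetry collector (assumed label and namespace)
+kubectl get pods -l app=opentelemetry -n keptn-lifecycle-toolkit-system
+```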
+
+If you want a dashboard for reviewing metrics and traces,
+you need:
+
+- [Grafana](https://grafana.com/)
+  or the dashboard of your choice.
+  See
+  [Grafana Setup](https://grafana.com/docs/grafana/latest/setup-grafana/).
+
+- [Jaeger](https://jaegertracing.io)
+  or a similar tool if you want traces.
+  See
+  [Jaeger Setup](https://github.com/jaegertracing/jaeger-operator#getting-started).
+
+To install Prometheus into the `monitoring` namespace,
+using the default configuration included with KLT,
+use the following commands.
+Use similar commands if you define a different configuration:
+
+```shell
+kubectl create namespace monitoring
+kubectl apply --server-side -f config/prometheus/setup
+kubectl apply -f config/prometheus/
+```
+
+### Integrate OpenTelemetry into the Keptn Lifecycle Toolkit
+
+To integrate OpenTelemetry into the Keptn Lifecycle Toolkit:
+
 - Apply [basic annotations](../implementing/integrate/#basic-annotations)
   for your `Deployment` resource
   to integrate the Lifecycle Toolkit into your Kubernetes cluster.
+- To expose OpenTelemetry metrics,
+  define a [KeptnConfig](../yaml-crd-ref/config.md) resource
+  that has the `spec.OTelCollectorUrl` field populated
+  with the URL of the OpenTelemetry collector.
+
+The
+[otel-collector.yaml](https://github.com/keptn/lifecycle-toolkit/blob/main/examples/support/observability/config/otel-collector.yaml)
+is the OpenTelemetry manifest file for the PodtatoHead example,
+located in the `config` directory.
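+
+For illustration, a minimal `KeptnConfig` resource pointing KLT at a
+collector service might look like the following sketch.
+The `apiVersion` and the metadata name are assumptions; check the
+[KeptnConfig](../yaml-crd-ref/config.md) reference for the exact
+schema of your KLT version:
+
+```yaml
+apiVersion: options.keptn.sh/v1alpha1
+kind: KeptnConfig
+metadata:
+  name: keptnconfig-sample
+spec:
+  # URL of the OpenTelemetry collector service;
+  # 4317 is the default OTLP gRPC port
+  OTelCollectorUrl: 'otel-collector:4317'
+```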
+
+To deploy and configure the OpenTelemetry collector
+using this manifest, the command is:
+
+```shell
+kubectl apply -f config/otel-collector.yaml \
+    -n keptn-lifecycle-toolkit-system
+```
+
+Use the following command to confirm that the pod
+for the `otel-collector` deployment is up and running:
+
+```shell
+$ kubectl get pods -lapp=opentelemetry \
+    -n keptn-lifecycle-toolkit-system
+
+NAME                              READY   STATUS    RESTARTS   AGE
+otel-collector-6fc4cc84d6-7hnvp   1/1     Running   0          92m
+```
+
+If you want to extend the OTel Collector configuration
+to send your telemetry data to other observability platforms,
+you can edit the Collector `ConfigMap` with the following command:
+
+```shell
+kubectl edit configmap otel-collector-conf \
+    -n keptn-lifecycle-toolkit-system
+```
+
+When the `otel-collector` pod is up and running,
+restart the `keptn-scheduler` and `lifecycle-operator`
+so they can pick up the new configuration:
+
+```shell
+kubectl rollout restart deployment \
+    -n keptn-lifecycle-toolkit-system keptn-scheduler lifecycle-operator
+```
 
 KLT begins to collect OpenTelemetry metrics
 as soon as the `Deployment` resource
 has the basic annotations to integrate KLT in the cluster.
 
-To expose OpenTelemetry metrics,
-define a [KeptnConfig](../yaml-crd-ref/config.md) resource
-that has the `spec.OTelCollectorUrl` field populated
-with the URL of the OpenTelemetry collector.
+## Access Keptn metrics as OpenTelemetry metrics
 
 Keptn metrics can be exposed as OpenTelemetry (OTel) metrics
 via port `9999` of the KLT metrics-operator.
@@ -35,7 +186,3 @@ kubectl port-forward deployment/metrics-operator 9999 -n keptn-lifecycle-toolkit
 ```
 
 You can access the metrics from your browser at: `http://localhost:9999`
-
-For an introduction to using OpenTelemetry with Keptn metrics, see the
-[Standardize observability](../getting-started/observability)
-getting started guide.
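+
+The metrics are served in the Prometheus text exposition format.
+As a sketch, you can list only the Keptn metrics from the command line;
+the exact output depends on your workloads, but all of the Keptn metric
+names share the `keptn` prefix:
+
+```shell
+# Fetch the exposed metrics and keep only the `keptn`-prefixed ones
+curl -s http://localhost:9999/metrics | grep '^keptn'
+```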
diff --git a/docs/content/en/docs/install/k8s.md b/docs/content/en/docs/install/k8s.md
index a1a5195fda..1a42f76c1d 100644
--- a/docs/content/en/docs/install/k8s.md
+++ b/docs/content/en/docs/install/k8s.md
@@ -81,17 +81,16 @@ Your cluster should include the following:
   Alternatively, KLT also works with just `kubctl apply` for deployment.
 
 * If you want to use the standardized observability feature,
-  install an OpenTelemetry collector on your cluster.
-  See
-  [OpenTelemetry Collector](https://opentelemetry.io/docs/collector/)
-  for more information.
+  you must have an OpenTelemetry collector
+  and the Prometheus Operator installed on your cluster.
 
-* If you want a dashboard for reviewing metrics and traces,
-  Install [Grafana](https://grafana.com/)
-  or the dashboard of your choice.
+  If you want a dashboard for reviewing metrics and traces,
+  install Grafana or the dashboard of your choice.
 
-* For traces, install [Jaeger](https://jaegertracing.io)
-  or a similar tool.
+  For traces, install Jaeger or a similar tool.
+
+  For more information, see
+  [Requirements for OpenTelemetry](../implementing/otel.md/#requirements-for-opentelemetry).
 
 Also note that the Keptn Lifecycle Toolkit includes
 a light-weight cert-manager that, by default, is installed
diff --git a/examples/support/observability/README.md b/examples/support/observability/README.md
index ec6e9579c1..14c4ca88eb 100644
--- a/examples/support/observability/README.md
+++ b/examples/support/observability/README.md
@@ -1,97 +1,43 @@
 # Sending Traces and Metrics to the OpenTelemetry Collector
 
-In this example, we will show you an example configuration for enabling the operator to send OpenTelemetry traces and
-metrics to the [OpenTelemetry Collector](https://github.com/open-telemetry/opentelemetry-collector).
-The Collector will then be used to forward the gathered data to [Jaeger](https://www.jaegertracing.io)
+In this example, we show a configuration
+that enables the operator to send OpenTelemetry traces and metrics to the
+[OpenTelemetry Collector](https://github.com/open-telemetry/opentelemetry-collector).
+The Collector will then be used to forward the gathered data to
+[Jaeger](https://www.jaegertracing.io)
 and [Prometheus](https://prometheus.io).
 
-The application deployed uses an example of pre-Deployment Evaluation based on prometheus metrics.
+The deployed application uses an example of pre-Deployment Evaluation
+based on Prometheus metrics.
 
-## TL;DR
+- To install the whole demo, including the Keptn Lifecycle Toolkit,
+  execute the following command:
 
-* You can install the whole demo including Keptn-lifecycle-toolkit using: `make install`
-* Deploy the PodTatoHead Demo Application: `make deploy-podtatohead`
-* Afterward, see it in action as defined here: [OpenTelemetry in Action](#seeing-the-opentelemetry-collector-in-action)
+  ```shell
+  make install
+  ```
 
-## Prerequisites
+- Deploy the PodTatoHead Demo Application: `make deploy-podtatohead`
+- Afterward, see it in action as defined in
+  [OpenTelemetry in Action](#seeing-the-opentelemetry-collector-in-action)
 
-This tutorial assumes, that you already installed the Keptn Lifecycle Controller (
-see ).
-The installation instructions can be
-found [here](https://github.com/keptn/lifecycle-toolkit#deploy-the-latest-release).
-As well, you have both Jaeger and the Prometheus Operator installed in your Cluster.
-Also, please ensure that the Prometheus Operator has the required permissions to watch resources of
-the `keptn-lifecycle-toolkit-system` namespace (
-see as a reference).
-For setting up both Jaeger and Prometheus, please refer to their docs: - -* [Jaeger Setup](https://github.com/jaegertracing/jaeger-operator) -* [Prometheus Operator Setup](https://github.com/prometheus-operator/kube-prometheus/blob/main/docs/customizing.md) - -If you don't have an already existing installation of -Jaeger [manifest](https://github.com/jaegertracing/jaeger-operator/releases/download/v1.38.0/jaeger-operator.yaml) or -Prometheus, you can run these commands to -have a basic installation up and running. - -```shell -# Install Jaeger into the observability namespace and the Jaeger resource into the lifecycle-toolkit namespace -kubectl create namespace observability -kubectl apply -f https://github.com/jaegertracing/jaeger-operator/releases/download/v1.38.0/jaeger-operator.yaml -n observability -kubectl apply -f config/jaeger.yaml -n keptn-lifecycle-toolkit-system - -# Install Prometheus -kubectl create namespace monitoring -kubectl apply --server-side -f config/prometheus/setup -kubectl apply -f config/prometheus/ -``` - -With these commands, the Jaeger and Prometheus Operator will be installed in the `observability` and `monitoring` -namespaces, respectively. - -## Configuring the OpenTelemetry Collector and Prometheus ServiceMonitor - -Once Jaeger and Prometheus are installed, you can deploy and configure the OpenTelemetry collector using the manifests -in the `config` directory: - -```shell -kubectl apply -f config/otel-collector.yaml -n keptn-lifecycle-toolkit-system -``` - -Also, please ensure that the `OTEL_COLLECTOR_URL` env vars of both the `lifecycle-operator`, -as well as the `keptn-scheduler` deployments are set appropriately. -By default, they are set to `otel-collector:4317`, which should be the correct value for this tutorial. 
-
-Eventually, there should be a pod for the `otel-collector` deployment up and running:
-
-```shell
-$ kubectl get pods -lapp=opentelemetry -n keptn-lifecycle-toolkit-system
-
-NAME                              READY   STATUS    RESTARTS   AGE
-otel-collector-6fc4cc84d6-7hnvp   1/1     Running   0          92m
-```
-
-If you want to extend the OTel Collector configuration to send your telemetry data to other Observability platform, you
-can edit the Collector ConfigMap with the following command:
-
-```shell
-kubectl edit configmap otel-collector-conf -n keptn-lifecycle-toolkit-system
-```
-
-When the `otel-collector` pod is up and running, restart the `keptn-scheduler` and `lifecycle-operator` so they can
-pick up the new configuration.
-
-```shell
-kubectl rollout restart deployment -n keptn-lifecycle-toolkit-system keptn-scheduler lifecycle-operator
-```
+For information about installing and configuring
+the required software, see
+[OpenTelemetry observability](../../../docs/content/en/docs/implementing/otel.md/)
+in the documentation.
 
 ## Seeing the OpenTelemetry Collector in action
 
-After everything has been set up, use the lifecycle operator to deploy a workload (e.g. using the `single-service`
+After everything has been set up, use the lifecycle operator
+to deploy a workload (e.g. using the `single-service`
 or `podtato-head` example in the `examples` folder).
-To showcase pre-Evaluation checks we created a new version of podtato-head app in
+To showcase pre-Evaluation checks,
+we created a new version of the podtato-head app in
 assets/podtetohead-deployment-evaluation.
-You can run ``make deploy-podtatohead`` to check pre-Evaluations of prometheus metrics both at app and workload instance
-level.
-Once an example has been deployed, you can view the generated traces in Jaeger.
+You can run ``make deploy-podtatohead``
+to check pre-Evaluations of Prometheus metrics
+at both the app and workload instance level.
+Once an example has been deployed,
+you can view the generated traces in Jaeger.
To do so, please create a port-forward
@@ -99,9 +45,10 @@ for the `jaeger-query` service:
 kubectl port-forward -n keptn-lifecycle-toolkit-system svc/jaeger-query 16686
 ```
 
-Afterwards, you can view the Jaeger UI in the browser at [localhost:16686](http://localhost:16686).
-There you should see
-the traces generated by the lifecycle controller, which should look like this:
+Afterwards, you can view the Jaeger UI in the browser at
+[localhost:16686](http://localhost:16686).
+There you should see the traces generated by the lifecycle controller,
+which should look like this:
 
 ### Traces overview
 
 ![Screenshot of a trace in Jaeger](./assets/trace_detail.png)
 
-In Prometheus, do a port forward to the prometheus service inside your cluster (the exact name and namespace of the
-prometheus service will depend on your Prometheus setup - we are using the defaults that come with the example of the
-Prometheus Operator tutorial).
+In Prometheus, do a port-forward to the Prometheus service
+inside your cluster (the exact name and namespace of the
+Prometheus service will depend on your Prometheus setup;
+we are using the defaults that come with
+the example of the Prometheus Operator tutorial).
 
 ```shell
 kubectl -n monitoring port-forward svc/prometheus-k8s 9090
 ```
 
-Afterwards, you can view the Prometheus UI in the browser at [localhost:9090](http://localhost:9090).
-There, in
-the [Targets](http://localhost:9090/targets?search=) section, you should see an entry for the otel-collector:
+Afterwards, you can view the Prometheus UI in the browser at
+[localhost:9090](http://localhost:9090).
+There, in the
+[Targets](http://localhost:9090/targets?search=) section,
+you should see an entry for the otel-collector:
 
 ![Screenshot of a target in Prometheus](./assets/prometheus_targets.png)
 
-Also, in the [Graph](http://localhost:9090/graph?g0.expr=&g0.tab=1&g0.stacked=0&g0.show_exemplars=0&g0.range_input=1h)
-section, you can retrieve metrics reported by the Keptn Lifecycle Controller (all of the available metrics start with
-the `keptn` prefix):
+Also, in the
+[Graph](http://localhost:9090/graph?g0.expr=&g0.tab=1&g0.stacked=0&g0.show_exemplars=0&g0.range_input=1h)
+section, you can retrieve metrics reported by the Keptn Lifecycle Controller
+(all of the available metrics start with the `keptn` prefix):
 
 ![Screenshot of the auto-complete menu in a Prometheus query](./assets/metrics.png)
 
-To view the exported metrics in Grafana, we have provided dashboards which have been automatically installed with this
-example.
-To display them, please first create a port-forward for the `grafana` service in the `monitoring` namespace:
+To view the exported metrics in Grafana,
+we have provided dashboards
+that have been automatically installed with this example.
+To display them,
+create a port-forward for the `grafana` service
+in the `monitoring` namespace:
 
 ```shell
 make port-forward-grafana
 ```
 
-Now, you should be able to see it in the [Grafana UI](http://localhost:3000/d/wlo2MpIVk/keptn-lifecycle-toolkit-metrics)
+Now, you should be able to see the dashboards in the
+[Grafana UI](http://localhost:3000/d/wlo2MpIVk/keptn-lifecycle-toolkit-metrics)
 under `Dashboards > General`.
 
 ![Screenshot of a dashboard in Grafana](./assets/grafana_dashboard.png)