docs: Add Integrate Prometheus with Telemetry Manager using Alerting ADR #703

Merged: 33 commits on Jan 12, 2024
Commits
081f605 Add prometheus helm values (skhalash, Jan 9, 2024)
689e109 Set up alertmanager wbhook (skhalash, Jan 10, 2024)
bc8526c Query Prometheus alerts (skhalash, Jan 10, 2024)
f96121f WIP: add ADR (skhalash, Jan 11, 2024)
186ef04 Describe problems with direct queries (skhalash, Jan 11, 2024)
83305e7 Finalize the ADR (skhalash, Jan 11, 2024)
7dca036 WIP: describe the PoC (skhalash, Jan 11, 2024)
0fc8ce6 WIP: describe setup (skhalash, Jan 11, 2024)
bb20c15 Revert main.go changes (skhalash, Jan 11, 2024)
f4c52d1 Finalize the PoC doc (skhalash, Jan 11, 2024)
31f352d Revert kustomize (skhalash, Jan 11, 2024)
b8fbe21 Revert rest of code changes (skhalash, Jan 11, 2024)
806641a Minor addition (skhalash, Jan 11, 2024)
423951f Revert go.sum (skhalash, Jan 11, 2024)
c4be632 Merge branch 'main' of github.com:kyma-project/telemetry-manager into… (skhalash, Jan 11, 2024)
631671c Revert prometheus.go (skhalash, Jan 11, 2024)
052c18d Update docs/contributor/arch/003-integrate-prometheus-with-telemetry-… (Jan 12, 2024)
1f6648c Update docs/contributor/arch/003-integrate-prometheus-with-telemetry-… (Jan 12, 2024)
afa8593 Update docs/contributor/arch/003-integrate-prometheus-with-telemetry-… (Jan 12, 2024)
70de51d Update docs/contributor/arch/003-integrate-prometheus-with-telemetry-… (Jan 12, 2024)
28c0345 Update docs/contributor/arch/003-integrate-prometheus-with-telemetry-… (Jan 12, 2024)
4a29764 Update docs/contributor/pocs/integrate-prometheus-with-telemetry-mana… (Jan 12, 2024)
c5c35cc Update docs/contributor/arch/003-integrate-prometheus-with-telemetry-… (Jan 12, 2024)
cef7518 Update docs/contributor/pocs/integrate-prometheus-with-telemetry-mana… (Jan 12, 2024)
675b54e Update docs/contributor/pocs/integrate-prometheus-with-telemetry-mana… (Jan 12, 2024)
dded73e Fix list (skhalash, Jan 12, 2024)
d32436b Fix code blocks (skhalash, Jan 12, 2024)
58524fe Improve readability (skhalash, Jan 12, 2024)
9036181 Update docs/contributor/arch/003-integrate-prometheus-with-telemetry-… (Jan 12, 2024)
1ea8f95 Merge branch 'prometheus-integration-poc' of github.com:skhalash/tele… (skhalash, Jan 12, 2024)
69954b8 Update docs/contributor/arch/003-integrate-prometheus-with-telemetry-… (Jan 12, 2024)
bbe8c68 Update docs/contributor/arch/003-integrate-prometheus-with-telemetry-… (Jan 12, 2024)
55660d0 Update docs/contributor/arch/003-integrate-prometheus-with-telemetry-… (Jan 12, 2024)
@@ -1,4 +1,4 @@
-# 1. Fluent Bit Configuration and File-System Buffer Usage
+# 2. Fluent Bit Configuration and File-System Buffer Usage

Date: 2023-11-23

@@ -0,0 +1,51 @@
# 3. Integrate Prometheus With Telemetry Manager Using Alerting

Date: 2024-01-11

## Status

Accepted

## Context

As outlined in [ADR 001: Trace/Metric Pipeline status based on OTel Collector metrics](./001-otel-collector-metric-based-pipeline-status.md), our objective is to utilize a managed Prometheus instance to reflect specific telemetry flow issues (such as backpressure, data loss, backend unavailability) in the status of a telemetry pipeline custom resource (CR).
We have previously determined that both Prometheus and its configuration will be managed within the Telemetry Manager's code, aligning with our approach for managing Fluent Bit and OTel Collector.

To address the integration of Prometheus querying into the reconciliation loop, a Proof of Concept was executed.

## Decision

The results of the query tests affirm that invoking Prometheus APIs won't notably impact the overall reconciliation time. In theory, we could directly query Prometheus within the Reconcile routine. However, this straightforward approach presents a few challenges.

### Challenges

#### Timing of Invocation
Our current reconciliation strategy triggers either when a change occurs or every minute. While this is acceptable for periodic status updates, it may not be optimal when considering future plans to use Prometheus for autoscaling decisions.

#### Flakiness Mitigation
To ensure reliability and avoid false alerts, it's crucial to introduce a delay before signaling a problem. As suggested in [OTel Collector monitoring best practices](https://github.com/open-telemetry/opentelemetry-collector/blob/main/docs/monitoring.md):

> Use the rate of otelcol_processor_dropped_spans > 0 and otelcol_processor_dropped_metric_points > 0 to detect data loss. Depending on requirements, set up a minimal time window before alerting to avoid notifications for minor losses that fall within acceptable levels of reliability.

If we directly query Prometheus, we would need to implement such a mechanism to mitigate flakiness ourselves.

### Solution

Fortunately, we can leverage the Alerting feature of Prometheus to address the aforementioned challenges. The proposed workflow is as follows:

#### Rendering Alerting Rules
Telemetry Manager dynamically generates alerting rules based on the deployed pipeline configuration.
These alerting rules are then mounted into the Prometheus Pod, which is also deployed by the Telemetry Manager.
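
As an illustration, the rule rendering could look roughly like the following Go sketch. The `RenderRules` helper, the alert name, the `pipeline` label, and the example expression over the OTel Collector exporter metrics are assumptions made for this sketch, not the final rule set:

```go
package alertrules

import (
    "fmt"

    "gopkg.in/yaml.v3"
)

// ruleGroups mirrors the Prometheus rule-file format (groups -> rules).
type ruleGroups struct {
    Groups []ruleGroup `yaml:"groups"`
}

type ruleGroup struct {
    Name  string `yaml:"name"`
    Rules []rule `yaml:"rules"`
}

type rule struct {
    Alert  string            `yaml:"alert"`
    Expr   string            `yaml:"expr"`
    For    string            `yaml:"for,omitempty"`
    Labels map[string]string `yaml:"labels,omitempty"`
}

// RenderRules generates one alerting rule per deployed pipeline and returns the
// rule-file content to be written into the ConfigMap mounted by the Prometheus Pod.
// The expression and the 5m evaluation window are placeholders for this sketch.
func RenderRules(pipelineNames []string) (string, error) {
    var rules []rule
    for _, name := range pipelineNames {
        rules = append(rules, rule{
            Alert:  "MetricPipelineExporterSendFailed",
            Expr:   fmt.Sprintf(`rate(otelcol_exporter_send_failed_metric_points{exporter=%q}[5m]) > 0`, name),
            For:    "5m",
            Labels: map[string]string{"pipeline": name},
        })
    }

    out, err := yaml.Marshal(ruleGroups{Groups: []ruleGroup{{Name: "telemetry-pipelines", Rules: rules}}})
    if err != nil {
        return "", fmt.Errorf("failed to marshal alerting rules: %w", err)
    }
    return string(out), nil
}
```

The resulting string follows the same `alerting_rules.yml` format that the PoC mounts through the Prometheus Helm overrides.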

#### Alert Retrieval in Reconciliation
During each reconciliation iteration, the Telemetry Manager queries the [Prometheus Alerts API](https://prometheus.io/docs/prometheus/latest/querying/api/#alerts) using `github.com/prometheus/client_golang` to retrieve information about all fired alerts.
The obtained alerts are then translated into corresponding CR statuses.
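
A minimal sketch of this translation, assuming the rendered rules attach a `pipeline` label to every alert, could look as follows. The condition type `TelemetryFlowHealthy` and the reason values are illustrative assumptions, not the final status API:

```go
package status

import (
    "fmt"

    promv1 "github.com/prometheus/client_golang/api/prometheus/v1"
    "github.com/prometheus/common/model"
    metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// conditionFromAlerts derives a pipeline condition from the currently firing alerts.
// Alerts are matched to a pipeline via a "pipeline" label rendered into the rules (assumption).
func conditionFromAlerts(pipelineName string, result promv1.AlertsResult) metav1.Condition {
    for _, alert := range result.Alerts {
        if alert.State != promv1.AlertStateFiring {
            continue
        }
        if string(alert.Labels["pipeline"]) != pipelineName {
            continue
        }
        return metav1.Condition{
            Type:    "TelemetryFlowHealthy",
            Status:  metav1.ConditionFalse,
            Reason:  "AlertFiring",
            Message: fmt.Sprintf("Prometheus alert %s is firing for this pipeline", alert.Labels[model.AlertNameLabel]),
        }
    }
    return metav1.Condition{
        Type:   "TelemetryFlowHealthy",
        Status: metav1.ConditionTrue,
        Reason: "NoAlertsFiring",
    }
}
```

The returned condition could then be applied to the pipeline CR status during reconciliation, for example with `meta.SetStatusCondition`.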

#### Webhook for Immediate Reconciliation
The Telemetry Manager exposes an endpoint intended to be invoked by Prometheus whenever there is a change in the state of alerts. To facilitate this, we can configure Prometheus to treat our endpoint as an Alertmanager instance. Upon receiving a call, this endpoint initiates an immediate reconciliation of all affected resources using [Builder.WatchesRawSource](https://pkg.go.dev/sigs.k8s.io/controller-runtime/pkg/builder#Builder.WatchesRawSource) with a [source.Channel](https://pkg.go.dev/sigs.k8s.io/[email protected]/pkg/source#Channel).

By adopting this approach, we transfer the effort associated with expression evaluation and waiting to Prometheus.

## Consequences

The described setup involves frequent interaction between Telemetry Manager and Prometheus (rendering rules, querying the Alerts API, and receiving webhook calls), which must itself be sufficiently monitored.
@@ -0,0 +1,256 @@
# Integrate Prometheus With Telemetry Manager Using Alerting

## Goal

The goal of the Proof of Concept is to test integrating Prometheus into Telemetry Manager using Alerting.

## Setup

Follow these steps to set up the required environment:

1. Create a Kubernetes cluster (k3d or Gardener).
2. Create an overrides file specifically for the Prometheus Helm Chart. Save the file as `overrides.yaml`.
```yaml
alertmanager:
  enabled: false

prometheus-pushgateway:
  enabled: false

prometheus-node-exporter:
  enabled: false

server:
  alertmanagers:
    - static_configs:
        - targets:
            - telemetry-operator-alerts-webhook.kyma-system:9090

serverFiles:
  alerting_rules.yml:
    groups:
      - name: Instances
        rules:
          - alert: InstanceDown
            expr: up == 0
            for: 5m
            labels:
              severity: page
            annotations:
              description: '{{ $labels.instance }} of job {{ $labels.job }} has been down for more than 5 minutes.'
              summary: 'Instance {{ $labels.instance }} down'
  prometheus.yml:
    rule_files:
      - /etc/config/recording_rules.yml
      - /etc/config/alerting_rules.yml

    scrape_configs:
      - job_name: prometheus
        static_configs:
          - targets:
              - localhost:9090

      - job_name: 'kubernetes-service-endpoints'
        honor_labels: true
        kubernetes_sd_configs:
          - role: endpoints
        relabel_configs:
          - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scrape]
            action: keep
            regex: true
          - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scrape_slow]
            action: drop
            regex: true
          - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scheme]
            action: replace
            target_label: __scheme__
            regex: (https?)
          - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_path]
            action: replace
            target_label: __metrics_path__
            regex: (.+)
          - source_labels: [__address__, __meta_kubernetes_service_annotation_prometheus_io_port]
            action: replace
            target_label: __address__
            regex: (.+?)(?::\d+)?;(\d+)
            replacement: $1:$2
          - action: labelmap
            regex: __meta_kubernetes_service_annotation_prometheus_io_param_(.+)
            replacement: __param_$1
          - action: labelmap
            regex: __meta_kubernetes_service_label_(.+)
          - source_labels: [__meta_kubernetes_namespace]
            action: replace
            target_label: namespace
          - source_labels: [__meta_kubernetes_service_name]
            action: replace
            target_label: service
          - source_labels: [__meta_kubernetes_pod_node_name]
            action: replace
            target_label: node
```
3. Deploy Prometheus.
```shell
# Add the prometheus-community Helm repository first, if it is not registered yet
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
kubectl create ns prometheus
helm install -f overrides.yaml prometheus prometheus-community/prometheus
```
4. Create an endpoint in Telemetry Manager to be invoked by Prometheus:
```go
reconcileTriggerChan := make(chan event.GenericEvent, 1024)
go func() {
    handler := func(w http.ResponseWriter, r *http.Request) {
        // The alert payload in the request body is not evaluated in this PoC.
        if _, readErr := io.ReadAll(r.Body); readErr != nil {
            http.Error(w, "Error reading request body", http.StatusInternalServerError)
            return
        }
        defer r.Body.Close()

        // TODO: add more context about which objects have to be reconciled
        reconcileTriggerChan <- event.GenericEvent{}
        w.WriteHeader(http.StatusOK)
    }

    mux := http.NewServeMux()
    mux.HandleFunc("/api/v2/alerts", handler)

    server := &http.Server{
        Addr:              ":9090",
        ReadHeaderTimeout: 10 * time.Second,
        Handler:           mux,
    }

    if serverErr := server.ListenAndServe(); serverErr != nil {
        mutex.Lock()
        setupLog.Error(serverErr, "Cannot start webhook server")
        mutex.Unlock()
    }
}()
```
5. Trigger reconciliation in MetricPipelineController whenever the endpoint is called by Prometheus:
```go
func NewMetricPipelineReconciler(client client.Client, reconcileTriggerChan chan event.GenericEvent, reconciler *metricpipeline.Reconciler) *MetricPipelineReconciler {
    return &MetricPipelineReconciler{
        Client:               client,
        reconciler:           reconciler,
        reconcileTriggerChan: reconcileTriggerChan,
    }
}

// SetupWithManager sets up the controller with the Manager.
func (r *MetricPipelineReconciler) SetupWithManager(mgr ctrl.Manager) error {
    // We use `Watches` instead of `Owns` to trigger a reconciliation also when owned objects without the controller flag are changed.
    return ctrl.NewControllerManagedBy(mgr).
        For(&telemetryv1alpha1.MetricPipeline{}).
        WatchesRawSource(&source.Channel{Source: r.reconcileTriggerChan},
            handler.EnqueueRequestsFromMapFunc(r.mapPrometheusAlertEvent)).
        ...
}

func (r *MetricPipelineReconciler) mapPrometheusAlertEvent(ctx context.Context, _ client.Object) []reconcile.Request {
    logf.FromContext(ctx).Info("Handling Prometheus alert event")
    requests, err := r.createRequestsForAllPipelines(ctx)
    if err != nil {
        logf.FromContext(ctx).Error(err, "Unable to create reconcile requests")
    }
    return requests
}
```
6. Query Prometheus alerts in the Reconcile function:
```go
import (
    "context"
    "fmt"
    "time"

    "github.com/prometheus/client_golang/api"
    promv1 "github.com/prometheus/client_golang/api/prometheus/v1"
    logf "sigs.k8s.io/controller-runtime/pkg/log"
)

const prometheusAPIURL = "http://prometheus-server.default:80"

func queryAlerts(ctx context.Context) error {
    client, err := api.NewClient(api.Config{
        Address: prometheusAPIURL,
    })
    if err != nil {
        return fmt.Errorf("failed to create Prometheus client: %w", err)
    }

    v1api := promv1.NewAPI(client)
    ctx, cancel := context.WithTimeout(ctx, 10*time.Second)
    defer cancel()

    start := time.Now()
    alerts, err := v1api.Alerts(ctx)
    if err != nil {
        return fmt.Errorf("failed to query Prometheus alerts: %w", err)
    }

    logf.FromContext(ctx).Info("Prometheus alert query succeeded!",
        "elapsed_ms", time.Since(start).Milliseconds(),
        "alerts", alerts)
    return nil
}
```

7. Add a Kubernetes service for the alerts endpoint to the kustomize file:
```yaml
apiVersion: v1
kind: Service
metadata:
  name: operator-alerts-webhook
  namespace: system
spec:
  ports:
    - name: webhook
      port: 9090
      targetPort: 9090
  selector:
    app.kubernetes.io/name: operator
    app.kubernetes.io/instance: telemetry
    kyma-project.io/component: controller
    control-plane: telemetry-operator
```
8. Whitelist the endpoint port (9090) in the operator network policy:
```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: operator-pprof-deny-ingress
spec:
  podSelector:
    matchLabels:
      app.kubernetes.io/name: operator
      app.kubernetes.io/instance: telemetry
      kyma-project.io/component: controller
      control-plane: telemetry-operator
  policyTypes:
    - Ingress
  ingress:
    - from:
        - ipBlock:
            cidr: 0.0.0.0/0
      ports:
        - protocol: TCP
          port: 8080
        - protocol: TCP
          port: 8081
        - protocol: TCP
          port: 9443
        - protocol: TCP
          port: 9090
```
9. Deploy the modified Telemetry Manager:
```shell
export IMG=$DEV_IMAGE_REPO
make docker-build
make docker-push
make install
make deploy
```
10. Intentionally break any scrape target to fire the `InstanceDown` alert. In the Telemetry Manager logs, you should see that Prometheus pushes alerts to the webhook endpoint, which triggers an immediate reconciliation.