From d8b3572647d643c7bdf96af7749894cefef10100 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Dominik=20S=C3=BC=C3=9F?=
Date: Wed, 13 Dec 2023 10:49:33 +0100
Subject: [PATCH] docs: add proposal for grafana alerting

This picks up #911 and #1144. The proposal contains three different
options for realizing alerting support in the operator and should serve
as a base for discussion regarding this topic.
---
 docs/docs/proposals/002-alerting-support.md | 150 ++++++++++++++++++++
 1 file changed, 150 insertions(+)
 create mode 100644 docs/docs/proposals/002-alerting-support.md

diff --git a/docs/docs/proposals/002-alerting-support.md b/docs/docs/proposals/002-alerting-support.md
new file mode 100644
index 000000000..bb9cba4cc
--- /dev/null
+++ b/docs/docs/proposals/002-alerting-support.md
@@ -0,0 +1,150 @@
+---
+title: "Alerting Support"
+linkTitle: "Alerting Support"
+---
+
+## Summary
+
+Introduce support for Grafana Alerting (requires Grafana > 9).
+
+This document contains the complete design required to support configuring alerts with the Grafana Operator.
+
+## Info
+
+status: Draft
+
+## Motivation
+
+With legacy alerting replaced by unified alerting, users want a way to configure their alerts alongside the rest of their application manifests.
+
+Currently, this is only possible via Terraform. This feature would enable more users to switch to Grafana alerting.
+
+
+## Verification
+
+- Create integration tests for adding keys for a remote Grafana instance.
+- Create integration tests to create Grafana dashboards on a remote Grafana instance.
+- Create integration tests to add cloud data sources to a remote Grafana instance.
+
+## Proposal
+
+This document proposes extending the CRDs to support Alert Rules, Contact Points and Notification Policies.
+
+To realise this, one of several paths can be taken.
+
+### Option 1: Create new CRDs for every resource
+
+This option is based on the [Terraform provisioner](https://grafana.com/docs/grafana/v10.2/alerting/set-up/provision-alerting-resources/terraform-provisioning). It results in the following new custom resources:
+
+`GrafanaAlertFolder`: A folder of evaluation groups, each of which contains rules. It must be backed by an existing folder.
+
+It could look like this:
+```yaml
+---
+apiVersion: grafana.integreatly.org/v1beta1
+kind: GrafanaAlertFolder
+metadata:
+  name: test-alert-folder
+spec:
+  instanceSelector:
+    matchLabels:
+      dashboards: "grafana"
+  groups:
+    group_one:
+      interval: 5m
+      rules:
+        - for: 5m
+          grafana_alert:
+            # ...
+```
+
+`GrafanaContactPoint`: YAML representation of a contact point
+
+It could look like this:
+```yaml
+---
+apiVersion: grafana.integreatly.org/v1beta1
+kind: GrafanaContactPoint
+metadata:
+  name: send-to-slack
+spec:
+  instanceSelector:
+    matchLabels:
+      dashboards: "grafana"
+  title: Send to slack channel
+  slack:
+    url: 'https://...'
+    text: |
+      {{ len .Alerts.Firing }} alerts are firing!
+
+      Alert summaries:
+      {{ range .Alerts.Firing }}
+      {{ template "Alert Instance Template" . }}
+      {{ end }}
+      |
+```
+
+`GrafanaNotificationPolicy`: YAML representation of a notification policy
+
+```yaml
+---
+apiVersion: grafana.integreatly.org/v1beta1
+kind: GrafanaNotificationPolicy
+metadata:
+  name: example-policy
+spec:
+  instanceSelector:
+    matchLabels:
+      dashboards: "grafana"
+  groupBy: ["alertname"]
+  contactPoint: 'send-to-slack'
+  groupWait: "45s"
+  groupInterval: "6m"
+  repeatInterval: "3h"
+  policy:
+    groupBy: ["..."]
+    matcher:
+      label: a
+      match: "="
+      value: b
+    contactPoint: 'send-to-oncall'
+    policy:
+      groupBy: ["..."]
+      matcher:
+        label: sublabel
+        match: "="
+        value: subvalue
+      contactPoint: 'send-to-email'
+```
+
+| Pro                                      | Contra                                        |
+|------------------------------------------|-----------------------------------------------|
+| Simple and straightforward to implement  | Lots of repetition to link multiple resources |
+| Granular permission management for users |                                               |
+
+
+### Option 2: Extend existing CRDs where possible
+
+As some resources only make sense in the context of another one (e.g. AlertFolder needs an existing folder), we can extend the CRDs of the existing types with new alerting-specific information.
+
+This would result in:
+- `GrafanaFolder` having a new section for `alerting`
+- `Grafana` having new sections for notification policies and contact points
+
+| Pro                           | Contra                                                                                                                 |
+|-------------------------------|------------------------------------------------------------------------------------------------------------------------|
+| No new CRDs                   | Loss of granularity (What if one team wants to own their contact-point without having access to the Grafana resource?) |
+| Easy to check for correctness |                                                                                                                        |
+
+### Option 3: Hybrid
+
+This is a mix of options 1 and 2.
+This would result in:
+- `GrafanaFolder` having a new section for `alerting`
+- New resources `GrafanaContactPoint` and `GrafanaNotificationPolicy`
+
+
+## Related discussions
+
+- [Issue 911](https://github.com/grafana/grafana-operator/issues/911)
+- [PR 1144](https://github.com/grafana/grafana-operator/pull/1144)