Commit

fix: mimir/loki and tempo recording rules to be able to work with Mimir (#1158)
QuentinBisson authored May 9, 2024
1 parent 7a6f9a7 commit 77f08d8
Showing 4 changed files with 136 additions and 135 deletions.
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -16,6 +16,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
  ### Fixed

  - Remove cilium entry from KAAS SLOs.
+ - Fix Loki/Mimir and Tempo mixins according to `pint` recommendations
  - Fix cilium related alerts for mimir.
  - Fix etcd alerts for Mimir.
  - Add missing labels for apiserver alerts.
@@ -10,52 +10,52 @@ spec:
  - name: loki_rules
  rules:
  - expr: histogram_quantile(0.99, sum(rate(loki_request_duration_seconds_bucket[1m]))
- by (le, cluster, job))
+ by (le, cluster_id, provider, installation, pipeline, job))
  record: cluster_job:loki_request_duration_seconds:99quantile
  - expr: histogram_quantile(0.50, sum(rate(loki_request_duration_seconds_bucket[1m]))
- by (le, cluster, job))
+ by (le, cluster_id, provider, installation, pipeline, job))
  record: cluster_job:loki_request_duration_seconds:50quantile
- - expr: sum(rate(loki_request_duration_seconds_sum[1m])) by (cluster, job) / sum(rate(loki_request_duration_seconds_count[1m]))
- by (cluster, job)
+ - expr: sum(rate(loki_request_duration_seconds_sum[1m])) by (cluster_id, provider, installation, pipeline, job) / sum(rate(loki_request_duration_seconds_count[1m]))
+ by (cluster_id, provider, installation, pipeline, job)
  record: cluster_job:loki_request_duration_seconds:avg
- - expr: sum(rate(loki_request_duration_seconds_bucket[1m])) by (le, cluster, job)
+ - expr: sum(rate(loki_request_duration_seconds_bucket[1m])) by (le, cluster_id, provider, installation, pipeline, job)
  record: cluster_job:loki_request_duration_seconds_bucket:sum_rate
- - expr: sum(rate(loki_request_duration_seconds_sum[1m])) by (cluster, job)
+ - expr: sum(rate(loki_request_duration_seconds_sum[1m])) by (cluster_id, provider, installation, pipeline, job)
  record: cluster_job:loki_request_duration_seconds_sum:sum_rate
- - expr: sum(rate(loki_request_duration_seconds_count[1m])) by (cluster, job)
+ - expr: sum(rate(loki_request_duration_seconds_count[1m])) by (cluster_id, provider, installation, pipeline, job)
  record: cluster_job:loki_request_duration_seconds_count:sum_rate
  - expr: histogram_quantile(0.99, sum(rate(loki_request_duration_seconds_bucket[1m]))
- by (le, cluster, job, route))
+ by (le, cluster_id, provider, installation, pipeline, job, route))
  record: cluster_job_route:loki_request_duration_seconds:99quantile
  - expr: histogram_quantile(0.50, sum(rate(loki_request_duration_seconds_bucket[1m]))
- by (le, cluster, job, route))
+ by (le, cluster_id, provider, installation, pipeline, job, route))
  record: cluster_job_route:loki_request_duration_seconds:50quantile
- - expr: sum(rate(loki_request_duration_seconds_sum[1m])) by (cluster, job, route)
- / sum(rate(loki_request_duration_seconds_count[1m])) by (cluster, job, route)
+ - expr: sum(rate(loki_request_duration_seconds_sum[1m])) by (cluster_id, provider, installation, pipeline, job, route)
+ / sum(rate(loki_request_duration_seconds_count[1m])) by (cluster_id, provider, installation, pipeline, job, route)
  record: cluster_job_route:loki_request_duration_seconds:avg
- - expr: sum(rate(loki_request_duration_seconds_bucket[1m])) by (le, cluster, job,
+ - expr: sum(rate(loki_request_duration_seconds_bucket[1m])) by (le, cluster_id, provider, installation, pipeline, job,
  route)
  record: cluster_job_route:loki_request_duration_seconds_bucket:sum_rate
- - expr: sum(rate(loki_request_duration_seconds_sum[1m])) by (cluster, job, route)
+ - expr: sum(rate(loki_request_duration_seconds_sum[1m])) by (cluster_id, provider, installation, pipeline, job, route)
  record: cluster_job_route:loki_request_duration_seconds_sum:sum_rate
- - expr: sum(rate(loki_request_duration_seconds_count[1m])) by (cluster, job, route)
+ - expr: sum(rate(loki_request_duration_seconds_count[1m])) by (cluster_id, provider, installation, pipeline, job, route)
  record: cluster_job_route:loki_request_duration_seconds_count:sum_rate
  - expr: histogram_quantile(0.99, sum(rate(loki_request_duration_seconds_bucket[1m]))
- by (le, cluster, namespace, job, route))
+ by (le, cluster_id, provider, installation, pipeline, namespace, job, route))
  record: cluster_namespace_job_route:loki_request_duration_seconds:99quantile
  - expr: histogram_quantile(0.50, sum(rate(loki_request_duration_seconds_bucket[1m]))
- by (le, cluster, namespace, job, route))
+ by (le, cluster_id, provider, installation, pipeline, namespace, job, route))
  record: cluster_namespace_job_route:loki_request_duration_seconds:50quantile
- - expr: sum(rate(loki_request_duration_seconds_sum[1m])) by (cluster, namespace,
+ - expr: sum(rate(loki_request_duration_seconds_sum[1m])) by (cluster_id, provider, installation, pipeline, namespace,
  job, route) / sum(rate(loki_request_duration_seconds_count[1m])) by (cluster,
  namespace, job, route)
  record: cluster_namespace_job_route:loki_request_duration_seconds:avg
- - expr: sum(rate(loki_request_duration_seconds_bucket[1m])) by (le, cluster, namespace,
+ - expr: sum(rate(loki_request_duration_seconds_bucket[1m])) by (le, cluster_id, provider, installation, pipeline, namespace,
  job, route)
  record: cluster_namespace_job_route:loki_request_duration_seconds_bucket:sum_rate
- - expr: sum(rate(loki_request_duration_seconds_sum[1m])) by (cluster, namespace,
+ - expr: sum(rate(loki_request_duration_seconds_sum[1m])) by (cluster_id, provider, installation, pipeline, namespace,
  job, route)
  record: cluster_namespace_job_route:loki_request_duration_seconds_sum:sum_rate
- - expr: sum(rate(loki_request_duration_seconds_count[1m])) by (cluster, namespace,
+ - expr: sum(rate(loki_request_duration_seconds_count[1m])) by (cluster_id, provider, installation, pipeline, namespace,
  job, route)
  record: cluster_namespace_job_route:loki_request_duration_seconds_count:sum_rate
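
The hunk above replaces the `cluster` grouping label with `cluster_id, provider, installation, pipeline` in each `by (...)` clause. A minimal sketch of how one might scan a rule expression for clauses that still group by the bare `cluster` label, such as the `by (cluster,` that remains in the `cluster_namespace_job_route:loki_request_duration_seconds:avg` rule. The function name `legacy_cluster_groupings` is hypothetical and is not part of this commit or of `pint`; the commit does not say whether that remaining clause is intentional.

```python
import re

# Matches a PromQL grouping clause such as "by (le, cluster, job)".
# The negated character class also matches newlines, so clauses wrapped
# across YAML lines are handled.
BY_CLAUSE = re.compile(r"\bby\s*\(([^)]*)\)")


def legacy_cluster_groupings(expr):
    """Return every by(...) clause in expr that groups by the bare `cluster` label."""
    hits = []
    for match in BY_CLAUSE.finditer(expr):
        labels = [label.strip() for label in match.group(1).split(",")]
        # Exact comparison, so `cluster_id` is not reported.
        if "cluster" in labels:
            # Collapse whitespace so wrapped clauses print on one line.
            hits.append(" ".join(match.group(0).split()))
    return hits


if __name__ == "__main__":
    expr = (
        "sum(rate(loki_request_duration_seconds_sum[1m])) "
        "by (cluster_id, provider, installation, pipeline, namespace,\n"
        "job, route) / sum(rate(loki_request_duration_seconds_count[1m])) by (cluster,\n"
        "namespace, job, route)"
    )
    for clause in legacy_cluster_groupings(expr):
        print(clause)
```

This only inspects `by (...)` text with a regular expression; it does not parse PromQL, so `without (...)` clauses and label matchers are out of scope.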