Improve out-of-the-box experience with Grafana dashboards in `development/mimir-microservices-mode` stack #4898

charleskorn · 2023-05-03T05:10:12Z

Is your feature request related to a problem? Please describe.

When working on Mimir, the development/mimir-microservices-mode Docker Compose stack is useful for testing and debugging Mimir. This includes a Grafana instance that uses the dashboards from operations/mimir-mixin-compiled.

However, there are some issues with these dashboards and the data behind them:

- many targets are scraped three times (by Prometheus, the Grafana agent, and the OTel agent), which means data displayed in dashboards either has 3x series or, in the case of aggregated data, can be 3x the true value
- the ... Resources dashboards (e.g. Writes Resources) use metrics not available outside Kubernetes, such as container_cpu_usage_seconds_total and container_memory_working_set_bytes
- scraped metrics are missing the container label, which breaks many of the dashboard panels that expect this label to be present

Describe the solution you'd like

All dashboards Just Work™ (with the exception of those that only make sense in the context of a Kubernetes installation, such as autoscaling-related dashboards)

Describe alternatives you've considered

Using an instance of Mimir deployed to a Kubernetes environment: this works for some scenarios, but for others this can be a slow feedback loop relative to a local environment.


charleskorn · 2023-05-03T05:36:00Z

Two of the issues described above (the triple scraping and missing container label) will be fixed by #4900.

charleskorn · 2023-05-04T00:49:03Z

Another feature request: it would be good if the recording rules were set up in Mimir's ruler rather than in Prometheus, because turning off Prometheus (e.g. to test the Grafana Agent) currently stops the evaluation of recording rules too.

jhalterman · 2023-05-19T00:25:02Z

Some of the read and write dashboards also don't work since cortex_request_duration_seconds_count, and similar, aren't populated. Edit: it appears this was caused by a switch to native histograms in #4987.

For container_cpu_usage_seconds_total and similar, could we just re-create these using grafana agent and some recording rules?

charleskorn · 2023-05-21T22:56:51Z

> For container_cpu_usage_seconds_total and similar, could we just re-create these using grafana agent and some recording rules?

Probably - when I ran into this issue, I modified the dashboards to use process_cpu_seconds_total and that seemed to work fine, so perhaps a recording rule that records container_cpu_usage_seconds_total from process_cpu_seconds_total would work?

(4563731 is the commit where I did this)
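For illustration only, a recording-rule approach along these lines might look roughly like the sketch below. This is not what the commit above does (that changed the dashboards directly). The group name is made up, the memory mapping to process_resident_memory_bytes is my own assumption and only an approximation of a container's working set, and the sketch assumes the scraped process_* series already carry whatever labels (such as container) the dashboard queries select on.

```yaml
# Hypothetical Prometheus-style rule file: re-exposes process-level metrics
# under the cadvisor metric names the Resources dashboards query.
groups:
  - name: docker_compose_container_metrics_shim  # made-up group name
    rules:
      # CPU: copy the per-process CPU counter to the container metric name.
      - record: container_cpu_usage_seconds_total
        expr: process_cpu_seconds_total
      # Memory: resident set size stands in for the working set (assumption;
      # the two are not the same measurement).
      - record: container_memory_working_set_bytes
        expr: process_resident_memory_bytes
```

Panels that depend on labels or metrics only cadvisor provides would still need separate handling.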

pstibrany · 2023-05-22T08:21:13Z

Many dashboards use metrics from Kubernetes (from cadvisor), which may be hard to get from inside docker-compose.

jhalterman · 2023-05-22T18:34:45Z

I tried replacing container_cpu_usage_seconds_total with process_cpu_seconds_total and it worked for a few places, but not others, since the labels available on them are a bit different. We'd also need a replacement for container_spec_cpu_period.

charleskorn mentioned this issue May 3, 2023

Improve out-of-the-box experience with Docker Compose stacks for microservices and read-write mode #4900

Merged
