Example Queries for Prometheus
These metrics are collected with collectd_exporter.
Install it with:
go get github.com/prometheus/collectd_exporter
Use the following systemd unit file (for example, saved as /etc/systemd/system/collectd-exporter.service):
[Unit]
Description=CollectD Exporter for Prometheus
After=network-online.target
[Service]
User=prometheus
Restart=on-failure
ExecStart=/usr/local/go-work/bin/collectd_exporter -collectd.listen-address=:25826
[Install]
WantedBy=multi-user.target
Enable start on boot and start the service:
systemctl daemon-reload
systemctl enable collectd-exporter
systemctl start collectd-exporter
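Once the service is running, it helps to confirm that collectd samples actually reach the exporter. The following is a minimal Python sketch, assuming the exporter serves its metrics page on its default web port 9103 on the same host; the URL and both function names are illustrative assumptions, not part of the setup above.

```python
import urllib.request


def collectd_metric_names(exposition_text):
    """Return the distinct collectd_* metric names found in /metrics output."""
    names = set()
    for line in exposition_text.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        # A sample line looks like: name{label="value"} 123  or  name 123
        name = line.split("{", 1)[0].split(" ", 1)[0]
        if name.startswith("collectd_"):
            names.add(name)
    return sorted(names)


def fetch_metrics(url="http://localhost:9103/metrics"):
    """Download the exporter's metrics page (the URL is an assumption)."""
    with urllib.request.urlopen(url, timeout=5) as resp:
        return resp.read().decode("utf-8")


# Usage on the host running the exporter:
#   print(collectd_metric_names(fetch_metrics()))
```

An empty list usually means that collectd is not yet sending data to the address given in -collectd.listen-address.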
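Returns the percentage of CPU time that was not spent idle, summed over all CPUs of each host. Because the query divides the raw counters without rate(), the result is an average over the counters' whole lifetime rather than over a recent time window: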
100 * (1 - sum(collectd_cpu_total{type="idle"}) by (exported_instance) / sum(collectd_cpu_total) by (exported_instance))
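To make the arithmetic of this query concrete, here is a small Python sketch that mirrors it for a single host: sum the idle counters over all CPUs, divide by the sum over all states, and subtract the ratio from 1. The sample values and the function name are invented for illustration and are not real measurements.

```python
# Hypothetical counter values keyed by (cpu, type); the numbers are invented.
samples = {
    ("0", "idle"): 900.0,
    ("0", "user"): 80.0,
    ("0", "system"): 20.0,
    ("1", "idle"): 600.0,
    ("1", "user"): 300.0,
    ("1", "system"): 100.0,
}


def non_idle_percent(samples):
    """Mirror 100 * (1 - sum(idle) / sum(all states)) for one host."""
    idle = sum(value for (_cpu, typ), value in samples.items() if typ == "idle")
    total = sum(samples.values())
    return 100 * (1 - idle / total)


print(non_idle_percent(samples))  # 25.0
```

Here 1500 of 2000 counted units are idle, so the host was busy for a quarter of the counted time.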
The same query grouped by CPU core for one host (replace $server_name):
100 * (1 - sum(collectd_cpu_total{exported_instance="$server_name", type="idle"}) by (cpu) / sum(collectd_cpu_total{exported_instance="$server_name"}) by (cpu))
The 10 busiest (least idle) CPU cores across all servers:
topk(10, 100 * (1 - sum(collectd_cpu_total{type="idle"}) by (exported_instance, cpu) / sum(collectd_cpu_total) by (exported_instance, cpu)))
Returns the percentage of non-free memory on all servers:
100 * (1 - sum(collectd_memory{memory="free"}) by (exported_instance) / sum(collectd_memory) by (exported_instance))
If you want to count cached memory as free, use this:
100 * (1 - sum(collectd_memory{memory=~"free|cached"}) by (exported_instance) / sum(collectd_memory) by (exported_instance))
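The two memory queries differ only in which label values count as free. The Python sketch below reproduces both results on invented numbers; it uses re.fullmatch because PromQL's =~ matcher has to match the whole label value. The dictionary contents and the function name are assumptions made for illustration.

```python
import re

# Hypothetical gauge values for one host, keyed by the "memory" label.
memory = {"used": 3000, "buffered": 1000, "cached": 2000, "free": 2000}


def non_free_percent(memory, free_pattern):
    """Mirror 100 * (1 - sum(matching) / sum(all)) for one host."""
    # PromQL regex matchers are fully anchored, hence fullmatch().
    free = sum(v for label, v in memory.items() if re.fullmatch(free_pattern, label))
    return 100 * (1 - free / sum(memory.values()))


print(non_free_percent(memory, "free"))         # 75.0
print(non_free_percent(memory, "free|cached"))  # 50.0
```

Counting cached memory as free lowers the reported usage, which is often closer to what applications can actually obtain.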
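Disk operations per second on the vda partitions, averaged over 10 minutes (series with a zero rate are dropped):
sum(rate({__name__=~"collectd_disk_disk_ops_\\d_total",disk=~"vda\\d+"}[10m])) by (exported_instance, disk) > 0
Disk throughput in bytes per second on the vda partitions, averaged over 10 minutes (series with a zero rate are dropped):
sum(rate({__name__=~"collectd_disk_disk_octets_\\d_total",disk=~"vda\\d+"}[10m])) by (exported_instance, disk) > 0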
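The two disk queries above rely on rate() and on regular-expression matchers. As a rough Python sketch of both ideas: rate() yields a per-second increase over the window, and disk=~"vda\\d+" (the doubled backslash is string escaping for the regex \d) is fully anchored, so it selects partitions such as vda1 but not the whole disk vda. The function is deliberately simplified and the sample points are invented.

```python
import re


def simple_rate(points):
    """Per-second increase between the first and last (timestamp, value) point.

    Simplified: Prometheus' rate() also handles counter resets and
    extrapolates to the window edges; this sketch does neither.
    """
    (t0, v0), (t1, v1) = points[0], points[-1]
    return (v1 - v0) / (t1 - t0)


# Hypothetical samples from a 10-minute window: (seconds, counter value).
window = [(0, 1000), (300, 2500), (600, 4000)]
print(simple_rate(window))  # 5.0

# Which disk label values would the matcher select?
for disk in ["vda", "vda1", "vda12", "xvda1"]:
    print(disk, bool(re.fullmatch(r"vda\d+", disk)))
```

On hosts whose disks are named differently (for example sda), the disk matcher in the queries needs to be adjusted accordingly.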
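Current nginx connections by state, excluding waiting connections:
collectd_nginx_nginx_connections{nginx!="waiting"}
nginx requests per second, averaged over 2 minutes:
rate(collectd_nginx_nginx_requests_total[2m])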
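Apache scoreboard slots currently in use by state, excluding open slots:
collectd_apache_apache_scoreboard{type!="open"} > 0
Apache requests per second, averaged over 2 minutes:
rate(collectd_apache_apache_requests_total[2m])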
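Any of the queries on this page can also be run outside the web UI through Prometheus' HTTP API. The following Python sketch uses the instant-query endpoint /api/v1/query; the base URL http://localhost:9090 and the function names are assumptions and need to be adapted to your setup.

```python
import json
import urllib.parse
import urllib.request


def instant_query_url(query, base="http://localhost:9090"):
    """Build the URL for the instant-query endpoint /api/v1/query."""
    return base + "/api/v1/query?" + urllib.parse.urlencode({"query": query})


def run_query(query, base="http://localhost:9090"):
    """Return the list of result series for an instant query."""
    with urllib.request.urlopen(instant_query_url(query, base), timeout=10) as resp:
        body = json.load(resp)
    if body.get("status") != "success":
        raise RuntimeError(body.get("error", "query failed"))
    return body["data"]["result"]


# Usage against a reachable Prometheus server:
#   for series in run_query('rate(collectd_nginx_nginx_requests_total[2m])'):
#       print(series["metric"], series["value"])
```

The query string is URL-encoded by urlencode, so braces, quotes and backslashes in the PromQL expressions can be passed unchanged.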