Each IDR deployment has its own monitoring system.
Day-to-day performance of the IDR can be visualised using Grafana, which is accessible using an anonymous account on idr-management:
ssh idr-proxy -L 3000:management:3000
Open http://localhost:3000/.
The dashboards of interest are accessed from the Home menu:
- IDR PostgreSQL: The numbers of database rows and locks in use.
- IDR sessions: Numbers of OMERO sessions and web requests. Note that quantiles are calculated over a fixed window, and will be very inaccurate if the number of requests is low.
- IDR vertical: CPU, active memory, load and network usage for all servers.
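The caveat about quantiles on the IDR sessions dashboard can be illustrated with a small worked example. This is only a sketch: it uses a plain nearest-rank quantile, which is not necessarily how the deployed exporters compute their quantiles, and the request durations are made up.

```python
import math

def nearest_rank_quantile(samples, q):
    """Return the q-quantile (0 < q <= 1) of samples using the nearest-rank method."""
    if not samples:
        raise ValueError("no samples in window")
    ordered = sorted(samples)
    rank = max(1, math.ceil(q * len(ordered)))
    return ordered[rank - 1]

# Hypothetical request durations (seconds) observed in one window.
quiet_window = [0.12, 0.15, 0.11, 4.80]                  # only four requests
busy_window = [0.10 + 0.001 * i for i in range(1000)]    # a thousand requests

# With four samples the 0.99 quantile is simply the slowest request,
# so a single outlier decides the value shown on the dashboard.
print(nearest_rank_quantile(quiet_window, 0.99))

# With many samples one outlier has far less influence on the estimate.
print(nearest_rank_quantile(busy_window, 0.99))
```

When the window holds only a handful of requests, a high quantile tells you about one request rather than about the service as a whole.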
To query the raw metrics directly, connect to Prometheus:
ssh idr-proxy -L 9090:management:9090
Open http://localhost:9090/ and you can interactively query the metrics using PromQL (Prometheus Query Language).
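The same metrics can also be queried from a script through the Prometheus HTTP API. The following is a minimal sketch, assuming the SSH tunnel above is open so that Prometheus answers on localhost:9090; the helper names `build_query_url` and `run_query` are illustrative and not part of the IDR deployment.

```python
import json
import urllib.parse
import urllib.request

# Assumes the SSH tunnel above is open, so Prometheus answers on localhost:9090.
PROMETHEUS_URL = "http://localhost:9090"

def build_query_url(promql, base_url=PROMETHEUS_URL):
    """Build the URL for an instant query against the Prometheus HTTP API."""
    return base_url + "/api/v1/query?" + urllib.parse.urlencode({"query": promql})

def run_query(promql, base_url=PROMETHEUS_URL):
    """Run an instant query and return the list of result samples."""
    with urllib.request.urlopen(build_query_url(promql, base_url), timeout=10) as resp:
        body = json.load(resp)
    if body.get("status") != "success":
        raise RuntimeError("query failed: %s" % body.get("error"))
    return body["data"]["result"]

# Show the URL that would be requested for the built-in "up" metric.
print(build_query_url("up"))
```

With the tunnel open, `run_query("up")` should return one sample per scrape target, where the value is 1 if the last scrape succeeded and 0 otherwise.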
The IDR has an experimental centralised logging collector using Fluentd, ElasticSearch and Kibana:
ssh idr-proxy -L 5601:management:5601
Open http://localhost:5601/.
Limitations:
- An index pattern logstash-* may need to be created on the first access.
- The ElasticSearch server is not clustered and has limited querying capacity. It can be easily overloaded.
- Logs are kept for two weeks.
All IDR pilot test instances are currently monitored by a single monitoring server.
This is accessed using the same commands as above, with idr-pilot as the proxy.
The IDR FTP server has limited Prometheus monitoring running on the same server with alerts for low available disk space.
More structured analysis of archived IDR access logs across multiple releases is handled in https://github.com/IDR/idr-log-analysis.
All monitoring (central server and node agents) is deployed by the ansible/management*.yml playbooks.
The central Prometheus server runs in a Docker container deployed by an Ansible role and a playbook.
The templated Prometheus configuration can be fetched from /etc/prometheus/prometheus.yml if necessary for investigation, and the templated alerting rules are written to /etc/prometheus/rules/.
This playbook also deploys all the required Prometheus exporters on all nodes.
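One way to check that the exporters are being scraped is to inspect the response of the Prometheus `/api/v1/targets` endpoint. The sketch below works on an already-decoded JSON response; the job name, host names and port in the sample payload are hypothetical and are not taken from the IDR configuration.

```python
def unhealthy_targets(payload):
    """Return (job, instance, last_error) for every active target whose health is not "up"."""
    problems = []
    for target in payload["data"]["activeTargets"]:
        if target.get("health") != "up":
            labels = target.get("labels", {})
            problems.append(
                (labels.get("job"), labels.get("instance"), target.get("lastError", ""))
            )
    return problems

# Hypothetical, trimmed response in the shape returned by /api/v1/targets.
sample_payload = {
    "status": "success",
    "data": {
        "activeTargets": [
            {"labels": {"job": "node", "instance": "omero-host:9100"},
             "health": "up", "lastError": ""},
            {"labels": {"job": "node", "instance": "proxy-host:9100"},
             "health": "down", "lastError": "connection refused"},
        ]
    },
}

for job, instance, error in unhealthy_targets(sample_payload):
    print(f"{job} {instance}: {error}")
```

Against a live server, the payload would come from http://localhost:9090/api/v1/targets through the SSH tunnel described earlier.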
Grafana runs in Docker and is deployed by a playbook. The playbook automatically configures Grafana and uploads some pre-created dashboards using Grafana's REST API.
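For orientation, a dashboard upload through Grafana's HTTP API is a POST to `/api/dashboards/db`. The sketch below only builds such a request and does not send it. It is not the playbook's actual implementation: the function name, the example dashboard and the token are hypothetical, and the authentication method used by the playbook is not described here.

```python
import json
import urllib.request

def build_upload_request(dashboard, base_url, api_token):
    """Build (but do not send) a request that creates or overwrites a dashboard."""
    # A null id lets Grafana match an existing dashboard by uid instead of numeric id.
    body = dict(dashboard, id=None)
    payload = json.dumps({"dashboard": body, "overwrite": True}).encode("utf-8")
    return urllib.request.Request(
        base_url.rstrip("/") + "/api/dashboards/db",
        data=payload,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": "Bearer " + api_token,  # hypothetical token
        },
    )

# Hypothetical minimal dashboard definition.
example = {"uid": "example-uid", "title": "Example dashboard", "panels": []}
request = build_upload_request(example, "http://localhost:3000", "hypothetical-token")
print(request.get_method(), request.full_url)
```

Passing the request to `urllib.request.urlopen` would perform the upload, provided the token has permission to edit dashboards.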
ElasticSearch, Kibana and the Fluentd server are run in Docker and deployed by a playbook. This playbook also deploys the Fluentd logging agents that collect the logs on the OMERO and proxy servers, and forward them to the central Fluentd server.