Monitoring for ironic-prometheus-exporter
This commit enables monitoring in the machine-api-operator so that
Prometheus can collect data from the ironic-prometheus-exporter [1],
which runs in the ironic-image [2].

- Added the monitoring label to the namespace YAML
- Added a monitoring Role to the RBAC YAML
- Added a Service for the ironic-prometheus-exporter
- Added a ServiceMonitor for the ironic-prometheus-exporter
- Added a PrometheusRule with alerts on the baremetal_temp_celsius
  metric

[1] https://github.com/metal3-io/ironic-prometheus-exporter
[2] https://github.com/metal3-io/ironic-image
iurygregory committed Jul 16, 2019
1 parent 8191e1c commit e25d72e
Showing 5 changed files with 86 additions and 0 deletions.
1 change: 1 addition & 0 deletions install/0000_30_machine-api-operator_00_namespace.yaml
@@ -7,3 +7,4 @@ metadata:
  labels:
    name: openshift-machine-api
    openshift.io/run-level: "1"
    openshift.io/cluster-monitoring: "true"
18 changes: 18 additions & 0 deletions install/0000_30_machine-api-operator_09_rbac.yaml
@@ -269,3 +269,21 @@ subjects:
- kind: ServiceAccount
  name: machine-api-operator
  namespace: openshift-machine-api

---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: prometheus-k8s
  namespace: openshift-monitoring
rules:
- apiGroups:
  - ""
  resources:
  - services
  - endpoints
  - pods
  verbs:
  - get
  - list
  - watch
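To make explicit what the Role above grants, rule matching can be sketched in a few lines of Python. This is a simplified model of RBAC evaluation (exact string matches plus the `*` wildcard, ignoring `resourceNames` and non-resource URLs), and the `allows` helper is a hypothetical name for illustration, not part of any Kubernetes client library.

```python
# Simplified model of how a Role's rules are matched against a request.
# ROLE_RULES mirrors the rules of the prometheus-k8s Role in this commit.
ROLE_RULES = [
    {
        "apiGroups": [""],  # "" denotes the core API group
        "resources": ["services", "endpoints", "pods"],
        "verbs": ["get", "list", "watch"],
    }
]


def _matches(allowed, value):
    """True if the value is listed explicitly or covered by the '*' wildcard."""
    return "*" in allowed or value in allowed


def allows(rules, api_group, resource, verb):
    """True if at least one rule covers the (group, resource, verb) triple."""
    return any(
        _matches(rule.get("apiGroups", []), api_group)
        and _matches(rule.get("resources", []), resource)
        and _matches(rule.get("verbs", []), verb)
        for rule in rules
    )
```

Under this model the Role permits read-only discovery of services, endpoints, and pods, and nothing else: for example, `allows(ROLE_RULES, "", "secrets", "get")` is false.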
@@ -0,0 +1,21 @@
---
apiVersion: v1
kind: Service
metadata:
  name: metal3-baremetalhost-controller
  namespace: openshift-machine-api
  labels:
    app: ironic-exporter
spec:
  ports:
  - name: http
    protocol: TCP
    port: 9608
    targetPort: 9608
  selector:
    app: ironic-exporter
  clusterIP: None
  type: ClusterIP
  sessionAffinity: None
status:
  loadBalancer: {}
@@ -0,0 +1,21 @@
---
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  labels:
    app: ironic-exporter
  name: metal3-baremetalhost-controller
  namespace: openshift-machine-api
spec:
  endpoints:
  - port: "9608-tcp"
    scheme: http
    path: /metrics
    targetPort: 9608
  jobLabel: app
  namespaceSelector:
    matchNames:
    - metal3-baremetalhost-controller
  selector:
    matchLabels:
      app: ironic-exporter
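A ServiceMonitor only yields scrape targets when its selectors line up with a Service: the endpoint `port` refers to a Service port by name, `matchLabels` must match the Service labels, and `namespaceSelector.matchNames` lists namespaces. The following Python sketch checks those three conditions against values copied from the two manifests in this commit. It is a simplified model (the helper names are hypothetical, and `matchExpressions` is ignored), not how the Prometheus Operator is implemented.

```python
# Values copied from the Service and ServiceMonitor in this commit.
SERVICE = {
    "metadata": {
        "namespace": "openshift-machine-api",
        "labels": {"app": "ironic-exporter"},
    },
    "spec": {"ports": [{"name": "http", "port": 9608, "targetPort": 9608}]},
}

SERVICE_MONITOR = {
    "metadata": {"namespace": "openshift-machine-api"},
    "spec": {
        "endpoints": [{"port": "9608-tcp", "targetPort": 9608}],
        "namespaceSelector": {"matchNames": ["metal3-baremetalhost-controller"]},
        "selector": {"matchLabels": {"app": "ironic-exporter"}},
    },
}


def unmatched_endpoint_ports(service, monitor):
    """Return endpoint port names that do not name any port of the Service."""
    names = {p.get("name") for p in service["spec"]["ports"]}
    return [
        e["port"]
        for e in monitor["spec"]["endpoints"]
        if "port" in e and e["port"] not in names
    ]


def labels_selected(service, monitor):
    """True if every matchLabels pair is present on the Service."""
    wanted = monitor["spec"]["selector"].get("matchLabels", {})
    labels = service["metadata"].get("labels", {})
    return all(labels.get(k) == v for k, v in wanted.items())


def namespace_selected(service, monitor):
    """True if the Service's namespace is covered by namespaceSelector."""
    selector = monitor["spec"].get("namespaceSelector", {})
    if selector.get("any"):
        return True
    # Without matchNames, only the ServiceMonitor's own namespace is searched.
    names = selector.get("matchNames") or [monitor["metadata"]["namespace"]]
    return service["metadata"]["namespace"] in names
```

With these values the label check passes, but the sketch reports `9608-tcp` as an endpoint port that matches no Service port name (the Service names its port `http`), and `openshift-machine-api` is not among the listed `matchNames`. If that reading is right, Prometheus may not discover the exporter as written; this has not been verified on a cluster.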
@@ -0,0 +1,25 @@
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: metal3-baremetalhost-controller
  namespace: openshift-machine-api
spec:
  groups:
  - name: metal3-baremetalhost-controller
    rules:
    - alert: HighCPUTemperature
      annotations:
        summary: "The temperature of CPU {{ $labels.entity_id }} on baremetal node {{ $labels.node_name }} is too high"
        description: "The temperature of CPU {{ $labels.entity_id }} on baremetal node {{ $labels.node_name }} has been too high for the past 5 minutes. Last measurement: {{ $value }}"
      expr: baremetal_temp_celsius > 96
      for: 5m
      labels:
        severity: warning
    - alert: LowCPUTemperature
      annotations:
        summary: "The temperature of CPU {{ $labels.entity_id }} on baremetal node {{ $labels.node_name }} is too low"
        description: "The temperature of CPU {{ $labels.entity_id }} on baremetal node {{ $labels.node_name }} has been too low for the past 5 minutes. Last measurement: {{ $value }}"
      expr: baremetal_temp_celsius < 3
      for: 5m
      labels:
        severity: warning
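The `for: 5m` clause means an alert does not fire on the first sample past the threshold: it stays pending until the expression has held at every evaluation for five minutes, and a single sample back in range resets the timer. The Python sketch below models that behaviour for one time series. The thresholds come from the rules above; the 60-second evaluation interval and the `alert_states` helper are assumptions for illustration, not Prometheus code.

```python
HOLD_SECONDS = 5 * 60  # the rules' "for: 5m"


def alert_states(samples, predicate, hold_seconds=HOLD_SECONDS):
    """Return the alert state after each evaluation.

    samples: list of (timestamp_seconds, value) pairs in evaluation order.
    predicate: the alert condition, e.g. lambda v: v > 96.
    """
    states = []
    active_since = None  # time of the first evaluation in the current streak
    for timestamp, value in samples:
        if not predicate(value):
            active_since = None  # condition broken: the timer resets
            states.append("inactive")
            continue
        if active_since is None:
            active_since = timestamp
        held = timestamp - active_since
        states.append("firing" if held >= hold_seconds else "pending")
    return states


def every_minute(values):
    """Pair values with timestamps, assuming one evaluation per 60 seconds."""
    return [(i * 60, v) for i, v in enumerate(values)]


def high(value):
    return value > 96  # HighCPUTemperature condition


def low(value):
    return value < 3  # LowCPUTemperature condition
```

For example, seven consecutive readings of 97 produce five pending states followed by two firing states, while a single reading of 90 in the middle of such a run restarts the five-minute wait.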
