docs: add prometheus-operator example for jobset #629

Merged (4 commits) on Jul 28, 2024
67 changes: 67 additions & 0 deletions examples/prometheus-operator/README.md
@@ -0,0 +1,67 @@
# Prometheus Operator installation steps

### Install the Prometheus Operator

Follow the [getting-started guide](https://github.com/prometheus-operator/prometheus-operator/blob/main/Documentation/user-guides/getting-started.md) to install the Prometheus Operator, then check that its pod is running:

```bash
# Verify that the Prometheus Operator pod is running
root@VM-0-5-ubuntu:/home/ubuntu/jobset/examples/simple# kubectl get pods
NAME                                   READY   STATUS    RESTARTS   AGE
prometheus-operator-76469b7f8c-5wb8x   1/1     Running   0          12h
```

### Install the ServiceMonitor CR for JobSet System

Follow the [documentation](https://jobset.sigs.k8s.io/docs/installation/#optional-add-metrics-scraping-for-prometheus-operator) or run `make prometheus` to install the ServiceMonitor CR.

```bash
root@VM-0-5-ubuntu:/home/ubuntu/jobset# make prometheus
kubectl apply --server-side -k config/prometheus
role.rbac.authorization.k8s.io/prometheus-k8s serverside-applied
rolebinding.rbac.authorization.k8s.io/prometheus-k8s serverside-applied
servicemonitor.monitoring.coreos.com/controller-manager-metrics-monitor serverside-applied
```

```bash
root@VM-0-5-ubuntu:/home/ubuntu/jobset# kubectl get ServiceMonitor -n jobset-system
NAME                                 AGE
controller-manager-metrics-monitor   7d11h
```
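
The Prometheus CR in `prometheus.yaml` selects ServiceMonitors by the label `control-plane: controller-manager`, so it can be worth confirming that the ServiceMonitor created above carries that label. A minimal sketch, assuming the resource name and namespace from the output above; the helper name `show_monitor_labels` is hypothetical and only wraps a standard `kubectl get --show-labels` call:

```bash
# Namespace and ServiceMonitor name are taken from the output above.
NAMESPACE="jobset-system"
MONITOR="controller-manager-metrics-monitor"

# Hypothetical helper: print the ServiceMonitor together with its labels.
show_monitor_labels() {
  kubectl get servicemonitor "${MONITOR}" -n "${NAMESPACE}" --show-labels
}

# Usage (requires access to the cluster):
#   show_monitor_labels
```

If the `LABELS` column does not include `control-plane=controller-manager`, the Prometheus instance defined later would likely not pick up this ServiceMonitor.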

### Install the Prometheus CR for JobSet System

```bash
root@VM-0-5-ubuntu:/home/ubuntu# kubectl apply -f prometheus.yaml
serviceaccount/prometheus-jobset created
clusterrole.rbac.authorization.k8s.io/prometheus-jobset created
clusterrolebinding.rbac.authorization.k8s.io/prometheus-jobset created
prometheus.monitoring.coreos.com/jobset-metrics created
service/jobset-metrics created
```

```bash
root@VM-0-5-ubuntu:/home/ubuntu# kubectl get pods -n jobset-system
NAME                                         READY   STATUS    RESTARTS   AGE
jobset-controller-manager-76767b599b-v8wcc   2/2     Running   0          6d22h
prometheus-jobset-metrics-0                  2/2     Running   0          17s
root@VM-0-5-ubuntu:/home/ubuntu# kubectl get svc -n jobset-system
NAME                                        TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)          AGE
jobset-controller-manager-metrics-service   ClusterIP   10.96.187.196   <none>        8443/TCP         7d11h
jobset-metrics                              NodePort    10.96.217.176   <none>        9090:30900/TCP   28s
jobset-webhook-service                      ClusterIP   10.96.252.163   <none>        443/TCP          7d11h
prometheus-operated                         ClusterIP   None            <none>        9090/TCP         28s
```
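
Rather than polling `kubectl get pods` by hand, the readiness check can be scripted. This is a sketch under assumptions: it reuses the pod name `prometheus-jobset-metrics-0` and the CR name `jobset-metrics` from the output above, the 120-second timeout is an arbitrary choice, and the helper name `wait_for_prometheus` is hypothetical:

```bash
# Names are taken from the output above; the timeout is an arbitrary choice.
NAMESPACE="jobset-system"
PROM_CR="jobset-metrics"
PROM_POD="prometheus-jobset-metrics-0"
WAIT_TIMEOUT="120s"

# Hypothetical helper: confirm the Prometheus CR exists, then block until
# its pod reports the Ready condition or the timeout expires.
wait_for_prometheus() {
  kubectl get prometheus "${PROM_CR}" -n "${NAMESPACE}" &&
    kubectl wait --for=condition=Ready "pod/${PROM_POD}" \
      -n "${NAMESPACE}" --timeout="${WAIT_TIMEOUT}"
}

# Usage (requires access to the cluster):
#   wait_for_prometheus
```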

### View metrics using the Prometheus UI

```bash
root@VM-0-5-ubuntu:/home/ubuntu# kubectl port-forward services/jobset-metrics 39090:9090 --address 0.0.0.0 -n jobset-system
Forwarding from 0.0.0.0:39090 -> 9090
```

If the cluster runs on kind and the NodePort is not mapped to the host, use `kubectl port-forward` as shown above.
Prometheus is then reachable in a browser at `http://{ecs public IP}:39090/graph`, where `{ecs public IP}` stands for the public IP of the machine running the port-forward.

![prometheus](./prometheus.png?raw=true)
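
The same data can also be checked from the command line through the Prometheus HTTP API. A minimal sketch, assuming the port-forward above is still running and reachable at `localhost:39090` (substitute the host's address otherwise); the helper names `prom_targets` and `prom_query` are hypothetical, and `up` is used here only as a generic query that exists on any Prometheus server:

```bash
# Base URL of the forwarded Prometheus service; override PROM_URL if needed.
PROM_URL="${PROM_URL:-http://localhost:39090}"

# Hypothetical helper: list active scrape targets and their health.
prom_targets() {
  curl -s "${PROM_URL}/api/v1/targets?state=active"
}

# Hypothetical helper: run an instant PromQL query passed as the first argument.
prom_query() {
  curl -sG "${PROM_URL}/api/v1/query" --data-urlencode "query=$1"
}

# Usage (requires the port-forward to be active):
#   prom_targets
#   prom_query up
```

Both helpers print JSON. The JobSet controller's metrics endpoint should appear among the active targets once the ServiceMonitor has been picked up.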
Binary file added examples/prometheus-operator/prometheus.png
78 changes: 78 additions & 0 deletions examples/prometheus-operator/prometheus.yaml
@@ -0,0 +1,78 @@
apiVersion: v1
kind: ServiceAccount
metadata:
  name: prometheus-jobset
  namespace: jobset-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: prometheus-jobset
rules:
  - apiGroups: [""]
    resources:
      - nodes
      - nodes/metrics
      - services
      - endpoints
      - pods
    verbs: ["get", "list", "watch"]
  - apiGroups: [""]
    resources:
      - configmaps
    verbs: ["get"]
  - nonResourceURLs: ["/metrics"]
    verbs: ["get"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: prometheus-jobset
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: prometheus-jobset
subjects:
  - kind: ServiceAccount
    name: prometheus-jobset
    namespace: jobset-system
---
# More details can be found at
# https://github.com/prometheus-operator/prometheus-operator/blob/main/Documentation/api.md#monitoring.coreos.com/v1.Prometheus
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: jobset-metrics
  namespace: jobset-system
spec:
  serviceAccountName: prometheus-jobset
  # Associated ServiceMonitor selector
  serviceMonitorSelector:
    # Need to match the label in ServiceMonitor
    # https://github.com/kubernetes-sigs/jobset/blob/main/config/components/prometheus/monitor.yaml#L7
    matchLabels:
      control-plane: controller-manager
  resources:
    requests:
      memory: 400Mi
  enableAdminAPI: false
---
apiVersion: v1
kind: Service
metadata:
  name: jobset-metrics
  namespace: jobset-system
spec:
  type: NodePort
  # Port mapping. Note: when testing with kind, if the Docker port mappings were not
  # exposed when the cluster was created, use kubectl port-forward instead:
  # kubectl port-forward services/jobset-metrics 39090:9090 --address 0.0.0.0 -n jobset-system
  ports:
    - name: web
      nodePort: 30900
      port: 9090
      protocol: TCP
      targetPort: web
  # Need to match the name in Prometheus
  selector:
    prometheus: jobset-metrics