Commit

Added alerts for failing connectors and tasks as these can not be automatically recovered and need manual intervention.

Signed-off-by: Laszlo I. Hunyady <[email protected]>
Laszlo committed Jul 8, 2024
1 parent c8b5d9d commit 943d27b
Showing 2 changed files with 28 additions and 0 deletions.
14 changes: 14 additions & 0 deletions examples/metrics/prometheus-install/prometheus-rules.yaml
@@ -159,6 +159,20 @@ spec:
       annotations:
         summary: 'All Kafka Connect containers down or in CrashLookBackOff status'
         description: 'All Kafka Connect containers have been down or in CrashLookBackOff status for 3 minutes'
+    - alert: ConnectFailedConnector
+      expr: sum(kafka_connect_connector_status{status="failed"}) > 0
+      labels:
+        severity: major
+      annotations:
+        summary: 'Kafka Connect Connector Failure'
+        description: 'Some connectors are failing, this can not be automatically recovered.'
+    - alert: ConnectFailedTask
+      expr: sum(kafka_connect_worker_connector_failed_task_count) > 0
+      labels:
+        severity: major
+      annotations:
+        summary: 'Kafka Connect Task Failure'
+        description: 'Some tasks are failing, this can not be automatically recovered.'
   - name: bridge
     rules:
     - alert: BridgeContainersDown
@@ -159,6 +159,20 @@ spec:
       annotations:
         summary: 'All Kafka Connect containers down or in CrashLookBackOff status'
         description: 'All Kafka Connect containers have been down or in CrashLookBackOff status for 3 minutes'
+    - alert: ConnectFailedConnector
+      expr: sum(kafka_connect_connector_status{status="failed"}) > 0
+      labels:
+        severity: major
+      annotations:
+        summary: 'Kafka Connect Connector Failure'
+        description: 'Some connectors are failing, this can not be automatically recovered.'
+    - alert: ConnectFailedTask
+      expr: sum(kafka_connect_worker_connector_failed_task_count) > 0
+      labels:
+        severity: major
+      annotations:
+        summary: 'Kafka Connect Task Failure'
+        description: 'Some tasks are failing, this can not be automatically recovered.'
   - name: bridge
     rules:
     - alert: BridgeContainersDown
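
The two rules added in this commit could be exercised with a promtool rule unit test along the following lines. This is a minimal sketch, not part of the commit. It assumes the rule groups under `spec.groups` have been copied into a plain Prometheus rules file (promtool expects `groups:` at the top level, not a PrometheusRule resource); the file name `connect-rules.yaml` and the `connector="my-connector"` label are hypothetical.

```yaml
# Hypothetical promtool unit test for the ConnectFailedConnector and
# ConnectFailedTask alerts. Run with: promtool test rules connect-rules-test.yaml
rule_files:
  - connect-rules.yaml   # assumed: plain rules file holding the groups from spec.groups

evaluation_interval: 1m

tests:
  - interval: 1m
    input_series:
      # One connector reporting the "failed" status (label names are assumptions).
      - series: 'kafka_connect_connector_status{connector="my-connector",status="failed"}'
        values: '1 1 1'
      # One connector reporting a failed task.
      - series: 'kafka_connect_worker_connector_failed_task_count{connector="my-connector"}'
        values: '1 1 1'
    alert_rule_test:
      - eval_time: 1m
        alertname: ConnectFailedConnector
        exp_alerts:
          # sum() drops the series labels, so only the rule's own label remains.
          - exp_labels:
              severity: major
            exp_annotations:
              summary: 'Kafka Connect Connector Failure'
              description: 'Some connectors are failing, this can not be automatically recovered.'
      - eval_time: 1m
        alertname: ConnectFailedTask
        exp_alerts:
          - exp_labels:
              severity: major
            exp_annotations:
              summary: 'Kafka Connect Task Failure'
              description: 'Some tasks are failing, this can not be automatically recovered.'
```

Neither rule has a `for:` clause, so each alert is expected to fire on the first evaluation where the summed value is above zero.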