Create Prometheus alerts for applications health #18

Open
4 tasks
ghost opened this issue Jul 3, 2019 · 1 comment

ghost commented Jul 3, 2019

Our current Prometheus installation has alerts for default Kubernetes cluster health, such as CPU and memory usage and Pod status, but we don't have alerts for the internals of applications like:

We should create those alerts so that we are notified of any issue as soon as possible.

(Please add any other metrics and alerts that you think are necessary)

@ghost ghost changed the title Create Prometheus alerts for applicaions health Create Prometheus alerts for applications health Jul 3, 2019
Member

yatharthranjan commented Jul 3, 2019

Some things that could be useful:

  • Kafka connector health.
  • HDFS health (number of live data nodes) and remaining capacity.
  • Disk usage for block devices.
  • HDFS restructure run duration and run failures.
  • Kafka broker throughput and under-replicated topics. These are shown in kafka-manager too, but there is no alerting there.

Edit:

  • Kafka producer and consumer metrics could also be useful. For example, consumer lag.
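
To make the suggestions above a bit more concrete, here is a rough sketch of how a few of them could be written down as Prometheus alerting rules. It is only an illustration under several assumptions: the disk rule assumes node_exporter's filesystem metrics are scraped, `kafka_consumergroup_lag` assumes a Kafka exporter that exposes a metric with that name, `hdfs_namenode_live_datanodes` is a hypothetical metric name that depends on how the HDFS JMX metrics are exported, and every threshold and duration is a placeholder. The script prints JSON, which can usually be loaded as YAML, but the output should be checked with `promtool check rules` before use.

```python
import json


def alert_rule(name, expr, duration, severity, summary):
    """Return one alerting rule using the field names of a Prometheus rule file."""
    return {
        "alert": name,
        "expr": expr,
        "for": duration,
        "labels": {"severity": severity},
        "annotations": {"summary": summary},
    }


def build_rule_group():
    """Build a rule group covering a few of the suggested application alerts."""
    rules = [
        alert_rule(
            "DiskSpaceLow",
            # Assumes node_exporter filesystem metrics; the 10% threshold is a placeholder.
            "node_filesystem_avail_bytes / node_filesystem_size_bytes < 0.10",
            "10m",
            "warning",
            "Less than 10% disk space left on {{ $labels.device }}",
        ),
        alert_rule(
            "KafkaConsumerLagHigh",
            # Assumed metric name; it depends on the Kafka exporter in use.
            "sum by (consumergroup, topic) (kafka_consumergroup_lag) > 10000",
            "15m",
            "warning",
            "Consumer group {{ $labels.consumergroup }} is lagging on {{ $labels.topic }}",
        ),
        alert_rule(
            "HdfsLiveDataNodesLow",
            # Hypothetical metric name; it depends on the HDFS JMX exporter configuration.
            "hdfs_namenode_live_datanodes < 3",
            "5m",
            "critical",
            "Fewer live HDFS data nodes than expected",
        ),
    ]
    return {"groups": [{"name": "application-health", "rules": rules}]}


if __name__ == "__main__":
    # Print the rule group so it can be redirected into a rules file and validated.
    print(json.dumps(build_rule_group(), indent=2))
```

The same helper could be reused for the Kafka connector and HDFS restructure alerts once we know which metrics those components actually expose.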
