[APM] Threshold alerts in APM #170

Closed
nehaduggal opened this issue Nov 18, 2019 · 4 comments
nehaduggal commented Nov 18, 2019

Summary of the problem

We are adding support for additional threshold-based alerts in APM. A user will now be able to create an alert when the response time or the error rate of the monitored application breaches a configured threshold.

Ideal solution (optional)

Create an alert workflow:

  • Find how to create an alert.
  • Ability to set a threshold alert for response times and error count along with severity from the UI.
  • Set a minimum violation duration in order to prevent noisy alerts from firing.
  • View existing alerts that are set up for that particular service.
  • Ability to create actions when an alert is triggered. Actions include email/Slack notifications with graphs (where applicable) of the violated threshold, and integration with tools like PagerDuty/Jira.
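
The threshold and minimum violation duration described above can be sketched roughly as follows. This is only an illustration of the idea, not the Kibana alerting API: `ThresholdRule`, `Sample`, and `shouldFire` are hypothetical names, and the sketch assumes a breach means the metric is strictly above the threshold.

```typescript
// Hypothetical types: none of these names come from Kibana or the APM plugin.
type Severity = "warning" | "critical";

interface ThresholdRule {
  metric: "responseTimeMs" | "errorCount";
  threshold: number; // assumption: alert when the value is strictly above this
  minViolationMs: number; // the breach must last at least this long
  severity: Severity;
}

interface Sample {
  timestamp: number; // epoch milliseconds
  value: number;
}

// Returns true when the most recent samples have stayed above the threshold
// for at least rule.minViolationMs. Samples must be sorted by timestamp.
function shouldFire(rule: ThresholdRule, samples: Sample[]): boolean {
  if (samples.length === 0) return false;
  const last = samples[samples.length - 1];
  let breachStart: number | null = null;
  // Walk backwards to find where the current uninterrupted breach began.
  for (let i = samples.length - 1; i >= 0; i--) {
    if (samples[i].value > rule.threshold) {
      breachStart = samples[i].timestamp;
    } else {
      break;
    }
  }
  if (breachStart === null) return false;
  return last.timestamp - breachStart >= rule.minViolationMs;
}
```

With a two-minute minimum duration, a single slow sample does not fire the alert, while a breach sustained over the full window does. That is the noise reduction the workflow asks for.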

Visualize triggered alerts:

  • Visualize alert violations on existing charts and graphs, so that a customer can see the time range during which the metric they were alerting on was in violation of the threshold.
  • An activity stream to view all of the alerts that have fired for that particular application/service.
  • Indicate a node on the service map if it triggered an alert in the selected timeframe.
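
One way to derive the time ranges to highlight on a chart is to collapse consecutive breaching data points into intervals. The sketch below is an assumption about how that could work, not a description of the implementation. `Point`, `Interval`, and `violationIntervals` are hypothetical names, and a breach is again assumed to mean a value strictly above the threshold.

```typescript
// Hypothetical helper types for illustration only.
interface Point {
  timestamp: number; // epoch milliseconds
  value: number;
}

interface Interval {
  start: number;
  end: number;
}

// Groups consecutive points above the threshold into [start, end] intervals.
// Points must be sorted by timestamp.
function violationIntervals(points: Point[], threshold: number): Interval[] {
  const intervals: Interval[] = [];
  let current: Interval | null = null;
  for (const p of points) {
    if (p.value > threshold) {
      if (current === null) {
        // A new violation starts at this point.
        current = { start: p.timestamp, end: p.timestamp };
      } else {
        // The ongoing violation extends to this point.
        current.end = p.timestamp;
      }
    } else if (current !== null) {
      // The metric recovered, so close the open interval.
      intervals.push(current);
      current = null;
    }
  }
  // Keep a violation that is still open at the end of the series.
  if (current !== null) intervals.push(current);
  return intervals;
}
```

Each returned interval could then be drawn as a shaded band on the existing chart, and a non-empty result for the selected timeframe could drive the indicator on the service map node.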

List known (technical) restrictions and requirements
For example: has to be scalable from 0-15k containers

Are there any pages or actions that relate to this feature?
For example: link to other solutions for further investigation

  • Refer to any related/depending issues
  • If this is already scheduled add the appropriate version label
  • Make sure the design label is added

formgeist commented Jan 16, 2020

I've been putting together some screens for the threshold alerts feature, which I'd like to share. Since there are a number of flows and screens to go through, I've recorded a short video where I walk through it all.

📹Walkthrough video

Alternatively, you can go through the Figma prototype yourself and leave comments on the different screens if you have feedback, or write it up here.

🖥Figma prototype updated

@formgeist

Design update, 16 Mar 2020

🖥Figma prototype


Made some updates to the prototype and the screens it contains, primarily to make the wording and the expression editor consistent with the other Observability apps.

[Screenshots: Create flyout; Errors]

Additionally, I made some minor visual changes to the alert indicator on the service map:

[Screenshot: Selected service]

@formgeist changed the title from "Design: Threshold alerts in APM" to "[APM] Threshold alerts in APM" on Mar 16, 2020
@formgeist

Closing as I've briefed @dgieselaar on the latest changes earlier and this design will, for the most part, be implemented in elastic/kibana#59566 (already in progress).

The remaining tasks will be in separate issues to be dealt with in a later iteration and will be referenced back to this issue.

Opened a separate issue (#228) for the service map alert indication, to be implemented in a later iteration.
