
ML Detection of Duration Anomalies #61348

Closed
andrewvc opened this issue Jul 24, 2019 · 13 comments
Assignees: andrewvc
Labels: enhancement, Team:Uptime, test-plan, test-plan-ok, v7.7.0

Comments

@andrewvc (Contributor)

This issue is to track adding ML support to our duration charts on the monitor details page. This is a great way to start integrating ML into Uptime. We'd like to start showing:

  1. The baseline average duration
  2. Spikes and drops in response times, highlighted as anomalous. For this initial MVP we should treat these as warnings, not errors (see the sketch below).
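
For reference, a minimal sketch of what an anomaly detection job for monitor duration could look like, written as the body of an Elasticsearch ML put-job request. The detector, bucket span, and partitioning here are assumptions for illustration, not the configuration that actually shipped; the Heartbeat field names (monitor.duration.us, monitor.id) are the real ones.

```ts
// Illustrative only: a possible body for an Elasticsearch ML put-job request
// that models monitor duration. The detector, bucket span, and partitioning
// are assumptions for this sketch, not the job configuration that shipped.
const durationAnomalyJob = {
  description: 'Uptime: anomalous monitor response times',
  analysis_config: {
    bucket_span: '15m',
    detectors: [
      {
        // high_mean flags buckets whose average duration is unusually high
        function: 'high_mean',
        field_name: 'monitor.duration.us',
        // one model per monitor, so a slow monitor does not mask a fast one
        partition_field_name: 'monitor.id',
      },
    ],
    influencers: ['monitor.id'],
  },
  data_description: {
    time_field: '@timestamp',
  },
};
```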

Open questions:

Do we show these as warnings or as info? Visually, do we communicate this with yellow or a more neutral color?

Implementation Notes
Check with the APM & SIEM ML integrations on how they (a rough sketch follows this list):

  • Check for the license; recommend a Trial if not already on Trial or Platinum
  • Check that there are sufficient resources to run ML
  • Enable ML (with a set of ML jobs)
  • Provide error messages if the request exceeds available resources
  • Stop ML (and delete the ML jobs)
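
A rough sketch of the license gate and job lifecycle, assuming the 7.x @elastic/elasticsearch client is used directly; a Kibana plugin would normally go through the licensing and ML plugins instead, and a real setup would also create and start a datafeed for the job. The function names and wiring are hypothetical.

```ts
import { Client } from '@elastic/elasticsearch';

// Sketch only: license check plus ML job create/open and close/delete,
// using the 7.x @elastic/elasticsearch client directly.
const es = new Client({ node: 'http://localhost:9200' });

export async function setUpDurationJob(jobId: string, jobBody: Record<string, unknown>) {
  // 1. Check the license; ML requires Platinum/Enterprise or an active Trial.
  const { body: licenseResponse } = await es.license.get();
  const licenseType = licenseResponse.license.type;
  if (!['platinum', 'enterprise', 'trial'].includes(licenseType)) {
    throw new Error('Anomaly detection requires a Platinum or Trial license');
  }

  // 2. Create and open the anomaly detection job; ES errors (e.g. not enough
  //    ML node capacity) should be surfaced to the user.
  await es.ml.putJob({ job_id: jobId, body: jobBody });
  await es.ml.openJob({ job_id: jobId });
}

export async function tearDownDurationJob(jobId: string) {
  // 3. When the user disables the integration, stop and delete the job.
  await es.ml.closeJob({ job_id: jobId, force: true });
  await es.ml.deleteJob({ job_id: jobId });
}
```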

@katrin-freihofner this might be good to add to our mocks for our redesigned monitor details page.

@andrewvc changed the title from "ML Anomaly Mappings" to "ML Duration Anomalies" on Aug 8, 2019
@andrewvc changed the title from "ML Duration Anomalies" to "ML Detection of Duration Anomalies" on Aug 8, 2019
@grabowskit

Sample of SIEM/ML UI integration: [screenshot]
Sample of APM/ML UI integration: [screenshot]

@grabowskit

Example of APM chart with integrated ML results: [screenshot]

@katrin-freihofner (Contributor)

We have something very similar for the logs UI planned:

[screenshot]

I think this could work for uptime too.

@katrin-freihofner (Contributor)

This is how it could turn out for the duration chart:

[screenshot]

@Titch990 (Contributor) commented Oct 8, 2019

@katrin-freihofner I just want to say that all these look great!

@Titch990 (Contributor) commented Oct 8, 2019

I also have a couple of comments about the UI text in some of the earlier screenshots above. I'm not sure how far down the line these changes are, and hence whether it is appropriate to comment yet. Also, I wonder who is a good person to raise these points with initially? @gchaps perhaps?

Point 1: I think the text in both the SIEM Anomaly detection settings dialog and the APM Enable anomaly detection dialog could be tightened up a bit. I'm happy to help with this.

Point 2: I'm a bit concerned about the use of the word "Integrations" in the APM/ML UI integration. I may be worrying needlessly, and perhaps the term has already been agreed, but we also already have a different kind of "Integration" in Observability. This other "Integration" will appear in the UI and documentation shortly and may cause confusion.

This other integration is an integration with a third-party service, for example GCP, Docker, or MySQL. It refers to the mechanism by which we set up (or integrate with) a new data source to deliver logs and metrics data. This usage of "integration" seems to be fairly standard across many third-party vendors, not just us.

So in the "Sample of APM/ML UI integration" screenshot above, it's possible that the user may expect the other kind of Observability "integration" rather than what I think is an integration with our machine learning app. I think "Integration" is a very generic term, so perhaps it may be better to choose a more specific term that focuses on what kind of integration this is, or what problem the integration solves for the user, for example "ML integrations" or "Anomaly detection". I think in the Logs app, the Machine learning integration is on a tab called "Analysis", so perhaps that's something else to consider and use consistently across the Observability apps?

@drewpost

This is how it could turn out for the duration chart:

[screenshot]

@katrin-freihofner In this example with multiple series, how would the user know which series the anomaly highlighting pertains to?

@katrin-freihofner (Contributor)

@drewpost As discussed, these (red and yellow) indicators suggest that there is an anomaly. Similar to the Logs UI, there needs to be a tooltip and a button to drill down to the ML view for further details.
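
To make that drill-down concrete, here is a hypothetical sketch of such a tooltip-plus-link marker using EUI components; the component, its props, and how the ML Anomaly Explorer URL is obtained are all assumptions, not the shipped implementation.

```tsx
import React from 'react';
import { EuiButtonEmpty, EuiToolTip } from '@elastic/eui';

// Hypothetical marker for an anomalous bucket on the duration chart: a tooltip
// showing the anomaly score plus a link into the ML app for further details.
interface AnomalyMarkerProps {
  score: number; // ML anomaly score, 0-100
  timestamp: string; // bucket start time, already formatted for display
  mlExplorerUrl: string; // deep link to the ML Anomaly Explorer for this job
}

export const AnomalyMarker: React.FC<AnomalyMarkerProps> = ({ score, timestamp, mlExplorerUrl }) => (
  <EuiToolTip content={`Anomaly score ${Math.round(score)} at ${timestamp}`}>
    <EuiButtonEmpty size="xs" iconType="machineLearningApp" href={mlExplorerUrl}>
      View in Machine Learning
    </EuiButtonEmpty>
  </EuiToolTip>
);
```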

@shahzad31 assigned and unassigned shahzad31 on Feb 10, 2020
@katrin-freihofner (Contributor)

Design issue

@andrewvc transferred this issue from elastic/uptime on Mar 25, 2020
@andrewvc added the enhancement, Team:Uptime, test-plan, and v7.7.0 labels on Mar 25, 2020
@elasticmachine (Contributor)

Pinging @elastic/uptime (Team:uptime)

@andrewvc self-assigned this on Mar 30, 2020
@andrewvc (Contributor, Author)

Fixed in #59785

@andrewvc added the test-plan-ok label on Mar 30, 2020
@andrewvc (Contributor, Author)

Passed test plan perfectly. Seemed to detect anomalies. Creation / linking / deletion of jobs went smoothly.
