-
Notifications
You must be signed in to change notification settings - Fork 18
Add missing feature alert if recent feature data is missing #248
Add missing feature alert if recent feature data is missing #248
Conversation
Why don't show yellow warning if 1 feature data point missing? So user can know the data ingestion may have problem earlier. |
It may be too sensitive if latest 1 data point is missing, and ingestion is recovered soon. This may happen if data ingestion is delayed occasionally. If latest 2 consecutive data points are missing, it can be considered as consistent behavior and we should let user be aware of this. |
Issue associated with this: #249 |
… enabled time; fix some minor bugs
After exploring, I didn't find good way to remove the connecting line. Issue created: #250 |
Yeah, not a big deal, thanks for looking into it! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
props.detectorInterval.interval, | ||
getFeatureMissingAnnotationDateRange( | ||
props.dateRange, | ||
props.detectorEnabledTime |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So we will only show missing data alerts from latest detector enabled time? Is it possible we show alerts for data before?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, only since latest enabled time. Ideally, we should show alert after detector 1st enabled time, and we should not show data during time period when detector disabled. However, there is no way to keep the 1st enabled time, also no way to keep track of which time period detector is disabled. If detector is disabled/enabled multiple times, it is hard for us to ignore the disabled time periods. That's why we only show alerts since latest enabled time.
// If array size is 100K, `findAnomalyWithMaxAnomalyGrade` | ||
// takes less than 2ms by average, while `Array#reduce` | ||
// takes about 16ms by average and`Array#sort` | ||
const calculateSampleWindowsWithMaxDataPoints = ( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Feature data points are already sampled, should we sample again for the missing data points? Can we show missing alert annotation for every feature data points.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using sampled feature data points in will cause that some missing data points can not be captured, because when we sample anomaly results, we collect all the data for fixed time range, if both missing data points and existing data points are there in that range, we will not be able to catch the missing ones.
) { | ||
const isExisting = findTimeExistsInWindow( | ||
existingTimes, | ||
getRoundedTimeInMin(currentTime), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why get rounded time in minute? Are the existingTimes
rounded time? The feature data point shown on feature breakdown chart is using rounded time? I remember only the live chart on AD result page using rounded time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch. I guess I somehow changed code to use getFloorPlotTime
for existingTimes
, should use getRoundedTimeInMin
instead. AD result page is using rounded number for anomaly. The reason to use rounded time is that some data points does not have strict time intervals between each other, usually a small number offset in timestamp, this may cause that some data points doesn't show up in expected time period due to the offset, and then we think there is missing data in that time period.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will use rounded or floor time in both AD result chart and feature breakdown chart? If user deep dive into anomaly result by zooming in AD result chart and check raw data in Kibana discover, it may bring some confusion as AD result chart using rounded time. User may find raw data can't match AD result chart.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sorry for the confusion. Currently, rounded number is used for anomaly grade, confidence on AD result page, not for timestamp. Source of Math.round usage. Rounded time in this PR is for alert annotation timestamp only. It won't cause any mismatch.
: featureData | ||
.map(feature => getFloorPlotTime(feature.startTime)) | ||
.filter(featureTime => featureTime != undefined); | ||
for ( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The time complexity of this block is O(m*n). m is intervals, n is existingTimes count. Have you tested the performance? Suggest to use binary search in function findTimeExistsInWindow
as existingTimes is ordered. If the performance is ok, we can add some todo currently.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
currently with 7 days data performance seems ok. will change to use binary search or other improvement for findTimeExistsInWindow
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Thanks for the change!
Issue #, if available:
Description of changes:
Add missing feature alert if recent feature data is missing
If latest 2 feature data points of any feature are missing, we show yellow warning like below
If latest 3 feature data points of any feature are missing, we show red alert below:
Regardless of yellow or red alert, as long as there exists missing feature data, there is annotation on the feature chart.
Meanwhile, I will check with tech writer on wording change >> Done
When data point is missing, below is shown:
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.