[ML] Avoid possible datafeed infinite loop with filtering aggregations #104722
Conversation
When advancing a datafeed's search interval past a period with no data, always advance by at least one time chunk. This avoids a problem where the simple aggregation used to advance time might think there is data while the datafeed's own aggregation has filtered it all out. Prior to this change, this could cause the datafeed to go into an infinite loop. After this change the worst that can happen is that we step slowly through a period where filtering inside the datafeed's aggregation is causing empty buckets. Fixes elastic#104699
Pinging @elastic/ml-core (Team:ML)
Hi @droberts195, I've created a changelog YAML for you.
// where setUpChunkedSearch() thinks data exists at the current start time
// while the datafeed's own aggregation doesn't, at least we'll step forward
// a little bit rather than go into an infinite loop.
currentStart += chunkSpan;
Just double-checking, but adding this wouldn't accidentally skip some data if the data did exist?
No, because if there was data then the test on line 174 should have passed and we would have returned the data on line 175, exiting the function before this point. On line 172 the time period we searched was [currentStart, currentStart + chunkSpan), so we would already have found the data. This is why it should be completely safe to step forward by chunkSpan. It might be possible to step forward even further, which is what line 197 is trying to do.
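The advancement logic discussed above can be sketched as follows. This is a simplified illustration, not the actual Elasticsearch code: `currentStart` and `chunkSpan` come from the quoted snippet, while `DataProbe` and `advancePastEmptyChunks` are hypothetical names standing in for the datafeed's filtering aggregation check and the chunked-search time advancement. The key property of the fix is that every iteration moves `currentStart` forward by at least one `chunkSpan`, so even if a cheap existence check disagrees with the filtering aggregation, the loop cannot spin forever.

```java
public class ChunkAdvanceSketch {

    // Hypothetical predicate: does the datafeed's own (filtering) aggregation
    // return any data in the interval [start, start + span)?
    interface DataProbe {
        boolean hasData(long start, long span);
    }

    // Advance currentStart past chunks where the filtering aggregation finds
    // no data. Each iteration advances by exactly one chunkSpan, guaranteeing
    // forward progress even when a simpler existence check claims data is
    // present but the filtering aggregation has filtered it all out.
    static long advancePastEmptyChunks(long currentStart, long end,
                                       long chunkSpan, DataProbe probe) {
        while (currentStart < end && !probe.hasData(currentStart, chunkSpan)) {
            currentStart += chunkSpan; // worst case: slow stepping, never a loop
        }
        return currentStart;
    }

    public static void main(String[] args) {
        // Example: filtered data only appears from t=3000 onward; chunks of 1000.
        DataProbe probe = (start, span) -> start + span > 3000;
        long result = advancePastEmptyChunks(0, 10_000, 1000, probe);
        System.out.println(result); // prints 3000, the first chunk with data
    }
}
```

The worst case, as the PR description notes, is that the datafeed steps slowly (one chunk at a time) through a period where filtering produces only empty buckets, which is a bounded cost rather than a hang.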
💚 Backport successful