[ML] Add an important note about a gotcha with the delayed data check (elastic#104725)

Recently a user saw spurious delayed data warnings. These turned
out to be caused by accidentally setting `summary_count_field_name`
to a field that was always zero. This meant that every document was
considered delayed.
droberts195 committed Jan 25, 2024
1 parent d90f198 commit 1682a09
Showing 1 changed file with 16 additions and 0 deletions.
@@ -52,6 +52,22 @@ for the periods where these delays occur:
[role="screenshot"]
image::images/ml-annotations.png["Delayed data annotations in the Single Metric Viewer"]

[IMPORTANT]
====
As the `doc_count` from an aggregation is compared with the
bucket results of the job, the delayed data check will not work correctly in the
following cases:

* if the datafeed uses aggregations and the job's `analysis_config` does not have its
`summary_count_field_name` set to `doc_count`,
* if the datafeed is _not_ using aggregations and `summary_count_field_name` is set to
any value.

If the datafeed is using aggregations then it's highly likely that the job's
`summary_count_field_name` should be set to `doc_count`. If
`summary_count_field_name` is set to any value other than `doc_count`, the
delayed data check for the datafeed must be disabled.
====
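
The two cases in the note above can be checked mechanically before a datafeed is created. The following is a minimal sketch, not part of this commit: the helper names (`uses_aggregations`, `delayed_data_check_is_reliable`, `with_safe_delayed_data_check`) and the example field names (`event_count`, `responsetime`, `buckets`) are hypothetical. It assumes the job and datafeed configurations are plain dictionaries shaped like the request bodies of the create job and create datafeed APIs, that `aggs` is accepted as an alias of `aggregations`, and that the check is disabled by setting `delayed_data_check_config.enabled` to `false`.

```python
from typing import Any, Dict

Config = Dict[str, Any]


def uses_aggregations(datafeed: Config) -> bool:
    """Return True if the datafeed config defines aggregations.

    Assumption: both "aggregations" and its alias "aggs" are checked.
    """
    return bool(datafeed.get("aggregations") or datafeed.get("aggs"))


def delayed_data_check_is_reliable(job: Config, datafeed: Config) -> bool:
    """Apply the two rules from the IMPORTANT note (hypothetical helper)."""
    summary_field = job.get("analysis_config", {}).get("summary_count_field_name")
    if uses_aggregations(datafeed):
        # Aggregated datafeed: the job must summarize on doc_count.
        return summary_field == "doc_count"
    # Non-aggregated datafeed: summary_count_field_name must not be set at all.
    return summary_field is None


def with_safe_delayed_data_check(job: Config, datafeed: Config) -> Config:
    """Return a copy of the datafeed config with the check disabled if unreliable."""
    if delayed_data_check_is_reliable(job, datafeed):
        return dict(datafeed)
    return {**datafeed, "delayed_data_check_config": {"enabled": False}}


# Illustrative configs only; "event_count" and "responsetime" are made-up fields.
job_bad = {
    "analysis_config": {
        "bucket_span": "15m",
        "summary_count_field_name": "event_count",
        "detectors": [{"function": "mean", "field_name": "responsetime"}],
    }
}
job_good = {
    "analysis_config": {
        "bucket_span": "15m",
        "summary_count_field_name": "doc_count",
        "detectors": [{"function": "mean", "field_name": "responsetime"}],
    }
}
datafeed_aggregated = {
    "indices": ["my-index"],
    "aggregations": {
        "buckets": {"date_histogram": {"field": "@timestamp", "fixed_interval": "15m"}}
    },
}
datafeed_raw = {"indices": ["my-index"]}

fixed = with_safe_delayed_data_check(job_bad, datafeed_aggregated)
```

In this sketch, `job_bad` paired with either datafeed yields a config with the delayed data check disabled, while `job_good` paired with `datafeed_aggregated` is left unchanged. Disabling the check is only the fallback; where the mismatch is accidental, correcting `summary_count_field_name` is likely the better fix.
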
There is another tool for visualizing the delayed data on the *Annotations* tab
in the {anomaly-detect} job management page:
