[DOCS] Per-partition categorization (elastic#61506)
lcawl committed Aug 27, 2020
1 parent f29d408 commit 7ccf9f5
Showing 1 changed file with 23 additions and 0 deletions.
@@ -98,6 +98,29 @@ to disregard multiple sections of the categorization field value. In this
example, you might create a filter like `[ "\\[statement:.*\\]"]` to remove the
SQL statement from the categorization algorithm.

[discrete]
[[ml-per-partition-categorization]]
== Per-partition categorization

If you enable per-partition categorization, categories are determined
independently for each partition. For example, if your data includes messages
from multiple types of logs from different applications, you can use a field
like the ECS {ecs-ref}/ecs-event.html[`event.dataset` field] as the
`partition_field_name` and categorize the messages for each type of log
separately.

If your job has multiple detectors, every detector that uses the `mlcategory`
keyword must also define a `partition_field_name`. You must use the same
`partition_field_name` value in all of these detectors. Otherwise, the request
to create or update a job with per-partition categorization enabled fails.
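
The detector rule above can be checked on the client side before a job is
submitted. The following Python snippet is a minimal sketch, not the validation
that {es} itself performs: the `shared_partition_field` helper is hypothetical,
and it assumes that a detector "uses" `mlcategory` when the keyword appears as
its `by_field_name` or `over_field_name`.

```python
from typing import Any, Dict, List, Optional


def shared_partition_field(detectors: List[Dict[str, Any]]) -> Optional[str]:
    """Return the partition field shared by all mlcategory detectors.

    Hypothetical helper for illustration only. Raises ValueError when a
    detector that uses mlcategory has no partition_field_name, or when
    such detectors disagree on its value.
    """
    fields = set()
    for index, detector in enumerate(detectors):
        # Assumption: mlcategory is referenced through by/over field names.
        uses_mlcategory = "mlcategory" in (
            detector.get("by_field_name"),
            detector.get("over_field_name"),
        )
        if not uses_mlcategory:
            continue  # The rule only applies to mlcategory detectors.
        name = detector.get("partition_field_name")
        if name is None:
            raise ValueError(
                f"detector {index} uses mlcategory without partition_field_name"
            )
        fields.add(name)
    if len(fields) > 1:
        raise ValueError(
            f"mlcategory detectors use different partition fields: {sorted(fields)}"
        )
    # None means that no detector uses mlcategory at all.
    return next(iter(fields), None)
```

A detector list passes the check only when every `mlcategory` detector names
the same partition field. Detectors that do not use `mlcategory` are ignored.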

When per-partition categorization is enabled, you can also use the
`stop_on_warn` configuration option. If the categorization status for a
partition changes to `warn`, the data in that partition doesn't categorize well
and can cause unnecessary resource usage. When you set `stop_on_warn` to `true`,
the job stops analyzing these problematic partitions, which avoids an ongoing
performance cost for partitions that are unsuitable for categorization.
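
As an illustration, the following Python snippet builds a job configuration
that enables per-partition categorization with `stop_on_warn`. It is a sketch
under stated assumptions: it places both settings in a
`per_partition_categorization` object inside `analysis_config`, as in the
create {anomaly-jobs} API, and the `message` and `@timestamp` field names and
the `15m` bucket span are example values that do not come from this page.

```python
import json

# Example job body. Field names and bucket span are illustrative assumptions.
job_config = {
    "analysis_config": {
        "bucket_span": "15m",
        # Assumed name of the text field whose values are categorized.
        "categorization_field_name": "message",
        "per_partition_categorization": {
            "enabled": True,
            # Stop analyzing partitions whose categorization status is warn.
            "stop_on_warn": True,
        },
        "detectors": [
            {
                "function": "count",
                "by_field_name": "mlcategory",
                # Categories are determined separately for each dataset.
                "partition_field_name": "event.dataset",
            }
        ],
    },
    # Assumed name of the time field in the source data.
    "data_description": {"time_field": "@timestamp"},
}

# Serialize the body, for example to send it in a create job request.
print(json.dumps(job_config, indent=2))
```

Because the only detector that uses `mlcategory` also sets
`partition_field_name`, this configuration satisfies the detector rule
described earlier in this section.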

[discrete]
[[ml-configuring-analyzer]]
== Customizing the categorization analyzer