Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ML] Add option for per-partition categorization to categorization job wizard #75061

Merged
merged 46 commits into from
Aug 19, 2020

Conversation

qn895
Copy link
Member

@qn895 qn895 commented Aug 14, 2020

Summary

This is a fixed PR for the original one because I accidentally pulled in changes from Dima's PR.
#74592

Part of meta issue #73968 which adds the ability check for per-partition in categorization job creation wizard.

UI-related changes

  • Categorization job creation wizard

    • Navigating to next screen and back should persist the settings
    • Switching to a different type of detector after setting the per-partition field name should persist the partition_field_name in the detector config
    • Turning off per_partition should also set stop_on_warn to false
      2020-08-14 at 11 21 AM
  • Advanced job creation wizard (when a category field is selected)

Screen Shot 2020-08-14 at 11 24 03 AM

  • If the detectors with mlcategory keyword have different values for partition_field_name then give error feedback upon validation step

Screen Shot 2020-08-14 at 11 42 55 AM

  • If viewing the results of a per-partition categorization job in the Anomaly Explorer, and if stop_on_warn was enabled, add an indicator somewhere in the view if the categorization status goes to warn for a partition. This is needed to explain to the user that there may be fewer results than there could have been.

Screen Shot 2020-08-14 at 11 19 39 AM

Screen Shot 2020-08-14 at 11 20 00 AM

API-related

  • Added api/ml/anomaly_detectors/{jobId}/categorizer_stats for retrieving categorizer_stats documents
  • Added api/ml/anomaly_detectors/{jobId}/stopped_partitions to find out which partitions we stopped categorizing (searching for categorizer_stats documents in the job's results index)
1. First get the job config and check if analysis_config.per_partition_categorization.stop_on_warn is true
2. If so, search for categorizer_stats documents for the current job where the categorization_status is warn
3. Return all the partition_field_value values from the documents found
If analysis_config.per_partition_categorization.stop_on_warn is false then we won’t have stopped categorizing anything, so you can return an empty list

Checklist

qn895 added 29 commits August 6, 2020 11:34
…r-partition

# Please enter a commit message to explain why this merge is necessary,
# especially if it merges an updated upstream into a topic branch.
#
# Lines starting with '#' will be ignored, and an empty message aborts
# the commit.
…r-partition

# Please enter a commit message to explain why this merge is necessary,
# especially if it merges an updated upstream into a topic branch.
#
# Lines starting with '#' will be ignored, and an empty message aborts
# the commit.
- Refactor context -> legacyclient
- Update naming & formatted strings
- Update schema
- Update some functions to useCallback or useMemo
- Rename perPartitionStopOnWarn
…tition

# Conflicts:
#	x-pack/plugins/ml/common/util/job_utils.ts
@jgowdyelastic
Copy link
Member

in advanced wizard with a single detector, if you enable per-partition categorisation but do not add a partition field to the detector the validation fails.
image

I don't know whether this should be caught and displayed on the "Pick fields" step rather than the validation step.
We show a warning about duplicate detectors there:
image

Copy link
Contributor

@peteharverson peteharverson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Couple more comments related to the new endpoints.

x-pack/plugins/ml/server/routes/results_service.ts Outdated Show resolved Hide resolved
@qn895
Copy link
Member Author

qn895 commented Aug 19, 2020

in advanced wizard with a single detector, if you enable per-partition categorization but do not add a partition field to the detector the validation fails.

I have added validation for both when user adds a detector and for the validation screen in the latest change. The detectors now notify 3 types of error:

Screen Shot 2020-08-18 at 10 05 17 PM

Screen Shot 2020-08-18 at 10 04 52 PM

Screen Shot 2020-08-18 at 10 05 03 PM

Both checks should also now be in the final Job validation page:

Screen Shot 2020-08-18 at 10 23 11 PM

Screen Shot 2020-08-18 at 10 22 43 PM

@@ -273,6 +277,14 @@ export class JobValidator {
this._advancedValidations.categorizationFieldValid.valid = valid;
}

public get categorizerMissingPerPartition() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

these should be checked in isPickFieldsStepValid below to disable the Next button.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated in 38edcdd

@peteharverson peteharverson changed the title [ML] Add option to add per-partition categorization for cat jobs [ML] Add option for per-partition categorization to categorization job wizard Aug 19, 2020
Copy link
Member

@jgowdyelastic jgowdyelastic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

*
* @api {get} /api/ml/results/:jobId/categorizer_stats
* @apiName GetCategorizerStats
* @apiDescription Returns the categorizer snapshots for the specified job ID
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this say categorizer stats rather than snapshots ?

*
* @api {get} /api/ml/results/category_stopped_partitions
* @apiName GetCategoryStoppedPartitions
* @apiDescription Returns list of partitions we stopped categorizing whens status changed to warn
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typo with whens. Suggest changing this to something like

Returns information on the partitions that have stopped being categorized due to the categorization not working well. Can return either the list of stopped partitions for each job, or just the list of job IDs.

[ML] Update api descriptions to be clearer
Copy link
Contributor

@peteharverson peteharverson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tested latest edits and LGTM

@kibanamachine
Copy link
Contributor

💚 Build Succeeded

Build metrics

@kbn/optimizer bundle module count

id value diff baseline
ml 1373 +7 1366

async chunks size

id value diff baseline
ml 8.2MB +55.0KB 8.1MB

page load bundle size

id value diff baseline
ml 576.4KB +2.6KB 573.7KB

History

To update your PR or re-run it, just comment with:
@elasticmachine merge upstream

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants