1.9 backport #220

yizheliu-amazon · 2020-08-27T01:30:15Z

Issue #, if available:

Description of changes:
Backport changes from PR #178 (c062263, commit after ODFE 1.9 upgrade) to PR #214 (942a10a, commit before ODFE 1.10 upgrade)

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

* Add result indices retention period Currently we never delete the result index, even though customers have deleted the detector. An increasing amount of result indices can use significant disk space, as well as memory pressure due to the creation of rolled over indices. This PR adds retention period to anomaly results. We delete result indices when they are older than a retention period, which is 90 days by default. We use 90 days because that's the maximum days we allow users to view results on Kibana. Users can configure the retention period via the setting opendistro.anomaly_detection.ad_result_history_retention_period dynamically. Also, previously we roll over empty result indices. This PR fixes that by removing the max age condition of result indices. So we only roll over the result index when the maximum number of documents in the index is reached. Testing done: * manually tested result indices would be deleted after passing retention period.

…h#175) * Add dyanmic setting for memory threshold We have the following two memory threshold: * a detector's model size per node cannot exceed 10% of heap memory * the total model size per node cannot exceed 10% of heap memory. If users use the ES cluster mainly for anomaly detection, they should be able to increase the memory for AD.This PR makes the threshold dynamically adjustable via the setting opendistro.anomaly_detection.model_max_size_percent. Testing done: * Manually verified I can dynamically adjust the memory threshold setting.

…search#182) * Support security plugin enabled testing in CI This change is to support integration testing against remote security-enabled clustering in our CI. Currently, we only support the integration testing without a security plugin and have to remove the security plugin in CI. The critical change is to create an https client with basic auth in ODFERestTestCase and let our rest test case inherit from ODFERestTestCase. With default security configuration, the password for the admin user is also “admin.” For ssl certification, we choose to disable it since it’s the default demo cert used here. Testing done: * ran the scripts on CI/CD workflow on Mac and tested it works. * gradle build still passes.

* Create RFC for AD CLI This is the RFC for the Anomaly Detection CLI feature to be built on AD.

…earch#184)

…arch#164) * Adds initialization progress to profile API This PR adds init_progress to profile API. init_progress helps users track initialization percentage, needed shingles, and estimated time to go if the future data stream is continuous (no missing data). Initialization percentage measures how far away we can observe RCF models emitting scores. The implementation fetches the RCF model's total updates while the AD job is running and materializes the value to the newly added index .opendistro-anomaly-info. Total updates record the number of times this RCF model has been updated. Initialization percent is computed as x/128: * if total updates > 128, x = 128. Otherwise, x is the total updates * 128 is our output after the number in RCF. After observing 128 samples, RCF starts emitting scores. Needed shingles are computed as 128 -x. Estimated minutes to go is computed as needed shingles * detector interval This PR also materializes the error message in the most recent run to speed up profile API's error fetching. During each AD execution, we also check if a checkpoint is there (the result is saved and maintained as other AD states), if yes, we cold start immediately. Testing done: 1. adds unit tests 2. run e2e tests to verify init_progress number makes sense.

…a full shingle (opendistro-for-elasticsearch#176) Adds feature to query data from the index when insufficient data in buffer to form a full shingle. Query results are cached.

…rch#191)

ESAD Version 0.1 to interact with AD Plugin through cli.

* Workflow changes for AD release Update release workflow to be triggered only, when a new tag is pushed ( starts with v ) * Add badge for testing and code coverage Added flag: cli for cli project Added flag: plugin for java * Include threshold * Include flag in code covereage

…or-elasticsearch/ad-multinode Adding multinode integration test support

Change table from right aligned to left aligned.

Remove esad.

Instead of displaying headers with empty list, show message with hints.

For readability, move constants based on alphabetically.

…rch#203) Allows the model shingle size to be set per detector.

…pendistro-for-elasticsearch#210) * Change profile API to not return estimated minutes remaining until cold start is finished Currently, the progress bar on AD Kibana will show the estimated time remaining to initialize a detector. This can be confusing if this message is displayed before cold start is finished, where the actual initialization time may be much shorter if sufficient historical data exists. This PR changes the profile api to not return any estimated time left until the cold start is finished to prevent this. Testing done: 1. Manually verified the problem is fixed. 2. added unit test for the issue.

* Change to use callbacks in cold start This PR changes the code path of cold start in the transport layer to use callbacks. Previously, I created AD’s ExecutorService that has one thread for cold starts in ColdStartRunner. When we need to trigger a cold start, we can submit a task in the ExecutorService, and consult a hash map (keyed by detector Id) that cached the results of recent cold start results. Since I have to invoke the cold start thread in various callbacks, I created a cold start thread pool and put the cold start result in the transport state. This PR also handles new exceptions like invalid queries introduced by recent changes on ModelManager and FeatureManager. This PR lowers the severity of a couple of log messages in HashRing and RCFPollingTransportAction to avoid overwhelming readers of log files. These log messages are common. This PR corrects typos and updates known causes of EndRunException in comments. Testing done: 1. Simulated cold start failures: Exceptions of cold starts can be seen by the transport layer. EndRunException can cause AD jobs to be terminated. 2. Happy case of a cold start still works.

Every exported go method should have comments for readability and usage. Add appropriate commands and sample data is copied from https://opendistro.github.io/for-elasticsearch-docs/docs/ad/api

Added function and test to check get response and regenerated mocks as well

Used by GetDetectorController to deserialize response from gateway to type.

Mapper to convert gateway response to structure.

GetDetector will call gateway to get configuration GetDetectorByName will first call SearchDetector to get ID from Name pattern. Later, it will call GetDetector to get details.

Handler for cat command, this will get list of detector based on id and name pattern

Created command to accept list of detectors to concatenate and print detectors. This is similar to cat command in unix.

goimports fixes imports order and also formats like gofmt.

codecov · 2020-08-27T01:39:24Z

Codecov Report

❗ No coverage uploaded for pull request base (opendistro-1.9@4712ede). Click here to learn what that means.
The diff coverage is n/a.

@@                Coverage Diff                @@
##             opendistro-1.9     #220   +/-   ##
=================================================
  Coverage                  ?   72.36%           
  Complexity                ?     1288           
=================================================
  Files                     ?      139           
  Lines                     ?     6073           
  Branches                  ?      469           
=================================================
  Hits                      ?     4395           
  Misses                    ?     1465           
  Partials                  ?      213

Flag	Coverage Δ	Complexity Δ
#cli	`80.30% <0.00%> (?)`	`0.00% <0.00%> (?%)`
#plugin	`71.30% <0.00%> (?)`	`1288.00% <0.00%> (?%)`

Flags with carried forward coverage won't be shown. Click here to find out more.

kaituo · 2020-08-27T22:48:33Z

CLI workflow fails.

VijayanB · 2020-08-27T23:58:52Z

@yizheliu-amazon can you include this commit(9df7b59) as well to fix workflow failure?

yizheliu-amazon · 2020-08-28T00:41:41Z

@yizheliu-amazon can you include this commit(9df7b59) as well to fix workflow failure?

Thanks for pointing it out. Cherry-picked it.

yizheliu-amazon · 2020-08-28T01:05:12Z

CLI workflow fails.

After cherry-picking new changes, all workflows pass.

ylwu-amzn

LGTM. Thanks for the change.

yizheliu-amazon and others added 30 commits June 25, 2020 14:45

Add release drafter (opendistro-for-elasticsearch#178)

c062263

Add release notes for ODFE 1.9.0.0 (opendistro-for-elasticsearch#181)

94264eb

Create RFC for AD CLI (opendistro-for-elasticsearch#185)

4bf1818

* Create RFC for AD CLI This is the RFC for the Anomaly Detection CLI feature to be built on AD.

change to exhausive search for training data (opendistro-for-elastics…

cd3bdae

…earch#184)

Queries data from the index when insufficient data in buffer to form …

44a8bcd

…a full shingle (opendistro-for-elasticsearch#176) Adds feature to query data from the index when insufficient data in buffer to form a full shingle. Query results are cached.

Automate release notes to unified standard (opendistro-for-elasticsea…

68429a8

…rch#191)

Merge esad cli into anomaly-detection (opendistro-for-elasticsearch#196)

d6dd8ab

ESAD Version 0.1 to interact with AD Plugin through cli.

Add badges to AD (opendistro-for-elasticsearch#199)

d6235c2

Adding multinode integration test support

935409c

Merge pull request opendistro-for-elasticsearch#201 from opendistro-f…

199a1c2

…or-elasticsearch/ad-multinode Adding multinode integration test support

Change Profile List Format

7d8463f

Change table from right aligned to left aligned.

Deleted binary file

afa859b

Remove esad.

Display message on empty profile list

d36fe3e

Instead of displaying headers with empty list, show message with hints.

Refactor: Group Constants alphabetically

e440e8b

For readability, move constants based on alphabetically.

Allows window size to be set per detector. (opendistro-for-elasticsea…

f03f63f

…rch#203) Allows the model shingle size to be set per detector.

minor edits to the short and long text strings

981db87

fix job index mapping (opendistro-for-elasticsearch#212)

9a0166e

Include comments for gateway methods

14a1bde

Every exported go method should have comments for readability and usage. Add appropriate commands and sample data is copied from https://opendistro.github.io/for-elasticsearch-docs/docs/ad/api

Add GetDetector Gateway

698782a

Added function and test to check get response and regenerated mocks as well

Entity: Added entity to deserialize response

275317d

Used by GetDetectorController to deserialize response from gateway to type.

Mapper: function to map from json to structure

02fe5d8

Mapper to convert gateway response to structure.

Controller: Get Detector based on Id and Name

e93a9cf

GetDetector will call gateway to get configuration GetDetectorByName will first call SearchDetector to get ID from Name pattern. Later, it will call GetDetector to get details.

Handler: GetDetector handler to call controller

7a8c0f6

Handler for cat command, this will get list of detector based on id and name pattern

VijayanB and others added 2 commits August 20, 2020 12:04

Command: Cat command to concatenate and print detectors

4b52e24

Created command to accept list of detectors to concatenate and print detectors. This is similar to cat command in unix.

Use goimports (opendistro-for-elasticsearch#214)

942a10a

goimports fixes imports order and also formats like gofmt.

yizheliu-amazon requested review from kaituo, VijayanB, ylwu-amzn and wnbts August 27, 2020 01:30

Install go get outside cli directory (opendistro-for-elasticsearch#216)

087e910

ylwu-amzn approved these changes Aug 28, 2020

View reviewed changes

kaituo approved these changes Aug 28, 2020

View reviewed changes

VijayanB approved these changes Aug 28, 2020

View reviewed changes

yizheliu-amazon merged commit 0b3807c into opendistro-for-elasticsearch:opendistro-1.9 Aug 28, 2020

ohltyler added the infra label Oct 19, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

1.9 backport #220

1.9 backport #220

yizheliu-amazon commented Aug 27, 2020

codecov bot commented Aug 27, 2020 •

edited

Loading

kaituo commented Aug 27, 2020

VijayanB commented Aug 27, 2020

yizheliu-amazon commented Aug 28, 2020

yizheliu-amazon commented Aug 28, 2020

ylwu-amzn left a comment

1.9 backport #220

1.9 backport #220

Conversation

yizheliu-amazon commented Aug 27, 2020

codecov bot commented Aug 27, 2020 • edited Loading

Codecov Report

kaituo commented Aug 27, 2020

VijayanB commented Aug 27, 2020

yizheliu-amazon commented Aug 28, 2020

yizheliu-amazon commented Aug 28, 2020

ylwu-amzn left a comment

Choose a reason for hiding this comment

codecov bot commented Aug 27, 2020 •

edited

Loading