Skip to content
This repository has been archived by the owner on Aug 2, 2022. It is now read-only.

Add required config changes #264

Closed
wants to merge 1 commit into from

Conversation

kaituo
Copy link
Member

@kaituo kaituo commented Oct 15, 2020

Note: since there are a lot of dependencies, I only list the main class and test code to save reviewers' time. The build will fail due to missing dependencies. I will use that PR just for review. will not merge it. Will have a big one in the end and merge once after all review PRs get approved.

Issue #, if available:

Description of changes:

This PR adds config changes required for multi-entity detectors.

Testing done:

  1. Manually tested nothing breaks with these changes.

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

This PR adds config changes required for multi-entity detectors.

Testing done:
1. Manually tested nothing breaks with these changes.
@codecov
Copy link

codecov bot commented Oct 15, 2020

Codecov Report

Merging #264 into master will not change coverage.
The diff coverage is 93.54%.

Impacted file tree graph

@@            Coverage Diff            @@
##             master     #264   +/-   ##
=========================================
  Coverage     73.01%   73.01%           
  Complexity     1461     1461           
=========================================
  Files           164      164           
  Lines          6834     6834           
  Branches        527      527           
=========================================
  Hits           4990     4990           
  Misses         1594     1594           
  Partials        250      250           
Flag Coverage Δ Complexity Δ
#cli 79.27% <ø> (ø) 0.00 <ø> (ø)

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ Complexity Δ
...stroforelasticsearch/ad/AnomalyDetectorPlugin.java 93.54% <92.85%> (ø) 10.00 <3.00> (ø)
...ticsearch/ad/settings/AnomalyDetectorSettings.java 100.00% <100.00%> (ø) 1.00 <0.00> (ø)

memoryTracker
);

// EntityCache cache = new LRUCache(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove these commented code ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, removed

// cache related
public static final int DEDICATED_CACHE_SIZE = 10;

// We only keep priority (4 bytes float) in inactive cache. 1 million priorities
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So we store active entity's priority in inactive entity's cache ? If that's the case, why not name it as MAX_ENTITIES ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we don't do that. Active entities' states has both models and priority, while inactive entities' states has only priority.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it, misunderstanding, thought all priority of active&inactive entity will be stored in inactive cache.

public static int MAX_BULK_CHECKPOINT_SIZE = 1000;

// number of bulk checkpoints per second
public static double CHECKPOINT_BULK_PER_SECOND = 0.02;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can execute 1 bulk index checkpoint every 50 seconds?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, 1/60 is not a finite decimal representation. It is fine.

// take up 4 MB.
public static final int MAX_INACTIVE_ENTITIES = 1_000_000;

// TODO: check how much does 1 million insertion costs in memory
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we still have to do a performance test get the accurate number?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

seems roughly 1 MB per 1 million from Weicong's jmap data.

Copy link
Contributor

@saratvemulapalli saratvemulapalli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The changes look good to me.

Copy link
Contributor

@ylwu-amzn ylwu-amzn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks for the change

kaituo added a commit that referenced this pull request Oct 16, 2020
* Add support filtering the data by one categorical variable

This PR is a conglomerate of the following PRs.

#247
#249
#250
#252
#253
#256
#257
#258
#259
#260
#261
#262
#263
#264
#265
#266
#267
#268
#269

This spreadsheet contains the mappings from files to PR number: https://quip-amazon.com/DiHkAmz9oSLu/HC-PR

Testing done:
1. Add unit tests except four classes (excluded in build.gradle). Will add them in the later PR.
2. Manual testing passes.
@kaituo kaituo closed this Oct 16, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants