Add memory tracker #258

kaituo · 2020-10-14T23:40:11Z

Note: since there are a lot of dependencies, I list only the main class and test code to save reviewers' time. The build will fail due to missing dependencies. I will use this PR just for review and will not merge it. I will open one big PR at the end and merge it once all review PRs are approved.

Issue #, if available:

Description of changes:

Previously, when creating a model, we evaluated all existing models and compared their total size with the 10% heap memory limit. If the total was within the limit, we proceeded to create the model; otherwise, we threw an exception. This does not work for multi-entity detectors. First, there can be a lot of models in the cache, and reevaluating them every time we want to add a model is inefficient. Second, we now have two sources of memory usage: single-entity and multi-entity detectors. We need a central place to track memory usage across the board as we add more kinds of detectors. This PR adds that central tracker.
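To make the idea concrete for reviewers, here is a minimal sketch of a central tracker that keeps a running total instead of re-evaluating every cached model. It is not the `MemoryTracker` class in this PR: the class name `MemoryTrackerSketch` and the methods `tryReserve`, `release`, `reserved`, and `limit` are hypothetical, and the only detail taken from the description is the heap-fraction limit.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical sketch of a central memory tracker; not the PR's MemoryTracker. */
class MemoryTrackerSketch {
    // Budget shared by every kind of detector (single-entity and multi-entity).
    private final long limitBytes;
    // Running total, so adding a model never requires rescanning existing models.
    private final AtomicLong reservedBytes = new AtomicLong();

    MemoryTrackerSketch(long heapBytes, double maxHeapFraction) {
        this.limitBytes = (long) (heapBytes * maxHeapFraction);
    }

    /** Atomically reserves bytes if the budget allows it; returns false otherwise. */
    boolean tryReserve(long bytes) {
        while (true) {
            long current = reservedBytes.get();
            long updated = current + bytes;
            if (updated > limitBytes) {
                return false; // caller decides whether to throw or evict
            }
            if (reservedBytes.compareAndSet(current, updated)) {
                return true;
            }
        }
    }

    /** Returns bytes to the budget, for example when a model is removed. */
    void release(long bytes) {
        reservedBytes.addAndGet(-bytes);
    }

    long reserved() {
        return reservedBytes.get();
    }

    long limit() {
        return limitBytes;
    }
}
```

A caller could construct it as `new MemoryTrackerSketch(Runtime.getRuntime().maxMemory(), 0.1)` and call `tryReserve` with the estimated model size before creating a model. The compare-and-set loop keeps the check and the reservation atomic without taking a lock.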

This PR also updates RCF model size estimation. Previously, we underestimated the size.

This PR also adds threshold model size estimation. Previously, we didn't consider it.
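The description does not give the estimation formulas, so the following is only an illustrative sketch of how an RCF estimate and a threshold model estimate might be combined into one number. Every name here (`ModelSizeEstimateSketch`, `estimateRcfBytes`, `estimateTotalBytes`) and every per-item byte cost is a hypothetical, caller-supplied assumption, not a value from this PR.

```java
/** Hypothetical size estimator; all byte costs are assumptions passed in by the caller. */
class ModelSizeEstimateSketch {

    /**
     * Estimates forest size as (trees * samples per tree) * bytes per stored sample,
     * where a stored sample costs dimensions * bytesPerDimension plus a fixed overhead.
     */
    static long estimateRcfBytes(
        int numberOfTrees,
        int samplesPerTree,
        int dimensions,
        long bytesPerDimension,
        long overheadBytesPerSample
    ) {
        long samples = Math.multiplyExact((long) numberOfTrees, (long) samplesPerTree);
        long bytesPerSample = Math.addExact(
            Math.multiplyExact((long) dimensions, bytesPerDimension),
            overheadBytesPerSample
        );
        return Math.multiplyExact(samples, bytesPerSample);
    }

    /** Adds the threshold model's size so it is no longer left out of the total. */
    static long estimateTotalBytes(long rcfBytes, long thresholdModelBytes) {
        return Math.addExact(rcfBytes, thresholdModelBytes);
    }
}
```

The `Math.*Exact` calls throw `ArithmeticException` on overflow rather than silently wrapping, which avoids reporting a small size for a very large model.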

This PR also adds a customized hashmap that automatically consumes and releases memory. This keeps the change to our single-entity code minimal, since we only have to replace the map implementation.
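As a rough illustration of a map that accounts for memory as entries come and go, here is a minimal sketch. It is not the `RCFMemoryAwareConcurrentHashmap` in this PR: the class name `MemoryAwareMapSketch`, the shared `AtomicLong` counter standing in for the tracker, and the caller-supplied size function are all assumptions made for the example.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.ToLongFunction;

/** Hypothetical sketch; not the PR's RCFMemoryAwareConcurrentHashmap. */
class MemoryAwareMapSketch<K, V> extends ConcurrentHashMap<K, V> {
    private final AtomicLong usedBytes;     // shared counter, stands in for a tracker
    private final ToLongFunction<V> sizeOf; // caller-supplied size estimate per value

    MemoryAwareMapSketch(AtomicLong usedBytes, ToLongFunction<V> sizeOf) {
        this.usedBytes = usedBytes;
        this.sizeOf = sizeOf;
    }

    /** Consumes memory for the new value and releases memory for any replaced value. */
    @Override
    public V put(K key, V value) {
        V previous = super.put(key, value);
        usedBytes.addAndGet(sizeOf.applyAsLong(value));
        if (previous != null) {
            usedBytes.addAndGet(-sizeOf.applyAsLong(previous));
        }
        return previous;
    }

    /** Releases memory for the removed value, if there was one. */
    @Override
    public V remove(Object key) {
        V removed = super.remove(key);
        if (removed != null) {
            usedBytes.addAndGet(-sizeOf.applyAsLong(removed));
        }
        return removed;
    }

    // Only put and remove are tracked in this sketch; other mutators such as
    // putIfAbsent, compute, and clear would bypass the accounting.
}
```

Because the sketch is still a `ConcurrentHashMap`, existing call sites that use `put` and `remove` would not need to change, which is the property the description relies on.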

Testing done:

Added unit tests.
End-to-end testing passes.

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

codecov · 2020-10-14T23:41:42Z

Codecov Report

Merging #258 into master will decrease coverage by 0.20%.
The diff coverage is 34.21%.

@@             Coverage Diff              @@
##             master     #258      +/-   ##
============================================
- Coverage     73.01%   72.81%   -0.21%     
- Complexity     1461     1464       +3     
============================================
  Files           164      164              
  Lines          6834     6867      +33     
  Branches        527      533       +6     
============================================
+ Hits           4990     5000      +10     
- Misses         1594     1615      +21     
- Partials        250      252       +2

| Flag | Coverage Δ | Complexity Δ |
|---|---|---|
| #cli | `79.27% <ø> (ø)` | `0.00 <ø> (ø)` |

| Impacted Files | Coverage Δ | Complexity Δ | |
|---|---|---|---|
| ...est/handler/IndexAnomalyDetectorActionHandler.java | `51.17% <0.00%> (-0.25%)` | `26.00 <0.00> (ø)` | |
| .../handler/IndexAnomalyDetectorJobActionHandler.java | `11.44% <0.00%> (-0.22%)` | `4.00 <0.00> (ø)` | |
| ...stroforelasticsearch/ad/model/AnomalyDetector.java | `62.06% <35.71%> (-1.96%)` | `52.00 <0.00> (+1.00)` | ⬇️ |
| ...oforelasticsearch/ad/model/AnomalyDetectorJob.java | `58.97% <42.85%> (-2.20%)` | `24.00 <1.00> (+2.00)` | ⬇️ |
| ...oforelasticsearch/ad/AnomalyDetectorJobRunner.java | `76.59% <100.00%> (+0.12%)` | `35.00 <0.00> (ø)` | |
| ...ransport/SearchAnomalyDetectorTransportAction.java | `77.77% <0.00%> (-22.23%)` | `2.00% <0.00%> (ø%)` | |

Previously, when creating a model, we evaluated all existing models and compared their total size with the 10% heap memory limit. If the total was within the limit, we proceeded to create the model; otherwise, we threw an exception. This does not work for multi-entity detectors. First, there can be a lot of models in the cache, and reevaluating them every time we want to add a model is inefficient. Second, we now have two sources of memory usage: single-entity and multi-entity detectors. We need a central place to track memory usage across the board as we add more kinds of detectors. This PR adds that central tracker. This PR also updates RCF model size estimation. Previously, we underestimated the size. This PR also adds threshold model size estimation. Previously, we didn't consider it. This PR also adds a customized hashmap that automatically consumes and releases memory. This keeps the change to our single-entity code minimal, since we only have to replace the map implementation. Testing done: 1. Will add unit tests. 2. End-to-end testing passes.

src/main/java/com/amazon/opendistroforelasticsearch/ad/ml/RCFMemoryAwareConcurrentHashmap.java

src/main/java/com/amazon/opendistroforelasticsearch/ad/MemoryTracker.java

* Add support for filtering the data by one categorical variable. This PR is a conglomerate of the following PRs: #247 #249 #250 #252 #253 #256 #257 #258 #259 #260 #261 #262 #263 #264 #265 #266 #267 #268 #269. This spreadsheet contains the mappings from files to PR numbers: https://quip-amazon.com/DiHkAmz9oSLu/HC-PR Testing done: 1. Added unit tests for all but four classes (excluded in build.gradle); tests for those will be added in a later PR. 2. Manual testing passes.

kaituo requested review from ohltyler and weicongs-amazon October 14, 2020 23:40

kaituo force-pushed the memoryTracker branch from c2a831f to ea3163c Compare October 15, 2020 02:50