This repository has been archived by the owner on Aug 2, 2022. It is now read-only.
-
Notifications
You must be signed in to change notification settings - Fork 36
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Codecov Report
@@ Coverage Diff @@
## master #258 +/- ##
============================================
- Coverage 73.01% 72.81% -0.21%
- Complexity 1461 1464 +3
============================================
Files 164 164
Lines 6834 6867 +33
Branches 527 533 +6
============================================
+ Hits 4990 5000 +10
- Misses 1594 1615 +21
- Partials 250 252 +2
Flags with carried forward coverage won't be shown. Click here to find out more.
|
Previously, when creating a model, we evaluate all existing models and compare the total with the 10% heap memory limit. If yes, we proceed to create the model. Otherwise, we throw exceptions. This does not work for multi-entity detectors. First, there can be a lot of models in cache. Reevaluating them every time we want to add a model is not efficient. Second, we have two sources of memory usage now: single-entity and multi-entity detectors. We need a central place to track memory usage across the board as we add more and more kinds of detectors. This PR achieves the purpose. This PR also updates RCF model size estimation. Previously, we underestimated the size. This PR also adds threshold model size estimation. Previously, we didn't consider it. This PR also adds a customized hashmap that can automatically consume and realese memory. This enables minimum change to our single-entity code as we just have to replace the map implementation. Testing done: 1. will add unit tests. 2. end-to-end testing pass.
ohltyler
reviewed
Oct 15, 2020
src/main/java/com/amazon/opendistroforelasticsearch/ad/ml/RCFMemoryAwareConcurrentHashmap.java
Outdated
Show resolved
Hide resolved
ohltyler
reviewed
Oct 15, 2020
src/main/java/com/amazon/opendistroforelasticsearch/ad/MemoryTracker.java
Show resolved
Hide resolved
ohltyler
reviewed
Oct 15, 2020
src/main/java/com/amazon/opendistroforelasticsearch/ad/MemoryTracker.java
Outdated
Show resolved
Hide resolved
ohltyler
reviewed
Oct 15, 2020
src/main/java/com/amazon/opendistroforelasticsearch/ad/ml/RCFMemoryAwareConcurrentHashmap.java
Outdated
Show resolved
Hide resolved
ohltyler
reviewed
Oct 15, 2020
src/main/java/com/amazon/opendistroforelasticsearch/ad/ml/RCFMemoryAwareConcurrentHashmap.java
Show resolved
Hide resolved
src/main/java/com/amazon/opendistroforelasticsearch/ad/MemoryTracker.java
Show resolved
Hide resolved
src/main/java/com/amazon/opendistroforelasticsearch/ad/MemoryTracker.java
Show resolved
Hide resolved
src/main/java/com/amazon/opendistroforelasticsearch/ad/MemoryTracker.java
Outdated
Show resolved
Hide resolved
ohltyler
approved these changes
Oct 15, 2020
weicongs-amazon
approved these changes
Oct 15, 2020
kaituo
added a commit
to kaituo/anomaly-detection
that referenced
this pull request
Oct 16, 2020
This PR is a conglomerate of the following PRs. opendistro-for-elasticsearch#247 opendistro-for-elasticsearch#249 opendistro-for-elasticsearch#250 opendistro-for-elasticsearch#252 opendistro-for-elasticsearch#253 opendistro-for-elasticsearch#256 opendistro-for-elasticsearch#257 opendistro-for-elasticsearch#258 opendistro-for-elasticsearch#259 opendistro-for-elasticsearch#260 opendistro-for-elasticsearch#261 opendistro-for-elasticsearch#262 opendistro-for-elasticsearch#263 opendistro-for-elasticsearch#264 opendistro-for-elasticsearch#265 opendistro-for-elasticsearch#266 opendistro-for-elasticsearch#267 opendistro-for-elasticsearch#268 opendistro-for-elasticsearch#269 This spreadsheet contains the mappings from files to PR number: https://quip-amazon.com/DiHkAmz9oSLu/HC-PR Testing done: 1. Add unit tests except four classes (excluded in build.gradle). Will add them in the later PR. 2. Manual testing passes.
kaituo
added a commit
that referenced
this pull request
Oct 16, 2020
* Add support filtering the data by one categorical variable This PR is a conglomerate of the following PRs. #247 #249 #250 #252 #253 #256 #257 #258 #259 #260 #261 #262 #263 #264 #265 #266 #267 #268 #269 This spreadsheet contains the mappings from files to PR number: https://quip-amazon.com/DiHkAmz9oSLu/HC-PR Testing done: 1. Add unit tests except four classes (excluded in build.gradle). Will add them in the later PR. 2. Manual testing passes.
Sign up for free
to subscribe to this conversation on GitHub.
Already have an account?
Sign in.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Note: since there are a lot of dependencies, I only list the main class and test code to save reviewers' time. The build will fail due to missing dependencies. I will use that PR just for review. will not merge it. Will have a big one in the end and merge once after all review PRs get approved.
Issue #, if available:
Description of changes:
Previously, when creating a model, we evaluate all existing models and compare the total with the 10% heap memory limit. If yes, we proceed to create the model. Otherwise, we throw exceptions. This does not work for multi-entity detectors. First, there can be a lot of models in cache. Reevaluating them every time we want to add a model is not efficient. Second, we have two sources of memory usage now: single-entity and multi-entity detectors. We need a central place to track memory usage across the board as we add more and more kinds of detectors. This PR achieves the purpose.
This PR also updates RCF model size estimation. Previously, we underestimated the size.
This PR also adds threshold model size estimation. Previously, we didn't consider it.
This PR also adds a customized hashmap that can automatically consume and release memory. This enables minimum change to our single-entity code as we just have to replace the map implementation.
Testing done:
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.