Choose the number of primary shards while creating indices #252

kaituo · 2020-10-14T01:27:20Z

Note: Because this change has many dependencies, I list only the main class and test code to save reviewers' time. The build will fail because of the missing dependencies. This PR is for review only and will not be merged. I will open one large PR at the end and merge it once all the review PRs are approved.

Issue #, if available:

Description of changes:
AD is bottlenecked by the number of primary shards of the job, result, and checkpoint indices in HC. The number of primary shards in the job index determines how many nodes can run as AD's coordinating nodes. The number of primary shards in the result and checkpoint indices determines the extent of indexing pressure under the same indexing workload.

Previously, we used the default setting: 1 in ODFE and 5 in AES. This PR uses the number of hot nodes as the number of primary shards for the checkpoint, result, and job indices, with an upper limit of 10.
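The rule described above can be sketched roughly as follows. This is a minimal, dependency-free illustration and not the code in `AnomalyDetectionIndices.java`: the class name `PrimaryShardChooser`, both method names, and the lower bound of 1 are assumptions made for the example. Only the upper limit of 10 and the use of the hot-node count come from the description; `index.number_of_shards` is the standard Elasticsearch index setting for the primary shard count.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration only; not part of the plugin.
final class PrimaryShardChooser {
    // Upper limit stated in the PR description.
    static final int MAX_PRIMARY_SHARDS = 10;

    private PrimaryShardChooser() {}

    // Returns the hot-node count capped at MAX_PRIMARY_SHARDS.
    // The lower bound of 1 is an assumption: an index needs at least
    // one primary shard even if no hot node is reported.
    static int choosePrimaryShards(int hotNodeCount) {
        return Math.max(1, Math.min(hotNodeCount, MAX_PRIMARY_SHARDS));
    }

    // Builds a plain settings map that could be passed along when
    // creating the checkpoint, result, or job index.
    static Map<String, Object> indexSettings(int hotNodeCount) {
        Map<String, Object> settings = new LinkedHashMap<>();
        settings.put("index.number_of_shards", choosePrimaryShards(hotNodeCount));
        return settings;
    }
}
```

Under these assumptions, a cluster with 3 hot nodes would get 3 primary shards per index, and a cluster with 25 hot nodes would be capped at 10.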

Testing done:

1. Added unit tests.
2. End-to-end testing.

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

codecov · 2020-10-14T01:28:53Z

Codecov Report

Merging #252 into master will not change coverage.
The diff coverage is 84.61%.

@@            Coverage Diff            @@
##             master     #252   +/-   ##
=========================================
  Coverage     73.01%   73.01%           
  Complexity     1461     1461           
=========================================
  Files           164      164           
  Lines          6834     6834           
  Branches        527      527           
=========================================
  Hits           4990     4990           
  Misses         1594     1594           
  Partials        250      250

| Flag | Coverage Δ | Complexity Δ |
|---|---|---|
| #cli | `79.27% <ø> (ø)` | `0.00 <ø> (ø)` |

| Impacted Files | Coverage Δ | Complexity Δ |
|---|---|---|
| ...sticsearch/ad/indices/AnomalyDetectionIndices.java | `61.87% <84.21%> (ø)` | `23.00 <6.00> (ø)` |
| ...orelasticsearch/ad/util/DiscoveryNodeFilterer.java | `100.00% <100.00%> (ø)` | `6.00 <1.00> (ø)` |

src/main/java/com/amazon/opendistroforelasticsearch/ad/indices/AnomalyDetectionIndices.java

src/main/resources/mappings/checkpoint.json

* Add support filtering the data by one categorical variable This PR is a conglomerate of the following PRs. #247 #249 #250 #252 #253 #256 #257 #258 #259 #260 #261 #262 #263 #264 #265 #266 #267 #268 #269 This spreadsheet contains the mappings from files to PR number: https://quip-amazon.com/DiHkAmz9oSLu/HC-PR Testing done: 1. Add unit tests except four classes (excluded in build.gradle). Will add them in the later PR. 2. Manual testing passes.

kaituo requested review from weicongs-amazon and ohltyler October 14, 2020 01:27

ohltyler reviewed Oct 14, 2020

src/main/java/com/amazon/opendistroforelasticsearch/ad/indices/AnomalyDetectionIndices.java

weicongs-amazon reviewed Oct 14, 2020

weicongs-amazon approved these changes Oct 14, 2020

src/main/resources/mappings/checkpoint.json

ohltyler approved these changes Oct 14, 2020

Add supporting class

cf087e2

kaituo mentioned this pull request Oct 16, 2020

Add support filtering the data by one categorical variable #270

Merged

kaituo closed this Oct 16, 2020
