Add pagination to HCAD feature query #76

kaituo · 2021-05-26T21:41:26Z

Note: because this change has many dependencies, I list only the main class and test code to save reviewers' time. The build will fail due to the missing dependencies. I will use this PR only for review and will not merge it. Once all review PRs are approved, I will open one large PR and merge that. The code is currently missing unit tests. I will add unit tests, run performance tests, and fix bugs before the official release.

Description

HCAD regularly issues a query to fetch feature data for each entity. HCAD v1 limits the number of entity values returned per query to 1000, both for efficiency (fetching all entity values is expensive and may not finish before the next scheduled query run starts) and for ease of load shedding (the top-1000 limit, in turn, curtails the memory used to host models, the disk access needed to read/write model checkpoints and anomaly results, the CPU used for entity metadata maintenance and model training/inference, and the garbage collection for deleted models and metadata). Users can increase the maximum entity setting at the cost of linear growth in resource usage, which opens the door to cluster instability. However, such a heuristic “best-effort” limit is detrimental to the scalability of entity monitoring. Even if a user scales out, the number of monitored entities does not increase.

This PR switches to pagination for fetching entities. If there are more than 1000 entities, the remainder is fetched in subsequent pages. Pagination is implemented with a composite aggregation query. Results are split into pages of up to 1000 entities. Each page holds the aggregated values for each entity and is sent to model nodes according to the hash ring mapping from entity model ID to data node.
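To illustrate the after-key paging described above, here is a minimal in-memory sketch. It is not the PR's `CompositeRetriever`: the class `AfterKeyPagingSketch`, its nested `Page`, and the methods `fetchPage` and `fetchAll` are hypothetical names, and a sorted `TreeMap` stands in for the search index. It only assumes that each request returns up to a page of entities sorted by key, plus the last key to resume after.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** In-memory stand-in for after-key pagination (hypothetical, not the PR's code). */
final class AfterKeyPagingSketch {

    /** One page: per-entity aggregated values plus the key to resume after (null when done). */
    static final class Page {
        final Map<String, Double> results;
        final String afterKey;

        Page(Map<String, Double> results, String afterKey) {
            this.results = results;
            this.afterKey = afterKey;
        }
    }

    /** Returns up to pageSize entries whose key sorts strictly after afterKey (null = start). */
    static Page fetchPage(TreeMap<String, Double> source, String afterKey, int pageSize) {
        Map<String, Double> remaining = afterKey == null ? source : source.tailMap(afterKey, false);
        Map<String, Double> results = new TreeMap<>();
        String lastKey = null;
        for (Map.Entry<String, Double> e : remaining.entrySet()) {
            if (results.size() == pageSize) {
                break;
            }
            results.put(e.getKey(), e.getValue());
            lastKey = e.getKey();
        }
        // A full page may have more data behind it; a short page means the source is exhausted.
        String next = results.size() == pageSize ? lastKey : null;
        return new Page(results, next);
    }

    /** Follows after keys until no further page is available and collects the non-empty pages. */
    static List<Page> fetchAll(TreeMap<String, Double> source, int pageSize) {
        List<Page> pages = new ArrayList<>();
        String afterKey = null;
        do {
            Page page = fetchPage(source, afterKey, pageSize);
            if (!page.results.isEmpty()) {
                pages.add(page);
            }
            afterKey = page.afterKey;
        } while (afterKey != null);
        return pages;
    }

    public static void main(String[] args) {
        TreeMap<String, Double> source = new TreeMap<>();
        for (int i = 0; i < 2500; i++) {
            source.put(String.format("entity-%04d", i), (double) i);
        }
        for (Page page : fetchAll(source, 1000)) {
            System.out.println("page with " + page.results.size() + " entities");
        }
    }
}
```

In this sketch a full page always triggers one more request, so a source with exactly a multiple of the page size costs one extra (empty) fetch. That is a simplification chosen for the example and says nothing about how the PR decides when to stop.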

Testing done:

Manually tested that pagination works.

Check List

[ Y ] Commits are signed per the DCO using --signoff

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

penghuo · 2021-05-27T19:29:56Z

src/main/java/org/opensearch/ad/feature/CompositeRetriever.java

+
+    public class Page {
+        // a map from categorical field name to values (type: java.lang.Comparable)
+        Map<String, Object> afterKey;


Does the listener need to know the afterKey and source? IMO, the listener should only consume the results.

kaituo · 2021-05-27

No, the listener does not need to know the afterKey and source.

penghuo · 2021-05-27T19:31:56Z

src/main/java/org/opensearch/ad/feature/CompositeRetriever.java

+                .aggregation(composite)
+                .trackTotalHits(false);
+
+            Page page = new Page(null, searchSourceBuilder, null);


Would Iterable fit this use case? e.g. class IterablePage extends Iterable

kaituo · 2021-05-27

Do you mean https://docs.oracle.com/javase/8/docs/api/java/lang/Iterable.html ? I need parameters in the next method, so it does not seem to fit.

kaituo · 2021-05-28

I separated page and pageIterator. Please check the new commit. I also cannot implement the JDK interface, because the signature of next() does not match what I need: I have to return the result through an async listener callback, whereas Iterator.next() in the JDK is synchronous.
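A rough sketch of the mismatch described here, under stated assumptions: `AsyncPageIteratorSketch`, `Listener`, `PageIterator`, and `collectAll` are hypothetical names, the listener interface is only loosely modeled on an async callback, and an in-memory list stands in for search results. It shows a `next` method that takes a listener and returns nothing, which is why `java.util.Iterator` does not fit.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.concurrent.CompletableFuture;

/** Hypothetical async page iterator; not the PR's implementation. */
final class AsyncPageIteratorSketch {

    /** Hypothetical callback interface for delivering a result asynchronously. */
    interface Listener<T> {
        void onResponse(T value);

        void onFailure(Exception e);
    }

    static final class PageIterator {
        private final List<String> entities;
        private final int pageSize;
        private int offset = 0;

        PageIterator(List<String> entities, int pageSize) {
            this.entities = entities;
            this.pageSize = pageSize;
        }

        boolean hasNext() {
            return offset < entities.size();
        }

        /** Unlike Iterator.next(), this returns nothing; the page arrives through the listener. */
        void next(Listener<List<String>> listener) {
            if (!hasNext()) {
                listener.onFailure(new NoSuchElementException("no more pages"));
                return;
            }
            int end = Math.min(offset + pageSize, entities.size());
            List<String> page = new ArrayList<>(entities.subList(offset, end));
            offset = end;
            // Deliver on another thread to mimic an asynchronous search response.
            CompletableFuture.runAsync(() -> listener.onResponse(page));
        }
    }

    /** Requests pages one at a time; each response triggers the request for the next page. */
    static CompletableFuture<List<List<String>>> collectAll(PageIterator it) {
        CompletableFuture<List<List<String>>> done = new CompletableFuture<>();
        fetchNext(it, new ArrayList<>(), done);
        return done;
    }

    private static void fetchNext(PageIterator it, List<List<String>> pages,
            CompletableFuture<List<List<String>>> done) {
        if (!it.hasNext()) {
            done.complete(pages);
            return;
        }
        it.next(new Listener<List<String>>() {
            @Override
            public void onResponse(List<String> page) {
                pages.add(page);
                fetchNext(it, pages, done);
            }

            @Override
            public void onFailure(Exception e) {
                done.completeExceptionally(e);
            }
        });
    }

    public static void main(String[] args) {
        PageIterator it = new PageIterator(Arrays.asList("a", "b", "c", "d", "e"), 2);
        System.out.println(collectAll(it).join());
    }
}
```

The sketch requests pages strictly one after another, so the iterator state is never touched concurrently. A real retriever would also need to handle exceptions thrown inside the callback, which this example leaves out.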

penghuo · 2021-06-01T18:37:27Z

src/main/java/org/opensearch/ad/feature/CompositeRetriever.java

+                return null;
+            }
+            Aggregation agg = response.getAggregations().get(AGG_NAME_COMP);
+            if (agg == null) {


Consider using Optional to force the caller to handle the null case?

kaituo · 2021-06-01

Yeah, changed to Optional.
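For readers unfamiliar with the suggestion, a small sketch of the idea, assuming a plain map in place of the search response: `OptionalAggregationSketch`, `bucketCount`, and `describe` are hypothetical names and do not come from the PR. Returning `Optional` instead of `null` makes the missing-aggregation case part of the method signature, so the caller has to decide what to do with it.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

/** Hypothetical example of returning Optional instead of null. */
final class OptionalAggregationSketch {

    /** The map is a stand-in for a response: aggregation name to bucket count. */
    static Optional<Integer> bucketCount(Map<String, Integer> aggregations, String aggName) {
        if (aggregations == null) {
            return Optional.empty();
        }
        // ofNullable turns a missing entry into an empty Optional rather than a null return.
        return Optional.ofNullable(aggregations.get(aggName));
    }

    /** The caller must unwrap the Optional, so the empty case cannot be silently ignored. */
    static String describe(Map<String, Integer> aggregations, String aggName) {
        return bucketCount(aggregations, aggName)
            .map(count -> count + " buckets")
            .orElse("no aggregation");
    }

    public static void main(String[] args) {
        Map<String, Integer> aggregations = new HashMap<>();
        aggregations.put("comp_agg", 7);
        System.out.println(describe(aggregations, "comp_agg"));
        System.out.println(describe(aggregations, "missing"));
    }
}
```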

penghuo

Thanks for the change!

This PR is a conglomerate of the following PRs. #60 #64 #65 #67 #68 #69 #70 #71 #74 #75 #76 #77 #78 #79 #82 #83 #84 #92 #94 #93 #95 kaituo#1 kaituo#2 kaituo#3 kaituo#4 kaituo#5 kaituo#6 kaituo#7 kaituo#8 kaituo#9 kaituo#10 This spreadsheet contains the mappings from files to PR number (bug fix in my AD fork and tests are not included): https://gist.github.com/kaituo/9e1592c4ac4f2f449356cb93d0591167

kaituo requested review from ylwu-amzn, dai-chen and penghuo May 26, 2021 21:44

penghuo reviewed May 27, 2021


dai-chen reviewed May 27, 2021


separate page and pageIterator

4e676f8

penghuo reviewed Jun 1, 2021

kaituo added 2 commits June 2, 2021 14:39

Change to use optional

79505d8

Refactor to move common code of SearchFeatureDao and CompositeRetriever to a parent class

9e92d33

penghuo approved these changes Jun 11, 2021

dai-chen approved these changes Jun 11, 2021

kaituo closed this Jun 11, 2021

kaituo mentioned this pull request Jul 6, 2021

multi-category support, rate limiting, and pagination #121

Merged
