Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support multiple category fields #66

Merged
merged 35 commits into from
Aug 20, 2021

Conversation

ohltyler
Copy link
Member

@ohltyler ohltyler commented Jul 29, 2021

Description

Adds support for detectors to have multiple category fields. This is already supported in the backend plugin - see here. Supports both real-time & historical detectors.

Main changes include:

  • updates CategoryField component to support multiple selections in the combo box (up to 2)
  • updates various chart components to show a list of selected category fields:
    1 - In the 'View by' title at the top of the anomalies chart
    2 - On y-axis (each field delimited by a newline)
    3 - When hovering over a heatmap cell. This involved adding a customdata field in the plot data in order to show another
    line of text (see here for more details)
    4- In chart titles
    5 - In anomaly table
  • updates AdditionalSettings in detector config page to support showing multiple category fields
  • corresponding unit tests

Note that the bulk of these changes are related to changing the key of a single category field value that was used before, to making it more generalizable and supporting a list of category fields as a key. This way, we can easily support 3,4,5, etc. category fields without making code changes.

Some changes had to be made to the queries to fetch anomaly results for high-cardinality detectors. Similar to what's mentioned in the paragraph above, the queries were usually contingent on using a single category field as a key for fetching data, which will fetch incorrect results in the multi-category-field scenario. To handle this, several queries had to change:

  1. getTopAnomalousEntitiesQuery(link): to support BWC, we still will fetch single-category-field results the same way as before, and perform a terms aggregation on a single category field. To support multi-category-field results, we use a new query that performs a terms aggregation on a new model_id field, such that each unique set of category field values will be mapped to a single model id. This was introduced in the backend plugin as part of support for multi-category detectors.
  2. getAnomalySummaryQuery (link), getBucketizedAnomalyResultsQuery (link), getEntityAnomalySummariesQuery (link): similar to getTopAnomalousEntitiesQuery, for these functions, the filtering has been factored out into a helper function getEntityFilters which is used to determine if the results should be filtered out by a category field (if only one category field), or model id (if multiple category fields)
  3. getAnomalyResults (link), which is used for fetching the raw results when a heatmap cell is selected, now uses an optionally-passed-in entity list to dynamically generate filters for the category fields before making the call to OpenSearch. Note that model_id is not used here, since we don't store/know the individual model ID when making the call. Also, with future plans to support results using a subset of category fields, this implementation leaves it more flexible for future changes.

Screenshots:

Category field component, with wording changes indicating up to 2 can now be selected:
Screen Shot 2021-08-11 at 3 47 08 PM

Entity list when hovering over heatmap cell:
Screen Shot 2021-07-29 at 10 08 23 AM

Additional settings component:
Screen Shot 2021-07-29 at 10 09 17 AM

Full results page (changes highlighted in red):
screencapture-localhost-5601-app-anomaly-detection-dashboards-2021-07-29-10_08_38

Check List

  • Commits are signed per the DCO using --signoff

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

ohltyler added 28 commits July 28, 2021 08:55
…tmap; Fix bug of entity details wrong when filtering by indiv entity
@ohltyler ohltyler marked this pull request as draft July 29, 2021 17:55
@ohltyler ohltyler marked this pull request as ready for review July 29, 2021 23:14
@ohltyler ohltyler requested review from ylwu-amzn and kaituo July 29, 2021 23:14
@ohltyler ohltyler force-pushed the multi-category-rebased branch from 5f284f0 to de4d6a9 Compare August 12, 2021 23:13
@kaituo
Copy link
Collaborator

kaituo commented Aug 13, 2021

All of the changes look good to me. Wait for you to resolve the PR build failure or add any new commit.

@ohltyler
Copy link
Member Author

All of the changes look good to me. Wait for you to resolve the PR build failure or add any new commit.

Seems identical issue with the CI is happening in main now - likely due to the version bumps in AD backend plugin, and not related to the changes in this PR. Will work on a separate fix for that.

@ohltyler ohltyler added the v1.1.0 Version 1.1.0 label Aug 18, 2021
@ohltyler
Copy link
Member Author

@kaituo I've re-ran again and checks are passing now. No changes were made that I know of; could be flaky tests due to low resources in GitHub runners or changes in dependent packages.

@@ -215,9 +229,25 @@ export const AnomalyHeatmapChart = React.memo(
return false;
};

// Custom hook to refresh all of the heatmap data when running a historical task
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will it cause heavy load if query too much? Maybe we can increase interval between refresh query for HC?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's a fair point. I'm ok to increase the timeout - how about 30s?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good, maybe confirm with UX design to check if this is reasonable and should we add some tooltip

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, confirming with UX.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Confirmed 30s is ok - no extra notices/wording is necessary.

@@ -80,10 +83,19 @@ export const FeatureBreakDown = React.memo((props: FeatureBreakDownProps) => {
const filteredFeatureData = [];
for (let i = 0; i < anomaliesFound.length; i++) {
const currentAnomalyData = anomaliesResult.anomalies[i];
const dataEntityList = get(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the difference between dataEntityList and cellEntityList?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

dataEntityList is the associated entity list with the current anomaly data in the for loop. cellEntityList is the entity list pulled from the selected heatmap cell. Note that no logic changes were made here. Instead, entityList used to be get(currentAnomalyData, 'entity', [] as EntityData[])[0].value and cellEntityList used to be props.selectedHeatmapCell.entityValue.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cellEntityList is the entity list pulled from the selected heatmap cell

Does that mean user can choose multiple heatmap cells?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, only one can be selected at a time. But the cell itself can contain a list of entities if it is a multi-category detector

@kaituo
Copy link
Collaborator

kaituo commented Aug 19, 2021

@kaituo I've re-ran again and checks are passing now. No changes were made that I know of; could be flaky tests due to low resources in GitHub runners or changes in dependent packages.

@ohltyler Sounds good. I saw you will make some changes for Yaliang's comments. Will wait for the new commit.

@ohltyler ohltyler merged commit 1b19059 into opensearch-project:main Aug 20, 2021
ohltyler added a commit to ohltyler/anomaly-detection-dashboards-plugin-1 that referenced this pull request Sep 1, 2021
ohltyler added a commit that referenced this pull request Sep 1, 2021
@ohltyler ohltyler added the feature A new feature for the plugin label Sep 2, 2021
@ohltyler ohltyler deleted the multi-category-rebased branch September 2, 2021 21:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature A new feature for the plugin v1.1.0 Version 1.1.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants