[ML] Addition of the new Model Management tab #115772

darnautov · 2021-10-20T13:49:02Z

Summary

Third-party model support introduces the ability to use PyTorch models trained outside of the Stack. For operations and management, this PR adds a nodes overview with a memory breakdown and allocated model info, and updates the model list with actions for deployed models and deployment stats.

Moves trained models to a dedicated top-level nav tab with an experimental badge
Updates layout for the model list
Updates "Type" filter with a model_type, e.g. pytorch, tree_ensemble
Adds "Deployment stats" for pytorch models
Adds nodes list

Checklist

Delete any items that are not applicable to this PR.

Any text added follows EUI's writing guidelines, uses sentence case text and includes i18n support
Documentation was added for features that require explanation or tutorials
Unit or functional tests were updated or added to match the most common scenarios
Any UI touched in this PR is usable by keyboard only (learn more about keyboard accessibility)
Any UI touched in this PR does not create any new axe failures (run axe in browser: FF, Chrome)
This renders correctly on smaller devices using a responsive layout. (You can test this in your browser)
This was checked for cross-browser compatibility

x-pack/plugins/ml/server/models/memory_overview/memory_overview_service.ts

jgowdyelastic · 2021-10-26T13:36:07Z

x-pack/plugins/ml/server/lib/ml_client/ml_client.ts

@@ -380,6 +380,27 @@ export function getMlClient(
    async getTrainedModelsStats(...p: Parameters<MlClient['getTrainedModelsStats']>) {
      return mlClient.getTrainedModelsStats(...p);
    },
+    // TODO update when the new elasticsearch-js client is available
+    async getTrainedModelsDeploymentStats(...p: Parameters<MlClient['getTrainedModelsStats']>) {


If these endpoints do not appear automatically in the ES client, you will need to raise an issue in the client spec repo to request that they be added.
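Until a typed method is available, one common workaround is to call the endpoint through the client's generic `transport.request`. The following is a minimal sketch under stated assumptions: the endpoint path is assumed rather than confirmed in this thread, and `TransportLike`, `deploymentStatsPath`, and `getDeploymentStats` are hypothetical names used only for illustration.

```typescript
// Minimal shape of a low-level transport; hypothetical, for illustration only.
interface TransportLike {
  request(params: { method: string; path: string }): Promise<unknown>;
}

// ASSUMPTION: this path is a guess at the deployment stats endpoint.
function deploymentStatsPath(modelId: string): string {
  return `/_ml/trained_models/${encodeURIComponent(modelId)}/deployment/_stats`;
}

// Issues a raw GET request, bypassing the (not yet available) typed method.
async function getDeploymentStats(
  transport: TransportLike,
  modelId: string
): Promise<unknown> {
  return transport.request({ method: 'GET', path: deploymentStatsPath(modelId) });
}
```

Keeping the path in one helper makes it easy to delete once the client exposes the endpoint natively.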

jgowdyelastic · 2021-10-26T14:35:17Z

x-pack/plugins/ml/server/models/data_frame_analytics/models_provider.ts

+               * ML job to run on a given node will do this, and then subsequent ML jobs on the same node will reuse the
+               * same already-loaded code.
+               */
+              memoryRes[key as keyof typeof memoryRes] += NATIVE_EXECUTABLE_CODE_OVERHEAD;


@droberts195 should NATIVE_EXECUTABLE_CODE_OVERHEAD be added to the first job on the node by timestamp, or is this order OK, where it is added to AD jobs before DFA jobs regardless of which type of job appeared on the node first?
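To make the alternative in this question concrete, here is a sketch of charging the overhead once per node to the earliest job by timestamp. It is not the code in this PR: the `JobMemory` shape, the `startTime` field, and the 30 MB value are assumptions made for illustration.

```typescript
// ASSUMPTION: 30 MB in bytes; the real constant's value is not shown in this thread.
const NATIVE_EXECUTABLE_CODE_OVERHEAD = 30 * 1024 * 1024;

type JobType = 'anomaly_detection' | 'dfa';

// Hypothetical per-job record used only for this sketch.
interface JobMemory {
  nodeId: string;
  type: JobType;
  startTime: number; // epoch millis when the job started on the node
  memoryBytes: number;
}

// Adds the overhead once per node, charging it to the earliest job on that node.
// On a timestamp tie, the job that appears first in the input wins.
function applyOverheadToFirstJob(jobs: JobMemory[]): JobMemory[] {
  const firstByNode = new Map<string, JobMemory>();
  for (const job of jobs) {
    const current = firstByNode.get(job.nodeId);
    if (current === undefined || job.startTime < current.startTime) {
      firstByNode.set(job.nodeId, job);
    }
  }
  return jobs.map((job) =>
    firstByNode.get(job.nodeId) === job
      ? { ...job, memoryBytes: job.memoryBytes + NATIVE_EXECUTABLE_CODE_OVERHEAD }
      : job
  );
}
```

Either ordering yields the same total per node; only the split between the AD and DFA figures changes.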

jgowdyelastic · 2021-10-26T15:20:42Z

x-pack/plugins/ml/server/models/data_frame_analytics/models_provider.ts

+            allocated_models: allocatedModels,
+            memory_overview: {
+              machine_memory: {
+                // @ts-ignore


can this @ts-ignore be removed or a comment added?

The Elasticsearch client types haven't been updated yet to support adjusted_total_in_bytes. I'll add a TODO comment.

Added in ee2201f
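An alternative to `@ts-ignore` while the client types lag behind is a small local type extension. The sketch below is illustrative only: `OsMemStats` is a hypothetical stand-in for the client's type, and the fallback to `total_in_bytes` is an assumption, not behaviour confirmed in this PR.

```typescript
// Hypothetical stand-in for the client's OS memory type, which lacks the new field.
interface OsMemStats {
  total_in_bytes: number;
}

// Local extension declaring the field until the client types catch up.
// TODO: remove once the client types include adjusted_total_in_bytes.
interface OsMemStatsWithAdjusted extends OsMemStats {
  adjusted_total_in_bytes?: number;
}

// Prefers the adjusted total when present; falls back to the raw total (assumption).
function machineMemoryTotal(mem: OsMemStats): number {
  const withAdjusted = mem as OsMemStatsWithAdjusted;
  return withAdjusted.adjusted_total_in_bytes ?? mem.total_in_bytes;
}
```

Unlike a blanket `@ts-ignore`, the cast keeps type checking active for the rest of the expression.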

jgowdyelastic · 2021-10-26T15:22:48Z

x-pack/plugins/ml/server/models/data_frame_analytics/models_provider.ts

+      const adMemoryReport = await memoryOverviewService.getAnomalyDetectionMemoryOverview();
+      const dfaMemoryReport = await memoryOverviewService.getDFAMemoryOverview();
+
+      // @ts-ignore


can this @ts-ignore be removed or a comment added?

Fixed in 3b2a39c

jgowdyelastic

LGTM

alvarezmelissa87

Tested and LGTM ⚡

kibanamachine · 2021-10-26T17:39:40Z

💛 Build succeeded, but was flaky

Test Failures

[job] [logs] OSS Misc Functional Tests / telemetry Telemetry service detects that telemetry cannot be sent in screenshot mode

Metrics [docs]

Module Count

Fewer modules leads to a faster build time

id	before	after	diff
`ml`	1687	1698	+11

Async chunks

Total size of all lazy-loaded chunks that will be downloaded as the user navigates the app

id	before	after	diff
`ml`	3.6MB	3.6MB	+11.7KB

Page load bundle

Size of the bundles that are downloaded on every page load. Target size is below 100kb

id	before	after	diff
`ml`	34.5KB	34.6KB	+158.0B

History

💔 Build #1753 failed b667dd6
💚 Build #1701 succeeded f7882b3
💛 Build #1506 was flaky df186e4
💔 Build #1467 failed 4e254e0
💔 Build #1347 failed c149a35
💔 Build #1336 failed 505dd32

To update your PR or re-run it, just comment with:
@elasticmachine merge upstream

cc @darnautov

kibanamachine · 2021-10-26T17:40:03Z

💔 Backport failed

The backport operation could not be completed due to the following error:
There are no branches to backport to. Aborting.

The backport PRs will be merged automatically after passing CI.

To backport manually run:
node scripts/backport --pr 115772

spalger · 2021-10-26T19:03:41Z

Sorry @darnautov, but this broke types when it was merged because of a conflict with #113950. Please resubmit the PR with the latest master merged and we can get it back in.

* [ML] Nodes overview for the Model Management page (#115772)
* [ML] trained models tab
* [ML] wip nodes list
* [ML] add types
* [ML] add types
* [ML] node expanded row
* [ML] wip show memory usage
* [ML] refactor, use model_memory_limit for dfa jobs
* [ML] fix refresh button
* [ML] add process memory overhead
* [ML] trained models memory overview
* [ML] add jvm size, remove node props from the response
* [ML] fix tab name
* [ML] custom colors for the bar chart
* [ML] sub jvm size
* [ML] updates for the model list
* [ML] apply native process overhead
* [ML]add adjusted_total_in_bytes
* [ML] start and stop deployment
* [ML] fix default sorting
* [ML] fix types issues
* [ML] fix const
* [ML] remove unused i18n strings
* [ML] fix lint
* [ML] extra custom URLs test
* [ML] update tests for model provider
* [ML] add node routing state info
* [ML] fix functional tests
* [ML] update for es response
* [ML] GetTrainedModelDeploymentStats
* [ML] add deployment stats
* [ML] add spacer
* [ML] disable stop allocation for models with pipelines
* [ML] fix type
* [ML] add beta label
* [ML] move beta label
* [ML] rename model_size prop
* [ML] update tooltip header
* [ML] update text
* [ML] remove ts ignore
* [ML] update types
* remove commented code
* replace toast notification service
* remove ts-ignore
* remove empty panel
* add comments, update test subjects
* fix ts error
* update comment
* fix applying memory overhead
* Revert "fix applying memory overhead" This reverts commit 0cf38fb.
* fix type, remove ts-ignore
* add todo comment

(cherry picked from commit 605e9e2)

* updates for the latest elasticsearch client
* hide allocated models when missing
* [ML] Update jest test mock

Co-authored-by: Quynh Nguyen <[email protected]>

darnautov added 8 commits October 12, 2021 15:31

[ML] trained models tab

c273a96

[ML] wip nodes list

b953b10

[ML] add types

afecbf3

[ML] add types

07cebda

[ML] node expanded row

72e6aaf

Merge remote-tracking branch 'upstream/master' into ml-114437-deployment-overview

e8f714c

Merge remote-tracking branch 'upstream/master' into ml-114437-deployment-overview

71711cc

[ML] wip show memory usage

64771e4

darnautov self-assigned this Oct 20, 2021

darnautov added 4 commits October 20, 2021 18:39

[ML] refactor, use model_memory_limit for dfa jobs

eaea57f

[ML] fix refresh button

c5b053a

[ML] add process memory overhead

cccce56

[ML] trained models memory overview

1d2d6dd

darnautov added the buildkite-ci label Oct 20, 2021

darnautov added 10 commits October 21, 2021 13:09

[ML] add jvm size, remove node props from the response

01ed5f7

[ML] fix tab name

282724a

[ML] custom colors for the bar chart

e26d740

[ML] sub jvm size

638cea4

[ML] updates for the model list

8d291d1

Merge remote-tracking branch 'upstream/master' into ml-114437-deployment-overview

6af4fcc

[ML] apply native process overhead

3c35976

[ML]add adjusted_total_in_bytes

4e06069

[ML] start and stop deployment

17e501b

Merge remote-tracking branch 'upstream/master' into ml-114437-deployment-overview

78c4bbe

jgowdyelastic reviewed Oct 25, 2021

x-pack/plugins/ml/server/models/memory_overview/memory_overview_service.ts (outdated, resolved)

darnautov added 5 commits October 25, 2021 12:14

[ML] fix default sorting

505dd32

[ML] fix types issues

c149a35

[ML] fix const

dbc1a9d

[ML] remove unused i18n strings

0fd5e16

[ML] fix lint

11d746e

darnautov added 2 commits October 26, 2021 15:21

remove empty panel

6678e39

add comments, update test subjects

0d9091f

jgowdyelastic reviewed Oct 26, 2021

fix ts error

ffeaae4

darnautov requested a review from jgowdyelastic October 26, 2021 13:39

darnautov added 2 commits October 26, 2021 15:59

update comment

83ffa38

fix applying memory overhead

0cf38fb

darnautov requested a review from peteharverson October 26, 2021 14:19

Revert "fix applying memory overhead"

32cef99

This reverts commit 0cf38fb.

jgowdyelastic reviewed Oct 26, 2021

darnautov requested a review from jgowdyelastic October 26, 2021 14:39

jgowdyelastic reviewed Oct 26, 2021

darnautov added 2 commits October 26, 2021 17:36

fix type, remove ts-ignore

3b2a39c

add todo comment

ee2201f

jgowdyelastic approved these changes Oct 26, 2021

alvarezmelissa87 approved these changes Oct 26, 2021

darnautov enabled auto-merge (squash) October 26, 2021 16:29

darnautov merged commit 605e9e2 into elastic:master Oct 26, 2021

spalger added a commit that referenced this pull request Oct 26, 2021

Revert "[ML] Nodes overview for the Model Management page (#115772)"

960b037

This reverts commit 605e9e2.

darnautov mentioned this pull request Oct 26, 2021

[ML] Nodes overview for the Model Management page #116361

Merged

darnautov mentioned this pull request Oct 27, 2021

[ML] 3rd party models overview #114438

Closed

lcawl changed the title ~~[ML] Nodes overview for the Model Management page~~ [ML] Addition of the new Model Management tab Nov 1, 2021

lcawl mentioned this pull request Jan 20, 2022

[DOCS} Adds 8.0.0-rc2 release notes #123395

Merged
