Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Response Ops][Alerting] Assigning extra large cost to indicator match rule types #189220

Merged
merged 1 commit into from
Jul 26, 2024

Conversation

ymao1
Copy link
Contributor

@ymao1 ymao1 commented Jul 25, 2024

Resolves #189112

Summary

Adds a mapping to the alerting rule type registry to manage rule types with a custom task cost and register appropriately. Adds an integration test to task manager so we can be alerted to task types that register with non-normal task costs.

@ymao1 ymao1 changed the title Adding custom cost to indicator match rules [Response Ops][Alerting] Assigning extra large cost to indicator match rule types Jul 25, 2024
@ymao1 ymao1 self-assigned this Jul 25, 2024
@ymao1 ymao1 added release_note:skip Skip the PR/issue when compiling release notes Feature:Task Manager Team:ResponseOps Label for the ResponseOps team (formerly the Cases and Alerting teams) v8.16.0 labels Jul 25, 2024
@kibana-ci
Copy link
Collaborator

💛 Build succeeded, but was flaky

Failed CI Steps

Metrics [docs]

✅ unchanged

To update your PR or re-run it, just comment with:
@elasticmachine merge upstream

cc @ymao1

@ymao1 ymao1 marked this pull request as ready for review July 25, 2024 20:39
@ymao1 ymao1 requested a review from a team as a code owner July 25, 2024 20:39
@elasticmachine
Copy link
Contributor

Pinging @elastic/response-ops (Team:ResponseOps)

@ymao1 ymao1 requested review from pmuellr and js-jankisalvi July 25, 2024 20:39
Copy link
Member

@pmuellr pmuellr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Contributor

@js-jankisalvi js-jankisalvi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@ymao1 ymao1 merged commit b75d74a into elastic:main Jul 26, 2024
47 checks passed
@kibanamachine kibanamachine added the backport:skip This commit does not require backporting label Jul 26, 2024
@ymao1 ymao1 deleted the im-rule-cost branch July 26, 2024 11:43
ymao1 added a commit to ymao1/kibana that referenced this pull request Jul 30, 2024
pmuellr pushed a commit that referenced this pull request Aug 7, 2024
…empt (#189626)

## Summary

Redoing the resource based task claim PR:
#187999 and followup PRs
#189220 and
#189117. Please see the
descriptions of those PRs for more details.

This was original reverted because unregistered task types in serverless
caused the task manager health aggregation to fail. This PR includes an
additional commit to exclude unregistered task types from the health
report:
58eb2b1.

To verify this, make sure you're using the `default` claim strategy,
start up Kibana so that the default set of tasks get created. Then
either disable a bunch of plugins via config:

```
# remove security and o11y
enterpriseSearch.enabled: false
xpack.apm.enabled: false
xpack.cloudSecurityPosture.enabled: false
xpack.fleet.enabled: false
xpack.infra.enabled: false
xpack.observability.enabled: false
xpack.observabilityAIAssistant.enabled: false
xpack.observabilityLogsExplorer.enabled: false
xpack.search.notebooks.enabled: false
xpack.securitySolution.enabled: false
xpack.uptime.enabled: false
```

or comment out the task registration of a task that was previously
scheduled (I'm using the observability AI assistant)

```
--- a/x-pack/plugins/observability_solution/observability_ai_assistant/server/service/index.ts
+++ b/x-pack/plugins/observability_solution/observability_ai_assistant/server/service/index.ts
@@ -89,24 +89,24 @@ export class ObservabilityAIAssistantService {

     this.allowInit();

-    taskManager.registerTaskDefinitions({
-      [INDEX_QUEUED_DOCUMENTS_TASK_TYPE]: {
-        title: 'Index queued KB articles',
-        description:
-          'Indexes previously registered entries into the knowledge base when it is ready',
-        timeout: '30m',
-        maxAttempts: 2,
-        createTaskRunner: (context) => {
-          return {
-            run: async () => {
-              if (this.kbService) {
-                await this.kbService.processQueue();
-              }
-            },
-          };
-        },
-      },
-    });
+    // taskManager.registerTaskDefinitions({
+    //   [INDEX_QUEUED_DOCUMENTS_TASK_TYPE]: {
+    //     title: 'Index queued KB articles',
+    //     description:
+    //       'Indexes previously registered entries into the knowledge base when it is ready',
+    //     timeout: '30m',
+    //     maxAttempts: 2,
+    //     createTaskRunner: (context) => {
+    //       return {
+    //         run: async () => {
+    //           if (this.kbService) {
+    //             await this.kbService.processQueue();
+    //           }
+    //         },
+    //       };
+    //     },
+    //   },
+    // });
   }
```

and restart Kibana. You should still be able to access the TM health
report with the workload field and if you update the background health
logging so it always logs and more frequently, you should see the
logging succeed with no errors:

Below, I've made changes to always log the background health at a 15
second interval:

```
--- a/x-pack/plugins/task_manager/server/plugin.ts
+++ b/x-pack/plugins/task_manager/server/plugin.ts
@@ -236,6 +236,7 @@ export class TaskManagerPlugin
     if (this.isNodeBackgroundTasksOnly()) {
       setupIntervalLogging(monitoredHealth$, this.logger, LogHealthForBackgroundTasksOnlyMinutes);
     }
+    setupIntervalLogging(monitoredHealth$, this.logger, LogHealthForBackgroundTasksOnlyMinutes);
reduce the logging interval


--- a/x-pack/plugins/task_manager/server/lib/log_health_metrics.ts
+++ b/x-pack/plugins/task_manager/server/lib/log_health_metrics.ts
@@ -35,7 +35,8 @@ export function setupIntervalLogging(
     monitoredHealth = m;
   });

-  setInterval(onInterval, 1000 * 60 * minutes);
+  // setInterval(onInterval, 1000 * 60 * minutes);
+  setInterval(onInterval, 1000 * 15);

   function onInterval() {
```

---------

Co-authored-by: kibanamachine <[email protected]>
Co-authored-by: Elastic Machine <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport:skip This commit does not require backporting Feature:Task Manager release_note:skip Skip the PR/issue when compiling release notes Team:ResponseOps Label for the ResponseOps team (formerly the Cases and Alerting teams) v8.16.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Response Ops][Task Manager] Assign extra large cost to indicator match rule type
6 participants