[Response Ops] Keep task document when enabling/disabling rules #139826
    updatedBy: await this.getUserName(),
    updatedAt: new Date().toISOString(),
  }),
  { version }
);
This update to the disable logic accounts for pre-8.0 rules that may still be running with a scheduled task ID that does not equal the rule ID. In these cases, we still want to remove the task document so that a new one matching the rule ID can be created on enable.
If the scheduledTaskId already matches the rule ID, this will set the task to disabled.
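The two branches described in this comment can be sketched roughly as follows. This is a minimal, synchronous illustration and not the code from the PR: the `Rule` and `TaskDoc` shapes, the `disableRuleTask` function, and the `Map` standing in for the task store are all hypothetical.

```typescript
// Hypothetical, simplified shapes for illustration; not Kibana's actual types.
interface Rule {
  id: string;
  enabled: boolean;
  scheduledTaskId?: string;
}

interface TaskDoc {
  id: string;
  enabled: boolean;
}

type DisableAction = 'none' | 'removed' | 'disabled';

// Synchronous for brevity; the real client calls are asynchronous.
function disableRuleTask(rule: Rule, tasks: Map<string, TaskDoc>): DisableAction {
  rule.enabled = false;

  const taskId = rule.scheduledTaskId;
  if (taskId === undefined) {
    // Nothing was scheduled for this rule, so there is no task to touch.
    return 'none';
  }

  if (taskId !== rule.id) {
    // Legacy (pre-8.0) case: the task ID does not match the rule ID.
    // Remove the task document so a matching one can be created on enable.
    tasks.delete(taskId);
    delete rule.scheduledTaskId;
    return 'removed';
  }

  // Task ID matches the rule ID: keep the document and flag it as disabled.
  const task = tasks.get(taskId);
  if (task !== undefined) {
    task.enabled = false;
  }
  return 'disabled';
}
```

Returning the action taken is only a convenience for the sketch; it makes the branch that ran easy to check in a test.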
@@ -2035,6 +2035,23 @@ export class RulesClient {
    } catch (e) {
      throw e;
    }
  }
This update to the enable logic accounts for already-disabled rules that have no corresponding task.
If the rule is disabled with no corresponding task, its scheduledTaskId will be undefined. In this case, we want to schedule a task on enable.
If a rule somehow has a scheduledTaskId defined but it doesn't actually exist, we want to schedule a task on enable.
Finally, if a rule has a scheduledTaskId defined and the task exists, we enable the task.
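The three cases in this comment could look roughly like the sketch below. It is an illustration under assumptions, not the PR's implementation: `enableRuleTask`, the `Rule` and `TaskDoc` shapes, and the in-memory `Map` are hypothetical, and the choice to give a newly scheduled task the rule's ID follows the disable comment's remark about creating a task "matching the rule id".

```typescript
// Hypothetical, simplified shapes for illustration; not Kibana's actual types.
interface Rule {
  id: string;
  enabled: boolean;
  scheduledTaskId?: string;
}

interface TaskDoc {
  id: string;
  enabled: boolean;
}

type EnableAction = 'scheduled' | 'enabled';

// Synchronous for brevity; the real client calls are asynchronous.
function enableRuleTask(rule: Rule, tasks: Map<string, TaskDoc>): EnableAction {
  rule.enabled = true;

  const taskId = rule.scheduledTaskId;
  const existing = taskId !== undefined ? tasks.get(taskId) : undefined;

  if (existing === undefined) {
    // Case 1: scheduledTaskId is undefined.
    // Case 2: scheduledTaskId is set but the task document does not exist.
    // Either way, schedule a fresh task (assumed here to reuse the rule ID).
    tasks.set(rule.id, { id: rule.id, enabled: true });
    rule.scheduledTaskId = rule.id;
    return 'scheduled';
  }

  // Case 3: the task exists, so flip its flag instead of rescheduling.
  existing.enabled = true;
  return 'enabled';
}
```

Treating cases 1 and 2 as a single branch keeps the sketch short; real code would probably distinguish a missing document from other lookup failures.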
@elasticmachine merge upstream
Pinging @elastic/response-ops (Team:ResponseOps)
I'm not sure of the pressing need for this any more, but we have been using "disable then re-enable" as a way of "resetting" a rule's task state. Feels like we may still need this, probably just for problematic cases. Should we open a follow-on PR, or do we even need this, or are there alternatives?
LGTM. Tested locally, works as expected.
@pmuellr We discussed this in the issue: #110096 (comment) and it seems like the consensus is that losing task state on disable is considered a bug that we're fixing. How often are we suggesting disable/reenable to clear task state?
@pmuellr Re-read your question and I think you're asking if we should open a followup issue for being able to reset task state, which I definitely can :)
Yes, basically what I'm asking :-) I know the typical use case WE use it for is probably just basic sync issues between the rule and task. I'm not sure folks have really used it JUST to reset the task state - for example, if there's some bad rule-specific state, or to "reset all the alerts" or something. I don't think we've really seen a need for that. Which is all a new "reset task" API would do, I would think. And wasn't sure if there was already some other way of accomplishing that anyway. Sounds like it's something we should consider though, so opening an issue to track seems appropriate.
LGTM; left some questions ...
x-pack/test/alerting_api_integration/common/lib/task_manager_utils.ts
💚 Build Succeeded
Resolves #110096
🚨🚨🚨 A migration has been added to add `enabled: true` to existing task manager documents with `status: 'claiming' | 'idle' | 'running'`. It sets `enabled: false` on tasks with `status: 'failed' | 'unrecognized'`. This might warrant extra review :) 🚨🚨🚨

Summary
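The status-to-flag mapping from the migration note above can be sketched as a pure function. This is a hedged illustration only: `addEnabledField` and the `TaskDoc` shape are hypothetical names, and the actual migration in the PR operates on saved object documents rather than plain objects.

```typescript
// Hypothetical, simplified task document; only the fields the note mentions.
interface TaskDoc {
  id: string;
  status: string;
  enabled?: boolean;
}

// Maps an existing task document to one carrying the new `enabled` field.
function addEnabledField(doc: TaskDoc): TaskDoc {
  switch (doc.status) {
    case 'claiming':
    case 'idle':
    case 'running':
      // Tasks in these statuses keep running after the migration.
      return { ...doc, enabled: true };
    case 'failed':
    case 'unrecognized':
      // Tasks in these statuses are marked disabled.
      return { ...doc, enabled: false };
    default:
      // The note does not cover other statuses; leave such documents untouched.
      return doc;
  }
}
```

Because the function returns a new object and leaves its input alone, it can be applied to a batch of documents with `docs.map(addEnabledField)`.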
This PR adds an `enabled` field to all task documents. This is a boolean indicating whether the task is currently enabled or disabled. The task claim query has been updated to only claim enabled tasks.

Task manager changes:
- `taskSchedule.schedule` and `taskSchedule.bulkSchedule` have been updated to set `enabled: true` for newly scheduled tasks (unless `enabled: false` is explicitly set when scheduling).
- `taskSchedule.bulkEnableDisable` to update this flag for specific task ids

Rules client changes:
- `disable` function to stop deleting the underlying task manager document when a rule is disabled (with one exception, left comment on the PR). Instead, it updates the `enabled` flag on the task document to be `false`.
- `enable` function to stop scheduling a new task when a rule is enabled (except when no task exists). Instead, it updates the `enabled` flag on the task document to be `true`.

To Verify
1. ... `failed`. Create some rules and disable at least one of them.
2. ... `enabled` field. Verify that failed tasks have `enabled: false` but other tasks have `enabled: true`. Verify that the rules that were enabled before are still running.
3. ... `enabled` flag on the task.

Checklist