[Task Manager] Adds a `reschedule` api to Task Manager #50718

gmmorris · 2019-11-14T20:20:38Z

Summary

There is no way, at the moment, to update a task's scheduling.
This is a problem for Alerting, as it means you can't reschedule when the recurring check for an alert will take place.

closes #45152

This issue is made worse by the fact that Alerting doesn't use Task Manager's interval field and implements its own, which means a new interval only takes effect after the next run of an alert.
Once this PR is merged, we can pick up #46001 which will change alerting to use this new api in order to address this additional issue.

Checklist

Use ~~strikethroughs~~ to remove checklist items you don't feel are applicable to this PR.

~~This was checked for cross-browser compatibility, including a check against IE11~~
~~Any text added follows EUI's writing guidelines, uses sentence case text and includes i18n support~~
Documentation was added for features that require explanation or tutorials
Unit or functional tests were updated or added to match the most common scenarios
~~This was checked for keyboard-only and screenreader accessibility~~

For maintainers

This was checked for breaking API changes and was labeled appropriately
This includes a feature addition or change that requires a release note and was labeled appropriately

elasticmachine · 2019-11-14T21:38:31Z

💚 Build Succeeded

continuous-integration/kibana-ci/pull-request
Commit: 1082f23

elasticmachine · 2019-11-18T14:38:34Z

Pinging @elastic/kibana-stack-services (Team:Stack Services)

elasticmachine · 2019-11-18T15:51:54Z

💚 Build Succeeded

continuous-integration/kibana-ci/pull-request
Commit: 1d78a86

elasticmachine · 2019-11-18T18:05:20Z

💚 Build Succeeded

continuous-integration/kibana-ci/pull-request
Commit: ea0b47a

elasticmachine · 2019-11-18T20:35:09Z

💚 Build Succeeded

continuous-integration/kibana-ci/pull-request
Commit: be1d700

elasticmachine · 2019-11-19T11:32:40Z

💚 Build Succeeded

continuous-integration/kibana-ci/pull-request
Commit: 2a289cd

elasticmachine · 2019-11-19T18:05:44Z

💔 Build Failed

continuous-integration/kibana-ci/pull-request
Commit: 9220454

elasticmachine · 2019-11-19T18:57:50Z

💚 Build Succeeded

continuous-integration/kibana-ci/pull-request
Commit: fce851d

gmmorris · 2019-11-20T09:14:00Z

x-pack/legacy/plugins/task_manager/task_manager.test.ts

@@ -146,7 +146,7 @@ describe('TaskManager', () => {
    expect(result.id).toEqual('my-foo-id');
  });

-  test('doesnt ignore failure to scheduling existing tasks for reasons other than already being scheduled', async () => {
+  test('doesnt ignore failure to schedule existing tasks for reasons other than already being scheduled', async () => {


The test is unchanged, just its label.

x-pack/legacy/plugins/task_manager/task_store.test.ts

elasticmachine · 2019-11-20T11:45:40Z

💚 Build Succeeded

continuous-integration/kibana-ci/pull-request
Commit: e04cc60

elasticmachine · 2019-11-20T12:21:13Z

💚 Build Succeeded

continuous-integration/kibana-ci/pull-request
Commit: bf24701

mikecote

Changes are looking good! Just a few comments / nits.

x-pack/legacy/plugins/task_manager/task.ts

mikecote · 2019-11-20T13:14:38Z

x-pack/legacy/plugins/task_manager/task_store.ts

+    taskInstanceScheduling: TaskInstanceScheduling
+  ): Promise<ConcreteTaskInstance> {
+    const taskInstance = await this.getTask(taskInstanceScheduling.id);
+    return await this.update(


I'm thinking this section may become a bit more tricky if we want to further reduce the possibility of getting 409 errors. We should evaluate if it's worth worrying about but I'll write some thoughts below.

Some tasks can change state very often and cause such errors. For example, it's possible a task could be updated from elsewhere between the getTask call and update call. Some of those changes include:

Task moving from idle to claiming

Task moving from claiming to running

Task moving from running to idle

If we think this is worth pursuing, some options may include:

Updating only the interval attribute in a separate call without the version attribute (~~guaranteed no 409s~~ Reduction in the possibility of 409s, leaving retry_on_conflict as the next solution if we get there)

Re-attempting a save if ever a 409 has been received (though this could be a repeated error, small odds)

Some other logic about trying to update the runAt

Hopefully if we change things, the implementations are simple.

Edit: I removed the guaranteed no 409s section from my comment as this is not true. They would only happen if ever at a low level a conflict is detected (since Lucene operation is a read then write). retry_on_conflict would be the next way around those.

Yeah, this is what I'm still working on now...
At the moment my blocker is that I can't set up a reliable functional test that mimics this... so, it's ongoing.

I'm not yet sure what the ideal approach would be, but this does raise the point that we can't skip updating the runAt: if get gave us a task with status running that then became idle by the time update happened, we'd end up with the same problem we have now, which is that the interval won't be applied until the next runAt expires.

This is why my instinct is actually a retry where we can reevaluate which fields to update. Thoughts?

I've introduced a retry on version conflict - but it'll only retry twice, then it'll give up and bubble up the error.

What do you think?

The changes you've done will definitely help reduce the odds of the task not getting an updated interval. I guess if ever it fails to attempt twice, the reschedule function throws an error? Which I think can be used to notify the user of the very small chance something went wrong and to try their request again.

Yes, exactly, if it fails after the second attempt the reschedule call will throw an error.
Whoever is using it will have to handle it.

In the case of the Alerting api that should bubble up to the api reply, but I'll double check.
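
For readers following this thread, here is a minimal sketch of a bounded retry on version conflict, written as standalone TypeScript rather than the PR's actual code. The `retryOnVersionConflict` helper, the `statusCode === 409` error shape, and the `maxRetries` default are assumptions for illustration. The thread does not say whether "twice" counts total attempts or re-attempts, so the limit is left as a parameter.

```typescript
// Hypothetical error shape: the store is assumed to reject with statusCode 409
// when the document version it was given is stale.
interface ConflictError extends Error {
  statusCode: number;
}

function isVersionConflict(error: unknown): error is ConflictError {
  return error instanceof Error && (error as ConflictError).statusCode === 409;
}

// Runs `operation` and re-runs it when it fails with a version conflict.
// `maxRetries` counts re-attempts after the first try. Any other error, or a
// conflict once the retries are used up, is rethrown to the caller.
async function retryOnVersionConflict<T>(
  operation: () => Promise<T>,
  maxRetries: number = 2
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      // The operation should re-read the task itself, so that each attempt
      // can reevaluate which fields to update against the latest version.
      return await operation();
    } catch (error) {
      if (!isVersionConflict(error) || attempt >= maxRetries) {
        throw error;
      }
    }
  }
}
```

With this shape, a caller would pass the whole read-then-update step as `operation`, and would still need to handle the error that bubbles up when the conflict persists.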

x-pack/legacy/plugins/task_manager/task_store.ts

x-pack/legacy/plugins/task_manager/task_store.test.ts

pmuellr

code LGTM

made some notes on cleanliness / completeness (reschedule result during running | failed), could be done later if we want to do them ...

pmuellr · 2019-11-20T13:42:23Z

x-pack/legacy/plugins/task_manager/README.md

+If a request is made to `reschedule` the `runAt` field of an `idle` task, irrespective of whether the `interval` field is also specified, the task's field(s) will be updated and the task will only run when the newly specified `runAt` expires (after which, any new `interval` that may also have been specified, will be applied to the _next_ scheduled run).
+These behaviors mirrors how `schedule` treats these two fields.
+
+Where this behavior diverges is if a request is made to `reschedule` a non `idle` task, such as a `running` task or a `failed` task.


The behavior seems acceptable; basically the task is running right now, or will likely again soon (failed so retrying soon), so a new runAt calculation probably isn't needed anyway.

However

> recommend using the Task returned by the reschedule api to assess whether the fields have been updated as expected

is kind of a cop-out. I guess you'd compare the original runAt (which you would have to get first) to the one returned, and make some kind of decision (what would you even do, reschedule again?). I think it should probably be easier to decipher for the caller. Augment the result with a property indicating the runAt was not changed, because [was running | was failed | ???].

If that sounds right, would be fine to do as a follow-on PR.

Fair point.
I'll give it some thought... but I'm not sure how we'd augment a result, especially considering none of the other operations have such a thing 🤔 I'll see if anything makes sense.

Ya, no sense holding up the PR noodling on doing something here, open an issue if you think it's appropriate. I can't imagine we will really need this immediately ...
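
One way the augmented result suggested above could look, as a hedged sketch only. Nothing here exists in Task Manager: `RescheduleResult`, `TaskSnapshot`, `toRescheduleResult`, and the `runAtUpdated` / `reason` properties are hypothetical names, and the list of statuses is taken from the ones mentioned in this conversation.

```typescript
// Statuses mentioned in this conversation; the real set may differ.
type TaskStatus = "idle" | "claiming" | "running" | "failed";

// Simplified stand-in for the task returned by reschedule.
interface TaskSnapshot {
  id: string;
  status: TaskStatus;
  runAt: Date;
}

// Hypothetical result wrapper: tells the caller whether runAt was applied
// and, if it was not, which status prevented it.
type RescheduleResult =
  | { task: TaskSnapshot; runAtUpdated: true }
  | { task: TaskSnapshot; runAtUpdated: false; reason: Exclude<TaskStatus, "idle"> };

// `statusAtUpdate` is the status the task had when the update was applied.
// Assumption: only idle tasks get their runAt changed.
function toRescheduleResult(task: TaskSnapshot, statusAtUpdate: TaskStatus): RescheduleResult {
  if (statusAtUpdate === "idle") {
    return { task, runAtUpdated: true };
  }
  return { task, runAtUpdated: false, reason: statusAtUpdate };
}
```

A caller could then branch on `runAtUpdated` instead of fetching the original runAt first and comparing it with the returned one.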

x-pack/legacy/plugins/task_manager/task_store.test.ts

pmuellr · 2019-11-20T14:20:04Z

x-pack/legacy/plugins/task_manager/task_store.ts

-    retryAt: (doc.retryAt && doc.retryAt.toISOString()) || null,
-    runAt: (doc.runAt || new Date()).toISOString(),
-    status: (doc as ConcreteTaskInstance).status || 'idle',
+    ...mapValues(pick(doc, isPlainObject), objectProp => JSON.stringify(objectProp)),


this is somewhat fragile, in that if we ever added a new object property to the TM SO, it would end up getting JSON.stringify'd, which we wouldn't want. I guess we'd figure it out, as ES would complain that we're giving it a string when it expected an object - but ... who knows! :-)

Also, the previous code seems to be defaulting a bunch of values that the new code isn't ... the non-null ones seem like maybe they're important? attempts and idle? Oh wait, this is only called by taskInstanceToAttributes() which calls applyConcreteTaskInstanceDefaults() first. Perhaps we should just inline the code here in taskInstanceToAttributes() to make that a little more clear?

Also, not sure we even need the dateProp.toISOString(). JSON.stringify() will turn a Date object into an ISO string. Perhaps it's needed elsewhere as the ISO string in other processing tho, and we do need to do it here. If so, a comment would be appropriate.

Regarding the fragility: I thought we were stringifying because we can't store objects there. Any idea why we're stringifying at all then?

Regarding defaulting: it should be the same as before, but I separated them on purpose because setting defaults and serialising are two different things and it's harder to maintain code that does two unrelated things.
The reason I'm not inlining them is that we need applyConcreteTaskInstanceDefaults independently when we're omitting and merging data from partial updates (see the reschedule method).

As for toISOString - that's what we were already doing, so I kept it the same. I could investigate why we use it specifically, but either way changing it is probably for a separate PR, no?

Oh, is that a SO restriction? You can't have object type properties, with say an enabled: false (to prevent index explosion)? If so, then ya, every object would need to be stringified. If it's not an SO restriction, seems like we shoulda used object enabled=false for these, but then that's water under the bridge, migrating is a possibility, for a future PR :-)

As mentioned, even though it may be fragile if we extend the SO later, I think we'd find out pretty quickly (ES wouldn't index the doc), so not a big deal.
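
To illustrate the trade-off discussed here, a small sketch of serialisation that names each field explicitly instead of stringifying every plain-object property. The `TaskDoc` and `SerializedTask` shapes and the `serializeTask` function are simplified assumptions, not the store's real types, and defaults are assumed to have been applied before this step.

```typescript
// Simplified stand-in for a task instance after defaults have been applied.
interface TaskDoc {
  taskType: string;
  runAt: Date;
  retryAt: Date | null;
  params: Record<string, unknown>;
  state: Record<string, unknown>;
}

// What would be written to the saved object in this sketch.
interface SerializedTask {
  taskType: string;
  runAt: string;
  retryAt: string | null;
  params: string;
  state: string;
}

function serializeTask(doc: TaskDoc): SerializedTask {
  return {
    taskType: doc.taskType,
    // Dates are converted explicitly so the stored value is a bare ISO string.
    runAt: doc.runAt.toISOString(),
    retryAt: doc.retryAt ? doc.retryAt.toISOString() : null,
    // Only these named object fields are stringified. A new object field
    // would have to be added here on purpose rather than picked up by type.
    params: JSON.stringify(doc.params),
    state: JSON.stringify(doc.state),
  };
}
```

On the `toISOString()` question: `JSON.stringify` does turn a `Date` into its ISO form, but calling it on a bare `Date` returns that string wrapped in double quotes, so the two are not interchangeable for a top-level date field.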

x-pack/legacy/plugins/task_manager/task_store.test.ts

x-pack/test/plugin_api_integration/test_suites/task_manager/task_manager_integration.js

elasticmachine · 2019-11-20T18:25:00Z

💚 Build Succeeded

continuous-integration/kibana-ci/pull-request
Commit: cbfc42c

elasticmachine · 2019-11-21T13:28:59Z

💔 Build Failed

continuous-integration/kibana-ci/pull-request
Commit: ed78933

elasticmachine · 2019-11-21T15:14:00Z

💔 Build Failed

continuous-integration/kibana-ci/pull-request
Commit: 0abcf92

elasticmachine · 2019-11-21T16:47:09Z

💚 Build Succeeded

continuous-integration/kibana-ci/pull-request
Commit: fcbedee

mikecote

Code LGTM!

mikecote

Just had a thought about another scenario I think we have to handle. Whenever we update the interval of a "running" task, this means task manager holds on to an outdated version in one of the Kibana instances. We'll have to make the other end that marks the task as done able to handle 409 errors.

Just writing some notes that we can chat about tomorrow but I think updating tasks that are running will have a problem.

gmmorris · 2019-11-21T22:32:19Z

> Just had a thought about another scenario I think we have to handle. Whenever we update the interval of a "running" task, this means task manager holds on to an outdated version in one of the Kibana instances. We'll have to make the other end that marks the task as done able to handle 409 errors.
>
> Just writing some notes that we can chat about tomorrow but I think updating tasks that are running will have a problem.

oh, that's a good shout.
I'll look into it.
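
A rough sketch of the scenario raised above, under stated assumptions: a runner holds a copy of a running task, a reschedule bumps the stored version in the meantime, and the runner's "mark as done" write then hits a 409. The toy `InMemoryTaskStore`, the `markTaskAsDone` function, and the re-read-and-reapply strategy are all hypothetical; this is one possible way to handle the conflict, not what the PR or Task Manager does.

```typescript
interface StoredTask {
  id: string;
  version: number;
  status: "idle" | "running";
  interval: string;
  runAt: Date;
}

function isConflict(error: unknown): boolean {
  return (
    typeof error === "object" &&
    error !== null &&
    (error as { statusCode?: number }).statusCode === 409
  );
}

// Toy store with optimistic concurrency, standing in for the real task store.
class InMemoryTaskStore {
  private tasks: { [id: string]: StoredTask | undefined } = {};

  put(task: StoredTask): void {
    this.tasks[task.id] = { ...task };
  }

  get(id: string): StoredTask {
    const task = this.tasks[id];
    if (!task) {
      throw new Error(`task ${id} not found`);
    }
    return { ...task };
  }

  // Rejects the write when the caller's version is stale, else bumps the version.
  update(task: StoredTask): StoredTask {
    const current = this.get(task.id);
    if (current.version !== task.version) {
      const conflict = new Error(`version conflict on ${task.id}`) as Error & {
        statusCode?: number;
      };
      conflict.statusCode = 409;
      throw conflict;
    }
    const next = { ...task, version: task.version + 1 };
    this.tasks[task.id] = next;
    return { ...next };
  }
}

// Marks a running task as done. On a conflict it re-reads the task once and
// reapplies only the fields the runner owns (status, runAt), so an interval
// changed by a concurrent reschedule is kept. A second conflict is rethrown.
function markTaskAsDone(
  store: InMemoryTaskStore,
  heldCopy: StoredTask,
  nextRunAt: (interval: string) => Date
): StoredTask {
  try {
    return store.update({ ...heldCopy, status: "idle", runAt: nextRunAt(heldCopy.interval) });
  } catch (error) {
    if (!isConflict(error)) {
      throw error;
    }
    const latest = store.get(heldCopy.id);
    return store.update({ ...latest, status: "idle", runAt: nextRunAt(latest.interval) });
  }
}
```

The point of the sketch is that the retry must recompute runAt from the latest interval rather than resend the stale copy, otherwise the rescheduled interval would be overwritten.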

gmmorris · 2019-12-12T15:47:51Z

closed as we don't feel we can support this yet.
Work done from runNow should make this easier in the future

feat(update-scheduled-task): Adds getTask api to Task Store

1082f23

gmmorris added Feature:Task Manager v7.6.0 v8.0.0 release_note:enhancement labels Nov 14, 2019

gmmorris added 2 commits November 18, 2019 14:35

feat(update-scheduled-task): Adds reschedule api to Task Manager

201748f

Merge branch 'master' into task-manager/update-scheduled-task

1d78a86

gmmorris added the Team:Stack Services label Nov 18, 2019

gmmorris mentioned this pull request Nov 18, 2019

[DISCUSS] Task manager update API to allow changing a task's interval #45152

Closed

refactor(update-scheduled-task): clean up reschedule code

ea0b47a

refactor(update-scheduled-task): separate default value from serialis…

be1d700

…ation formats

Merge remote-tracking branch 'upstream/master' into task-manager/upda…

2a289cd

…te-scheduled-task

gmmorris added 3 commits November 19, 2019 16:55

feat(update-scheduled-task): simplified reschedule api to reuse exist…

9220454

…ing logic

Merge branch 'master' into task-manager/update-scheduled-task

dbdc3f8

refactor(update-scheduled-task): removed unneeded interval func

fce851d

gmmorris commented Nov 20, 2019

x-pack/legacy/plugins/task_manager/task_store.test.ts

gmmorris added 2 commits November 20, 2019 09:21

refactor(update-scheduled-task): reuse type

baec6e7

refactor(update-scheduled-task): clean up task serialisation

e04cc60

gmmorris changed the title ~~[DRAFT] Adds a reschedule api to Task Manager~~ [Task Manager] Adds a reschedule api to Task Manager Nov 20, 2019

gmmorris marked this pull request as ready for review November 20, 2019 10:25

gmmorris requested a review from a team November 20, 2019 10:25

gmmorris added 3 commits November 20, 2019 10:46

doc(update-scheduled-task): documented reschedule api in README

6fe4f78

Merge branch 'master' into task-manager/update-scheduled-task

9e4cc56

doc(update-scheduled-task): corrected description of type

bf24701

mikecote reviewed Nov 20, 2019

pmuellr approved these changes Nov 20, 2019

gmmorris mentioned this pull request Nov 20, 2019

[Alerting] Alerting now uses Task Manager intervals internally #51182

Closed

7 tasks

gmmorris added 3 commits November 20, 2019 16:31

renamed rescheduling type

cd64151

renamed methjods in store to reflect subject

67c651c

explicitly pick out properties in serialisation

cbfc42c

gmmorris added 2 commits November 21, 2019 12:01

introduce retry into Task rescheduling

b691b1e

Merge branch 'master' into task-manager/update-scheduled-task

ed78933

Merge branch 'master' into task-manager/update-scheduled-task

0abcf92

Merge branch 'master' into task-manager/update-scheduled-task

fcbedee

mikecote approved these changes Nov 21, 2019

mikecote reviewed Nov 21, 2019

gmmorris closed this Dec 12, 2019
