[Transform] Make force-stopping the transform always remove persistent task from cluster state #106989
Pinging @elastic/ml-core (Team:ML)
Hi @przemekwitek, I've created a changelog YAML for you.
Excellent, great job debugging this
transformTask.shutdown();
// Here the indexer is aborted so that its thread finishes work ASAP.
transformTask.onCancelled();
These are no longer needed because `persistentTasksService.sendRemoveRequest` will invoke them, is that correct?
Yes, the sequence of events is as follows:
- `persistentTasksService.sendRemoveRequest` removes the locally running task and notifies `PersistentTasksClusterService` to remove the task from cluster state.
- When the cluster state changes, `PersistentTaskNodeService` is notified via the `clusterChanged` method.
- `PersistentTaskNodeService` cancels the task.
- The `TransformTask.onCancelled` method is called, and this method calls `shutdown`.
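
The call chain described above can be sketched as a toy model. This is not Elasticsearch code: `ToyTransformTask` and `ToyNodeService` are hypothetical stand-ins, and the only thing the sketch tries to show is the ordering in which the remove request ends up triggering `onCancelled` and then `shutdown`.

```java
import java.util.ArrayList;
import java.util.List;

public class RemoveSequenceSketch {
    // Records the order of calls so the chain can be inspected afterwards.
    static final List<String> events = new ArrayList<>();

    // Hypothetical stand-in for TransformTask: onCancelled() calls shutdown().
    static class ToyTransformTask {
        void shutdown() {
            events.add("shutdown");
        }

        void onCancelled() {
            events.add("onCancelled");
            shutdown();
        }
    }

    // Hypothetical stand-in for the node-level service that reacts to
    // cluster state changes and cancels tasks that are no longer listed.
    static class ToyNodeService {
        private final ToyTransformTask task;

        ToyNodeService(ToyTransformTask task) {
            this.task = task;
        }

        void clusterChanged(boolean taskStillInClusterState) {
            events.add("clusterChanged");
            if (!taskStillInClusterState) {
                task.onCancelled();
            }
        }
    }

    // Hypothetical stand-in for the remove request: the task entry is dropped
    // from the (toy) cluster state, which notifies the node service.
    static List<String> sendRemoveRequest() {
        events.clear();
        events.add("sendRemoveRequest");
        ToyNodeService nodeService = new ToyNodeService(new ToyTransformTask());
        nodeService.clusterChanged(false);
        return new ArrayList<>(events);
    }

    public static void main(String[] args) {
        System.out.println(sendRemoveRequest());
    }
}
```

In this model the explicit `shutdown()` and `onCancelled()` calls would be redundant, because the remove request already reaches both through the cluster state change.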
## Summary

Fixes #180503
Fixes #180499
Fixes #180495
Fixes #180496
Fixes #180497
Fixes #180504

The tests themselves were passing, but the cleanup in the `after` blocks turned out to be problematic after a [recent update in ES related to stopping transforms](elastic/elasticsearch#106989) using `force`. Originally we called `stop` on all test transforms and then called the helper function `.cleanTransformIndices();`. Some time ago this helper was updated to force stop and delete the transforms before deleting the related indices, so we ended up calling stop twice in some tests. The second stop, using force, could then return an error if the transform was already stopped. This PR fixes the tests by removing the now unnecessary stop commands.

### Checklist

- [x] [Unit or functional tests](https://www.elastic.co/guide/en/kibana/master/development-tests.html) were updated or added to match the most common scenarios
- [x] [Flaky Test Runner](https://ci-stats.kibana.dev/trigger_flaky_test_runner/1) was used on any tests changed
- [x] This was checked for breaking API changes and was [labeled appropriately](https://www.elastic.co/guide/en/kibana/master/contributing.html#kibana-release-notes-process)
Currently, if, for any reason, persistent transform task removal cannot commit to cluster state, the leftover persistent task entry will stay forever in the cluster state.

This PR fixes the `_stop/force` API so that persistent transform task removal is possible in such a case. Rather than relying on the `taskOperation()` method (which might not be called if the task is in this problematic state), this PR makes the force-stop action remove the persistent task as soon as possible in the `doExecute` method (before the delegation to the `taskOperation()` method).

Fixes #106811
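
A minimal sketch of the idea, under stated assumptions: the class below is a hypothetical toy, not the actual transport action. It models the cluster state as a set of task ids and assumes that `taskOperation` only runs for tasks that have a running instance on a node, which is one way a leftover entry could be skipped. The method names `doExecute` and `taskOperation` are borrowed from the description; their signatures here are invented for illustration.

```java
import java.util.HashSet;
import java.util.Set;

public class ForceStopSketch {
    // Toy "cluster state": ids of persistent tasks that still have an entry.
    final Set<String> persistentTasks = new HashSet<>();
    // Ids of tasks that actually have a running instance on some node.
    final Set<String> runningTasks = new HashSet<>();

    // Assumption: the per-task step is only reached for running tasks,
    // so a leftover entry without a running instance is never visited.
    boolean taskOperation(String id) {
        if (!runningTasks.remove(id)) {
            return false; // nothing running, nothing to do
        }
        persistentTasks.remove(id);
        return true;
    }

    // With force, the persistent task entry is removed up front,
    // before delegating to the per-task step.
    void doExecute(String id, boolean force) {
        if (force) {
            persistentTasks.remove(id);
        }
        taskOperation(id);
    }

    public static void main(String[] args) {
        ForceStopSketch sketch = new ForceStopSketch();
        sketch.persistentTasks.add("leftover"); // entry exists, no running task
        sketch.doExecute("leftover", false);
        System.out.println(sketch.persistentTasks.contains("leftover"));
        sketch.doExecute("leftover", true);
        System.out.println(sketch.persistentTasks.contains("leftover"));
    }
}
```

In this toy, a regular stop leaves the leftover entry in place, while a force stop removes it regardless of whether the per-task step is ever reached.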