Job sweeper fails to remove deleted jobs #96

dbbaughe · 2021-11-10T16:54:25Z

From: opendistro-for-elasticsearch/job-scheduler#65

Found from this forum discussion in ISM: https://discuss.opendistrocommunity.dev/t/ism-attempting-to-interact-with-an-obsolete-index/3224

Job scheduler keeps an in-memory map of the jobs that are scheduled to run. When a job document is created, updated, or deleted, this map is updated accordingly. In this specific case the delete somehow failed, which left a job executing every 2 hours even though its document no longer existed. Ideally the sweeper would catch this and resolve the failure, but the sweeper has a bug: it doesn't remove jobs that were deleted.
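To make the failure mode concrete, here is a minimal sketch in Python. It is not the plugin's code: `InMemoryScheduler`, `on_index`, `on_delete`, and the `fail` flag are hypothetical names used only to simulate a de-schedule step that goes wrong.

```python
class InMemoryScheduler:
    """Hypothetical stand-in for the in-memory map of scheduled jobs."""

    def __init__(self):
        self.scheduled = {}  # job_id -> job document

    def on_index(self, job_id, doc):
        # Create and update both (re)schedule the job.
        self.scheduled[job_id] = doc

    def on_delete(self, job_id, fail=False):
        # `fail` simulates the de-schedule step going wrong (assumption).
        if fail:
            return False
        self.scheduled.pop(job_id, None)
        return True


index = {"job-1": {"interval_hours": 2}}
scheduler = InMemoryScheduler()
scheduler.on_index("job-1", index["job-1"])

# The document is removed from the index, but de-scheduling fails.
del index["job-1"]
scheduler.on_delete("job-1", fail=True)

# Jobs that are still scheduled although their document is gone.
stale = set(scheduler.scheduled) - set(index)
print(stale)  # {'job-1'}
```

Nothing in this flow ever revisits `job-1`, so it keeps firing until something else reconciles the map with the index.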

For reference:

The sweeper is a background process that sweeps the job indices for job documents in order to schedule, re-schedule, and de-schedule jobs. It runs on an interval defined by the sweep period. On every execution it sweeps all indices that were registered by plugins extending job scheduler, and for each index it sweeps that index's shards. The sweepShard function is the one with the bug: it does not handle job documents that were deleted from the index.
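A rough sketch of what a sweep that also handles deletions could look like, written in Python for brevity rather than as the actual sweepShard implementation. It assumes `scheduled` is the slice of the in-memory map belonging to one shard and `shard_docs` is the set of job documents currently found in that shard; both names are illustrative.

```python
def sweep_shard(scheduled, shard_docs):
    """Reconcile one shard's scheduled jobs with the documents found in it.

    Both arguments map job_id -> job document (assumed shapes).
    Returns the ids that were de-scheduled.
    """
    # Schedule new jobs and re-schedule jobs whose document changed.
    for job_id, doc in shard_docs.items():
        if scheduled.get(job_id) != doc:
            scheduled[job_id] = doc

    # De-schedule jobs whose document no longer exists in the shard.
    removed = [job_id for job_id in scheduled if job_id not in shard_docs]
    for job_id in removed:
        del scheduled[job_id]
    return removed


scheduled = {"a": {"v": 1}, "gone": {"v": 1}}
shard_docs = {"a": {"v": 2}, "b": {"v": 1}}

removed = sweep_shard(scheduled, shard_docs)
print(removed)    # ['gone']
print(scheduled)  # {'a': {'v': 2}, 'b': {'v': 1}}
```

The second loop is the step the issue describes as missing: without it, a job whose delete event was lost stays scheduled across every sweep.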

Comments:

From: @ftianli-amzn
Thanks @dbbaughe for the detailed and well-organized explanation.
From your description, I think the problem is caused here: when de-scheduling fails, there is no retry or any other fallback to deal with the failure.

dbbaughe added the enhancement New feature or request label Nov 10, 2021

dbbaughe mentioned this issue Nov 10, 2021

Job sweeper fails to remove deleted jobs opendistro-for-elasticsearch/job-scheduler#65

Closed

peterzhuamazon added this to Engineering Effectiveness Board Jul 11, 2024

github-project-automation bot moved this to 🆕 New in Engineering Effectiveness Board Jul 11, 2024

getsaurabh02 moved this from 🆕 New to Backlog in Engineering Effectiveness Board Jul 18, 2024

peterzhuamazon moved this to 📦 Backlog in Engineering Effectiveness Board Dec 4, 2024
