ILM delete phase deletes searchable snapshot before index #116801

Open
DaveCTurner opened this issue Nov 14, 2024 · 4 comments
Labels
>bug :Data Management/ILM+SLM Index and Snapshot lifecycle management Team:Data Management Meta label for data/management team

Comments

@DaveCTurner
Contributor

When ILM deletes a searchable snapshot index, it deletes the snapshot first, waits for that to complete, and then deletes the index. This means that there's a period of time when the index still exists but attempts to search it may fail with an org.elasticsearch.snapshots.SnapshotMissingException.

CleanupSnapshotStep cleanupSnapshotStep = new CleanupSnapshotStep(cleanSnapshotKey, deleteStepKey, client);
DeleteStep deleteStep = new DeleteStep(deleteStepKey, nextStepKey, client);

We should delete the index first, and only delete the snapshot once the index deletion is complete.

Relates #116379 which can prevent the snapshot deletion from completing properly, leaving the index in a broken state for a long time.
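
To make the ordering problem above concrete, here is a minimal toy model in Java. It is not Elasticsearch code: the class, the method names, and the index and snapshot names are all hypothetical, and the "cluster" is just two in-memory sets. It only shows that deleting the snapshot first leaves an intermediate state in which the index exists without its backing snapshot, while deleting the index first does not.

```java
import java.util.HashSet;
import java.util.Set;

// Toy model only; none of these names come from the Elasticsearch codebase.
class DeleteOrderSketch {
    final Set<String> indices = new HashSet<>();
    final Set<String> snapshots = new HashSet<>();

    // "Broken" here means: the mounted index still exists, but the snapshot
    // that backs it is gone, so a search against it could fail.
    boolean isBroken(String index, String snapshot) {
        return indices.contains(index) && !snapshots.contains(snapshot);
    }

    // Runs the two deletions in the requested order and reports whether the
    // state between the two steps was broken.
    static boolean hasBrokenWindow(boolean snapshotFirst) {
        String index = "mounted-index";        // hypothetical name
        String snapshot = "backing-snapshot";  // hypothetical name
        DeleteOrderSketch cluster = new DeleteOrderSketch();
        cluster.indices.add(index);
        cluster.snapshots.add(snapshot);

        // First step.
        if (snapshotFirst) {
            cluster.snapshots.remove(snapshot);
        } else {
            cluster.indices.remove(index);
        }
        boolean brokenBetweenSteps = cluster.isBroken(index, snapshot);

        // Second step.
        if (snapshotFirst) {
            cluster.indices.remove(index);
        } else {
            cluster.snapshots.remove(snapshot);
        }
        return brokenBetweenSteps;
    }

    public static void main(String[] args) {
        System.out.println("snapshot first, broken window: " + hasBrokenWindow(true));
        System.out.println("index first, broken window: " + hasBrokenWindow(false));
    }
}
```

In the real system the window is not instantaneous, since the index deletion only starts after the snapshot deletion has completed.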

@DaveCTurner DaveCTurner added :Data Management/ILM+SLM Index and Snapshot lifecycle management >bug labels Nov 14, 2024
@elasticsearchmachine elasticsearchmachine added the Team:Data Management Meta label for data/management team label Nov 14, 2024
@elasticsearchmachine
Collaborator

Pinging @elastic/es-data-management (Team:Data Management)

@dakrone
Member

dakrone commented Nov 14, 2024

This is an unfortunate byproduct of the way that ILM works. If we were to delete the index before the snapshot, ILM would be unmoored from any execution and would not go on to remove the snapshot. We could try to do the index delete and the snapshot delete in the same step, but we would still risk leaving snapshots hanging if a failure occurs.
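
A rough sketch of that constraint, assuming (as this comment implies) that lifecycle progress is tracked per index and disappears with it. The runner, the step names, and the map below are hypothetical stand-ins, not ILM's actual classes.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy runner; assumes progress is recorded per index and is lost when the
// index is removed. Not the real ILM implementation.
class OrphanedStepSketch {
    // Executes the given steps for a single index and returns the steps that
    // actually ran.
    static List<String> run(List<String> steps) {
        String index = "mounted-index"; // hypothetical name
        Map<String, Integer> nextStepByIndex = new LinkedHashMap<>();
        nextStepByIndex.put(index, 0);

        List<String> executed = new ArrayList<>();
        // The runner can only make progress while the index, and the progress
        // record stored with it, still exists.
        while (nextStepByIndex.containsKey(index)) {
            int position = nextStepByIndex.get(index);
            if (position >= steps.size()) {
                break;
            }
            String step = steps.get(position);
            executed.add(step);
            if (step.equals("delete-index")) {
                // Deleting the index discards the record of what comes next.
                nextStepByIndex.remove(index);
            } else {
                nextStepByIndex.put(index, position + 1);
            }
        }
        return executed;
    }

    public static void main(String[] args) {
        System.out.println(run(List.of("delete-snapshot", "delete-index")));
        System.out.println(run(List.of("delete-index", "delete-snapshot")));
    }
}
```

With the index deleted first, the later snapshot step never runs in this model, which is the dangling-snapshot risk described above.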

I remember that quite a while ago @tlrx was looking into something where deleting the snapshot also deleted the index at the same time, or maybe it was a way to delete the index and the snapshot at the same time. @tlrx, did anything come out of that?

@tlrx
Member

tlrx commented Nov 14, 2024

@dakrone the attempt was #79156, which wasn't completed due to complexity and other priorities. The idea was to delete the snapshot once the mounted index is deleted, if ILM delete_snapshot was set to true.
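
As an illustration only, the idea of tying snapshot cleanup to the index deletion itself could look roughly like the following toy model. It does not reflect the code in #79156; the class, the per-index flag, and all names are hypothetical.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Toy model; not based on the actual change in the referenced pull request.
class DeleteWithIndexSketch {
    final Map<String, String> mountedIndexToSnapshot = new HashMap<>();
    final Set<String> snapshots = new HashSet<>();
    // Indices whose backing snapshot should be removed along with them
    // (a hypothetical flag recorded at mount time).
    final Set<String> deleteSnapshotWithIndex = new HashSet<>();

    void mount(String index, String snapshot, boolean deleteSnapshotOnIndexDelete) {
        snapshots.add(snapshot);
        mountedIndexToSnapshot.put(index, snapshot);
        if (deleteSnapshotOnIndexDelete) {
            deleteSnapshotWithIndex.add(index);
        }
    }

    // The index goes away first; the snapshot is only removed afterwards,
    // and only if the index was flagged for it.
    void deleteIndex(String index) {
        String snapshot = mountedIndexToSnapshot.remove(index);
        if (snapshot != null && deleteSnapshotWithIndex.remove(index)) {
            snapshots.remove(snapshot);
        }
    }

    public static void main(String[] args) {
        DeleteWithIndexSketch cluster = new DeleteWithIndexSketch();
        cluster.mount("index-a", "snapshot-a", true);
        cluster.mount("index-b", "snapshot-b", false);
        cluster.deleteIndex("index-a");
        cluster.deleteIndex("index-b");
        System.out.println("remaining snapshots: " + cluster.snapshots);
    }
}
```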

@DaveCTurner
Contributor Author

I left a comment about a possibly-simpler approach.
