Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix Partial Snapshots Recording Spurious Errors (#69150) #69372

Merged
merged 1 commit into from
Feb 22, 2021

Conversation

original-brownbear
Copy link
Member

If an index is deleted while a partial snapshot is running the behavior
was not deterministic.
If an index was deleted just as one of its shard snapshots was about to
start then it would be recorded as a shard snapshot failure in the
snapshot result and the snapshot would show up as PARTIAL.
If the index delete however happened after the shard had been
snapshotted, then the snapshot would show SUCCESS.
In both cases however, the snapshot would contain the exact same data
because the deleted index would become part of the final snapshot.
Also, it was confusing that in the PARTIAL case, there would be errors
recorded for shards the indices of which would not be part of the
snapshot.

This commit makes it such that not only are indices filtered from the
list of indices in a snapshot but also from the shard snapshot errors
in a snapshot entry so that the snapshot always shows up as SUCCESS
because concurrent index deletes are not a failure but allowed in
partial snapshots.

Closes #69014

backport of #69150

If an index is deleted while a partial snapshot is running the behavior
was not deterministic.
If an index was deleted just as one of its shard snapshots was about to
start then it would be recorded as a shard snapshot failure in the
snapshot result and the snapshot would show up as `PARTIAL`.
If the index delete however happened after the shard had been
snapshotted, then the snapshot would show `SUCCESS`.
In both cases however, the snapshot would contain the exact same data
because the deleted index would become part of the final snapshot.
Also, it was confusing that in the `PARTIAL` case, there would be errors
recorded for shards the indices of which would not be part of the
snapshot.

This commit makes it such that not only are indices filtered from the
list of indices in a snapshot but also from the shard snapshot errors
in a snapshot entry so that the snapshot always shows up as `SUCCESS`
because concurrent index deletes are not a failure but allowed in
partial snapshots.

Closes elastic#69014
@original-brownbear original-brownbear added :Distributed Coordination/Snapshot/Restore Anything directly related to the `_snapshot/*` APIs backport labels Feb 22, 2021
@elasticmachine elasticmachine added the Team:Distributed (Obsolete) Meta label for distributed team (obsolete). Replaced by Distributed Indexing/Coordination. label Feb 22, 2021
@original-brownbear original-brownbear merged commit 195760d into elastic:7.x Feb 22, 2021
@original-brownbear original-brownbear deleted the 69150-7.x branch February 22, 2021 18:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport :Distributed Coordination/Snapshot/Restore Anything directly related to the `_snapshot/*` APIs Team:Distributed (Obsolete) Meta label for distributed team (obsolete). Replaced by Distributed Indexing/Coordination.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants