Allow Deleting Multiple Snapshots at Once #55474

Merged


Member

@original-brownbear original-brownbear commented Apr 20, 2020

Adds deleting multiple snapshots in one go without significantly changing the mechanics of snapshot deletes otherwise.
This change does not yet allow mixing snapshot delete and abort. Abort is still only allowed for a single snapshot delete by exact name.

Contributor

@ywelsch ywelsch left a comment


Thanks Armin, I've left some comments.

@@ -167,29 +173,35 @@ public String toString() {
      * A class representing a snapshot deletion request entry in the cluster state.
      */
     public static final class Entry implements Writeable, RepositoryOperation {
-        private final Snapshot snapshot;
+        private final List<SnapshotId> snapshots;
Contributor


why did you opt for having one entry for multiple snapshots? Should we rather have one entry per snapshot that is to be deleted?

If we build it that way, could it allow us to queue up multiple deletes in the future?

Member Author


> why did you opt for having one entry for multiple snapshots?

I thought about that, actually. What I didn't like about it was that it would introduce quite a bit of duplication in what is stored in the cluster state: we'd be adding a bunch of delete entries with the same start time, repository and generation, only to batch them up again when it comes to running the actual delete.

> If we build it that way, could it allow us to queue up multiple deletes in the future?

I don't think that makes things easier for queuing up deletes. My thinking is this:

A delete will always move us from one set of snapshots to another set of snapshots and move the repository generation ahead by 1. It has a defined point in time at which it starts (when the delete entry is processed in the cluster state) and a defined repository state from which it starts. We cannot add to the delete while it is running against the repo, because that is also an atomic operation.

So if we were to add deletes by just adding entries, we'd have to queue up new deletes at a generation higher than the lowest generation of the existing deletes (the lowest generation is already processing in the repository, so we can't add to it).
I figured I'd rather be explicit about how deletes get batched together than do some implicit magic around the repository generations here.
With the way it is now, we can queue up deletes very nicely IMO: if there is only a single delete entry in the cluster state (CS), we add a new entry, because the existing one is already processing on the repo. If there are multiple delete entries, we can add the additional snapshots we want to delete to the second one (which we know isn't processing yet).
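
The queueing rule described above can be sketched in Java roughly as follows. This is only an illustration under stated assumptions, not the code in this PR: `DeleteEntrySketch`, `DeleteQueueSketch` and `enqueue` are hypothetical names, snapshots are represented by plain name strings, all entries are assumed to target the same repository, and an empty list of entries is treated like the single-entry case.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a delete entry in the cluster state; not the real Elasticsearch class.
final class DeleteEntrySketch {
    final String repository;
    final long repositoryGeneration;
    final List<String> snapshots;

    DeleteEntrySketch(String repository, long repositoryGeneration, List<String> snapshots) {
        this.repository = repository;
        this.repositoryGeneration = repositoryGeneration;
        this.snapshots = List.copyOf(snapshots); // immutable copy, one entry holds many snapshots
    }
}

final class DeleteQueueSketch {
    // Returns the list of delete entries after a request to delete `toDelete` has been queued.
    static List<DeleteEntrySketch> enqueue(List<DeleteEntrySketch> current, String repository,
                                           long repositoryGeneration, List<String> toDelete) {
        List<DeleteEntrySketch> updated = new ArrayList<>(current);
        if (current.size() <= 1) {
            // No entry yet (assumption), or a single entry that is already running
            // against the repository: start a new entry rather than touching it.
            updated.add(new DeleteEntrySketch(repository, repositoryGeneration, toDelete));
        } else {
            // The second entry is known not to be processing yet, so batch into it.
            DeleteEntrySketch waiting = current.get(1);
            List<String> merged = new ArrayList<>(waiting.snapshots);
            for (String name : toDelete) {
                if (!merged.contains(name)) {
                    merged.add(name);
                }
            }
            updated.set(1, new DeleteEntrySketch(waiting.repository, waiting.repositoryGeneration, merged));
        }
        return updated;
    }
}
```

With this shape the batching is explicit: the entry at index 0 is never modified once it exists, and every later request is folded into the entry at index 1.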

Contributor


ok, makes sense

            snapshotEntry = findInProgressSnapshot(snapshots, snapshotName, repositoryName);
        }
    } else {
        snapshotEntry = null;
Contributor


Do we need to throw a Concurrent...Exception in the case where there is a SnapshotsInProgress.Entry but we did not look at aborting it?

Member Author


We do that check in deleteCompletedSnapshots, so we don't need it here.
Technically we could add it here as well, but now that we cache RepositoryData, duplicating it here wouldn't buy us much: we have to run the check again before putting the delete entry into the cluster state (CS) anyway, so all it would save is a no-op CS update task in the case of a concurrent snapshot. I figured I'd rather not duplicate the logic.
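
The idea of running the concurrency check in exactly one place, right before the delete entry is created, can be illustrated with a small Java sketch. Everything here is an assumption made for illustration: `ConcurrentCheckSketch`, `ensureNoneInProgress` and `deleteCompleted` are hypothetical names, snapshots are plain name strings, and `IllegalStateException` is only a placeholder for whatever exception the real code throws.

```java
import java.util.List;
import java.util.Set;

final class ConcurrentCheckSketch {
    // Fails if any snapshot that should be deleted is still running.
    // IllegalStateException is a placeholder, not the exception type used by the real code.
    static void ensureNoneInProgress(Set<String> inProgress, List<String> toDelete) {
        for (String name : toDelete) {
            if (inProgress.contains(name)) {
                throw new IllegalStateException("cannot delete snapshot [" + name + "] while it is still running");
            }
        }
    }

    // Hypothetical counterpart of the step that puts the delete entry into the cluster state.
    // The check lives here only, so callers do not need to repeat it beforehand.
    static List<String> deleteCompleted(Set<String> inProgress, List<String> toDelete) {
        ensureNoneInProgress(inProgress, toDelete);
        return List.copyOf(toDelete); // the snapshots that the new delete entry would carry
    }
}
```

A caller that skips its own pre-check loses nothing in correctness; at worst it submits an update that is rejected at this single point.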

} else {
    for (Map.Entry<String, SnapshotId> entry : allSnapshotIds.entrySet()) {
        if (Regex.simpleMatch(snapshotOrPattern, entry.getKey())) {
            foundSnapshots.add(entry.getValue());
Contributor


Do we need a set here? Isn't there a risk that we add the same item multiple times if there are overlapping wildcards?

Member Author


Right, my bad. Fixed, and I added an assertion and a test that would have tripped on this, so it is covered now.
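
The overlapping-wildcard problem and the set-based fix can be sketched in Java as follows. This is a simplified illustration, not the PR's code: `SnapshotPatternSketch`, `simpleMatch` and `resolve` are hypothetical names, snapshot IDs are plain strings, and the matcher below is a stand-in that only understands `*`; it is not Elasticsearch's `Regex.simpleMatch`.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.regex.Pattern;

final class SnapshotPatternSketch {
    // Simplified wildcard matching: '*' matches any run of characters, everything else is literal.
    static boolean simpleMatch(String pattern, String name) {
        String[] literals = pattern.split("\\*", -1);
        StringBuilder regex = new StringBuilder();
        for (int i = 0; i < literals.length; i++) {
            if (i > 0) {
                regex.append(".*");
            }
            regex.append(Pattern.quote(literals[i]));
        }
        return Pattern.matches(regex.toString(), name);
    }

    // Resolves names or patterns to snapshot IDs. Collecting into a Set means a snapshot
    // matched by several overlapping patterns is only recorded once.
    static Set<String> resolve(Map<String, String> allSnapshotIds, List<String> namesOrPatterns) {
        Set<String> found = new LinkedHashSet<>();
        for (String nameOrPattern : namesOrPatterns) {
            for (Map.Entry<String, String> entry : allSnapshotIds.entrySet()) {
                if (simpleMatch(nameOrPattern, entry.getKey())) {
                    found.add(entry.getValue());
                }
            }
        }
        return found;
    }
}
```

With a `List` in place of the `Set`, the patterns `snap-*` and `*-1` would both add the ID of a snapshot named `snap-1`, which is the duplication the review comment points out.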

@original-brownbear
Member Author

Thanks Yannick! All points addressed, I think.

Contributor

@ywelsch ywelsch left a comment


LGTM

@original-brownbear
Member Author

Thanks Yannick!

@original-brownbear original-brownbear merged commit 0da211f into elastic:master Apr 23, 2020
@original-brownbear original-brownbear deleted the multi-delete-snapshot-v2 branch April 23, 2020 10:41
original-brownbear added a commit to original-brownbear/elasticsearch that referenced this pull request May 2, 2020
Adds deleting multiple snapshots in one go without significantly changing the mechanics of snapshot deletes otherwise.
This change does not yet allow mixing snapshot delete and abort. Abort is still only allowed for a single snapshot delete by exact name.
original-brownbear added a commit to original-brownbear/elasticsearch that referenced this pull request May 3, 2020
Disabling BwC Tests so we can merge elastic#55474
original-brownbear added a commit that referenced this pull request May 3, 2020
* Allow Deleting Multiple Snapshots at Once (#55474)

Adds deleting multiple snapshots in one go without significantly changing the mechanics of snapshot deletes otherwise.
This change does not yet allow mixing snapshot delete and abort. Abort is still only allowed for a single snapshot delete by exact name.
russcam added a commit to russcam/elasticsearch that referenced this pull request May 29, 2020
Relates: elastic#55474

This commit updates the snapshot.delete.json REST API spec
to make snapshot a list type, now that it can accept a
list of comma-separated snapshot names
russcam added a commit that referenced this pull request Jun 2, 2020
Relates: #55474

This commit updates the snapshot.delete.json REST API spec
to make snapshot a list type, now that it can accept a
list of comma-separated snapshot names
@original-brownbear original-brownbear restored the multi-delete-snapshot-v2 branch August 6, 2020 19:08
4 participants