Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add snapshots pending deletion in cluster state to delete snapshot once index is deleted #79156

Open
wants to merge 56 commits into
base: main
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from 9 commits
Commits
Show all changes
56 commits
Select commit Hold shift + click to select a range
9a73ba7
Add snapshots pending deletion in cluster state to delete snapshot wh…
tlrx Oct 14, 2021
579c21f
rename to SnapshotDeletionsPending
tlrx Oct 18, 2021
d15ac48
MAX_PENDING_DELETIONS_SETTING
tlrx Oct 18, 2021
7f9c32c
list
tlrx Oct 18, 2021
88c0a0b
tostring
tlrx Oct 18, 2021
9151c28
remove if is empty
tlrx Oct 18, 2021
12835bb
remove if
tlrx Oct 18, 2021
7a6ec3f
Merge branch 'master' into delete-searchable-snapshot-with-pending-de…
tlrx Oct 18, 2021
7be719b
remove triggered
tlrx Oct 18, 2021
086889d
format conflicting files
tlrx Nov 9, 2021
26b255f
Merge branch 'master' into delete-searchable-snapshot-with-pending-de…
tlrx Nov 9, 2021
77dfe00
nits
tlrx Nov 9, 2021
ea065e9
feedback
tlrx Nov 10, 2021
ef6ca4c
Merge branch 'master' into delete-searchable-snapshot-with-pending-de…
tlrx Nov 10, 2021
91095b7
spotless
tlrx Nov 10, 2021
13a7f69
Merge branch 'master' into delete-searchable-snapshot-with-pending-de…
tlrx Nov 22, 2021
916689e
skip cluster state updates
tlrx Nov 24, 2021
858d4c1
add repo clean up test
tlrx Nov 25, 2021
34de0d0
also clean ups
tlrx Nov 30, 2021
54363a1
Merge branch 'master' into delete-searchable-snapshot-with-pending-de…
tlrx Nov 30, 2021
cf66c46
retries + expiration
tlrx Nov 30, 2021
5800d52
Merge branch 'master' into delete-searchable-snapshot-with-pending-de…
tlrx Nov 30, 2021
e181d4a
missing uuid
tlrx Nov 30, 2021
0411ed7
Merge branch 'master' into delete-searchable-snapshot-with-pending-de…
tlrx Dec 6, 2021
e83860a
Merge branch 'master' into delete-searchable-snapshot-with-pending-de…
tlrx Dec 9, 2021
03efe0b
Merge branch 'master' into delete-searchable-snapshot-with-pending-de…
tlrx Dec 9, 2021
a9384e2
Merge branch 'master' into delete-searchable-snapshot-with-pending-de…
tlrx Jan 12, 2022
3bd59a3
5000
tlrx Jan 12, 2022
4b9080a
order of fields
tlrx Jan 12, 2022
44552c5
Optional.or()
tlrx Jan 12, 2022
bd863e9
remove //TODO
tlrx Jan 12, 2022
8bb56d3
remove curly
tlrx Jan 12, 2022
e541a4c
currents
tlrx Jan 12, 2022
2188a70
no timeout
tlrx Jan 12, 2022
69608bf
pass null
tlrx Jan 12, 2022
fce4365
test assertions
tlrx Jan 12, 2022
df1ec28
clone test
tlrx Jan 13, 2022
2d8b32a
comment
tlrx Jan 13, 2022
025952e
randomInt()
tlrx Jan 13, 2022
ad62175
randomSubsetOf()
tlrx Jan 13, 2022
350ab86
add missing setting
tlrx Jan 13, 2022
d9d711e
add SnapshotDeletionsPendingExecutor
tlrx Jan 14, 2022
2c13687
Merge branch 'master' into delete-searchable-snapshot-with-pending-de…
tlrx Jan 14, 2022
39672ed
unbatched
tlrx Jan 14, 2022
db5fbd6
same error message
tlrx Jan 14, 2022
2b237b0
Merge branch 'master' into delete-searchable-snapshot-with-pending-de…
tlrx Jan 17, 2022
593cf3e
fixes
tlrx Jan 17, 2022
c40f7c1
remove obsolete doc
tlrx Jan 17, 2022
dd1ced9
rename method
tlrx Jan 17, 2022
8f7baa1
Update docs/changelog/79156.yaml
tlrx Feb 2, 2022
ca3e8f5
Merge branch 'master' into delete-searchable-snapshot-with-pending-de…
tlrx Mar 3, 2022
f13eaf6
batch cluster state updates that removes pending deletions
tlrx Mar 3, 2022
7fdbb90
nits
tlrx Mar 3, 2022
5dd6c23
Merge branch 'master' into delete-searchable-snapshot-with-pending-de…
tlrx Mar 14, 2022
82778ae
reset (wip)
tlrx Mar 14, 2022
fb54e4f
update branch
tlrx Mar 14, 2022
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
Expand Up @@ -64,6 +64,6 @@ protected void masterOperation(
ClusterState state,
final ActionListener<AcknowledgedResponse> listener
) {
snapshotsService.deleteSnapshots(request, listener.map(v -> AcknowledgedResponse.TRUE));
snapshotsService.deleteSnapshotsByName(request, listener.map(v -> AcknowledgedResponse.TRUE));
}
}
Original file line number Diff line number Diff line change
Expand Up @@ -125,6 +125,8 @@ public static List<Entry> getNamedWriteables() {
SnapshotDeletionsInProgress::readDiffFrom);
registerClusterCustom(entries, RepositoryCleanupInProgress.TYPE, RepositoryCleanupInProgress::new,
RepositoryCleanupInProgress::readDiffFrom);
registerClusterCustom(entries, SnapshotDeletionsPending.TYPE, SnapshotDeletionsPending::new,
SnapshotDeletionsPending::readDiffFrom);
// Metadata
registerMetadataCustom(entries, RepositoriesMetadata.TYPE, RepositoriesMetadata::new, RepositoriesMetadata::readDiffFrom);
registerMetadataCustom(entries, IngestMetadata.TYPE, IngestMetadata::new, IngestMetadata::readDiffFrom);
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -114,6 +114,15 @@ public boolean hasExecutingDeletion(String repository) {
return false;
}

/**
* Checks if the current {@link SnapshotDeletionsInProgress} contains the given {@link SnapshotId}
*
* @param snapshotId the snapshot id
*/
public boolean contains(SnapshotId snapshotId) {
return getEntries().stream().anyMatch(entry -> entry.getSnapshots().contains(snapshotId));
}

/**
* Returns {@code true} if there are snapshot deletions in progress in the cluster,
* returns {@code false} otherwise.
Expand Down
Original file line number Diff line number Diff line change
@@ -0,0 +1,270 @@
/*
* Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one
* or more contributor license agreements. Licensed under the Elastic License
* 2.0 and the Server Side Public License, v 1; you may not use this file except
* in compliance with, at your election, the Elastic License 2.0 or the Server
* Side Public License, v 1.
*/

package org.elasticsearch.cluster;

import org.elasticsearch.Version;
import org.elasticsearch.cluster.ClusterState.Custom;
import org.elasticsearch.common.io.stream.StreamInput;
import org.elasticsearch.common.io.stream.StreamOutput;
import org.elasticsearch.common.io.stream.Writeable;
import org.elasticsearch.common.settings.Setting;
import org.elasticsearch.common.settings.Settings;
import org.elasticsearch.snapshots.SnapshotId;
import org.elasticsearch.xcontent.ToXContentObject;
import org.elasticsearch.xcontent.XContentBuilder;

import java.io.IOException;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Objects;
import java.util.function.Consumer;

import static java.util.Collections.unmodifiableList;

/**
* Represents snapshots marked as to be deleted and pending deletion.
*
* Snapshots pending deletion are added to the cluster state when searchable snapshots indices with a specific setting are deleted (see
* MetadataDeleteIndexService#updateSnapshotDeletionsPending()). Because deleting snapshots requires a consistent view of the repository
* they belong to it is not possible to delete searchable snapshots indices and their backing snapshots in the same cluster state update.
*
* Hence we keep in cluster state the snapshot that should be deleted from repositories. To be able to delete them we capture the snapshot
* id, the snapshot name, the repository name and the repository id (if it exists) once, along with the time at which the snapshot was added
* to the pending deletion, in a {@link SnapshotDeletionsPending} entry.
*
*
* When cluster state is updated with such entries the {@link org.elasticsearch.snapshots.SnapshotsService} executes corresponding snapshot
* delete requests to effectively delete the snapshot from the repository. It is possible that the deletion of a snapshot failed for various
* reason (ex: conflicting snapshot operation, repository removed etc). In such cases the snapshot pending deletion is kept in the cluster
* state and the deletion will be retried on the next cluster state update. To avoid too many snapshots pending deletion stored in cluster
* state the number is limited to 500 and configurable through the {@link #MAX_PENDING_DELETIONS_SETTING} setting.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
* state the number is limited to 500 and configurable through the {@link #MAX_PENDING_DELETIONS_SETTING} setting.
* state the number is limited to 5000 and configurable through the {@link #MAX_PENDING_DELETIONS_SETTING} setting.

Also, if you do not mind updating the PR description to have the right number.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I updated the PR description and the doc in 3bd59a3

*/
public class SnapshotDeletionsPending extends AbstractNamedDiffable<Custom> implements Custom {

public static final SnapshotDeletionsPending EMPTY = new SnapshotDeletionsPending(List.of());
public static final String TYPE = "snapshot_deletions_pending";

/**
* Setting for the maximum number of snapshots pending deletion allowed in the cluster state.
henningandersen marked this conversation as resolved.
Show resolved Hide resolved
* <p>
* This setting is here to prevent the cluster to grow too large. In the case that the number of snapshots pending deletion exceeds
* the value of this setting the oldest entries are removed from the cluster state. Snapshots that are discarded are removed before
* they can be deleted from their repository and are therefore considered as "leaking" and should be logged as such as warnings.
*/
public static final Setting<Integer> MAX_PENDING_DELETIONS_SETTING = Setting.intSetting(
"cluster.snapshot.snapshot_deletions_pending.size",
500,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With many-shards/indices, 500 feels a bit on the low side. We have seen us run out of that for the indices graveyard. I think we could easily handle 5000 here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I increased this to 5000 as part of 77dfe00.

Setting.Property.NodeScope
);

/**
* A list of snapshots to delete, sorted by creation time
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A small precision improvement (time stamps may not be strictly in order).

Suggested change
* A list of snapshots to delete, sorted by creation time
* A list of snapshots to delete, in the order deletions were requests.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I pushed 77dfe00

*/
private final List<Entry> entries;

private SnapshotDeletionsPending(List<Entry> entries) {
this.entries = unmodifiableList(Objects.requireNonNull(entries));
}

public SnapshotDeletionsPending(StreamInput in) throws IOException {
this(in.readList(Entry::new));
}

@Override
public void writeTo(StreamOutput out) throws IOException {
out.writeList(entries);
}

@Override
public String getWriteableName() {
return TYPE;
}

public boolean isEmpty() {
return entries.isEmpty();
}

public boolean contains(SnapshotId snapshotId) {
return entries.stream().anyMatch(entry -> Objects.equals(entry.getSnapshotId(), snapshotId));
}

public List<Entry> entries() {
return entries;
}

@Override
public XContentBuilder toXContent(XContentBuilder builder, Params params) throws IOException {
builder.startArray(TYPE);
for (Entry entry : entries) {
entry.toXContent(builder, params);
}
builder.endArray();
return builder;
}

@Override
public Version getMinimalSupportedVersion() {
return Version.CURRENT.minimumCompatibilityVersion();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this should be a fixed version, i.e., Version.V_8_1_0 to avoid streaming this to 8.0 and 7.16?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, I pushed 77dfe00

}

public SnapshotDeletionsPending withRemovedSnapshots(List<SnapshotId> snapshotIds) {
if (snapshotIds == null || snapshotIds.isEmpty()) {
return this;
}
boolean changed = false;
final List<Entry> updatedEntries = new ArrayList<>();
for (Entry entry : entries) {
if (snapshotIds.contains(entry.snapshotId)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we should build the set of snapshotIds to make this lookup a hash-lookup rather than a linear scan? Just to avoid degenerate cases.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense, I changed this in 77dfe00.

changed = true;
continue;
}
updatedEntries.add(entry);
}
if (changed == false) {
return this;
} else if (updatedEntries.isEmpty()) {
return EMPTY;
} else {
return new SnapshotDeletionsPending(updatedEntries);
}
}

@Override
public String toString() {
final StringBuilder builder = new StringBuilder("SnapshotDeletionsPending[");
boolean prepend = true;
final Iterator<Entry> iterator = entries.stream().iterator();
while (iterator.hasNext()) {
if (prepend == false) {
builder.append(',');
}
builder.append(iterator.next());
prepend = false;
}
builder.append(']');
return builder.toString();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is nearly identical output to entries.toString(), except for a space. I wonder if we can use that here instead. OK to leave as is if you prefer this variant (but then I wonder if we need this as a utility).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I used this to add extra information when debugging, but it's not needed anymore. I changed to what you suggested as part of 77dfe00.

}

public static class Entry implements Writeable, ToXContentObject {

private final String repositoryName;
private final String repositoryUuid;
private final SnapshotId snapshotId;
private final long creationTime;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if indexDeletionTime was more appropriate?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, I renamed it to indexDeletionTime in 77dfe00.


public Entry(String repositoryName, String repositoryUuid, SnapshotId snapshotId, long creationTime) {
this.repositoryName = Objects.requireNonNull(repositoryName);
this.repositoryUuid = Objects.requireNonNull(repositoryUuid);
this.snapshotId = Objects.requireNonNull(snapshotId);
this.creationTime = creationTime;
}

private Entry(StreamInput in) throws IOException {
this.repositoryName = in.readString();
this.repositoryUuid = in.readString();
this.creationTime = in.readVLong();
this.snapshotId = new SnapshotId(in);
}

@Override
public void writeTo(StreamOutput out) throws IOException {
out.writeString(repositoryName);
out.writeString(repositoryUuid);
out.writeVLong(creationTime);
snapshotId.writeTo(out);
}

public String getRepositoryName() {
return repositoryName;
}

public String getRepositoryUuid() {
return repositoryUuid;
}

public SnapshotId getSnapshotId() {
return snapshotId;
}

public long getCreationTime() {
return creationTime;
}

@Override
public boolean equals(Object o) {
if (this == o) return true;
if (o == null || getClass() != o.getClass()) return false;
Entry entry = (Entry) o;
return creationTime == entry.creationTime
&& Objects.equals(repositoryName, entry.repositoryName)
&& Objects.equals(repositoryUuid, entry.repositoryUuid)
&& Objects.equals(snapshotId, entry.snapshotId);
}

@Override
public int hashCode() {
return Objects.hash(repositoryName, repositoryUuid, snapshotId, creationTime);
}

@Override
public XContentBuilder toXContent(XContentBuilder builder, Params params) throws IOException {
builder.startObject();
{
builder.field("repository_name", repositoryName);
builder.field("repository_uuid", repositoryUuid);
builder.timeField("creation_time_millis", "creation_time", creationTime);
builder.field("snapshot", snapshotId);
}
builder.endObject();
return builder;
}

@Override
public String toString() {
return '[' + repositoryName + '/' + repositoryUuid + ',' + snapshotId + ',' + creationTime + ']';
}
}

public static final class Builder {

private final List<Entry> entries;
private final Consumer<Entry> consumer;

public Builder(SnapshotDeletionsPending snapshotDeletionsPending, Consumer<Entry> onLimitExceeded) {
this.entries = new ArrayList<>(snapshotDeletionsPending.entries);
this.consumer = onLimitExceeded;
}

private void ensureLimit(final int maxPendingDeletions) {
while (entries.size() >= maxPendingDeletions) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we risk a deletion of 1000 searchable snapshot indices leaking 500 snapshots? I think users can do DELETE xyz* and this could lead to this? Or just deleting a big data stream. I wonder if we need to ensure that we have tried each pending deletion at least once or that each pending deletion has been around for a minimum amount of time?

Also ties in with the mechanism to defer deletions when the snapshot is being cloned or restored. We would want to have tried at least once, passing those checks, before removing it from this graveyard.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can see why we do not want an unlimited list here.

But would we want to fail deletion of indices perhaps if we do too many in one go too fast somehow?

final Entry removed = entries.remove(0);
if (consumer != null) {
consumer.accept(removed);
}
}
}

public Builder add(String repositoryName, String repositoryUuid, SnapshotId snapshotId, long creationTime) {
entries.add(new Entry(repositoryName, repositoryUuid, snapshotId, creationTime));
return this;
}

public SnapshotDeletionsPending build(Settings settings) {
final int maxPendingDeletions = MAX_PENDING_DELETIONS_SETTING.get(settings);
ensureLimit(maxPendingDeletions);
assert entries.size() <= maxPendingDeletions : entries.size() + " > " + maxPendingDeletions;
return entries.isEmpty() == false ? new SnapshotDeletionsPending(entries) : EMPTY;
}
}

public static NamedDiff<Custom> readDiffFrom(StreamInput in) throws IOException {
return readDiffFrom(Custom.class, TYPE, in);
}
}
Loading