
Delete backing snapshot when searchable snapshot index is deleted #75565

Closed

Conversation

tlrx
Member

@tlrx tlrx commented Jul 21, 2021

In #74977 we introduced a new index setting, index.store.snapshot.delete_searchable_snapshot, that can be set when mounting a snapshot as an index to indicate that the snapshot should be deleted once the searchable snapshot index is deleted.

That pull request added the index setting and the validations around it. This pull request now adds the logic to detect that a searchable snapshot index with this setting is being deleted, and triggers the deletion of the backing snapshot.

In order to do this, when a searchable snapshot index is deleted we check whether the setting index.store.snapshot.delete_searchable_snapshot is set. If the index being deleted is the last searchable snapshot index that uses the snapshot, the snapshot information is added to the RepositoryMetadata in the cluster state, in a list of "snapshots to delete". Once a snapshot is marked as "to delete" and appears in the repository metadata, it can no longer be cloned, mounted, or restored in the cluster.
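The "last index using the snapshot" check described above can be pictured with a small standalone sketch. All class and method names here are hypothetical, for illustration only; this is not the actual Elasticsearch implementation.

```java
// Illustrative, standalone sketch of the check described above. The class and
// method names are hypothetical -- this is not the actual Elasticsearch code.
import java.util.Map;
import java.util.Set;

public class PendingSnapshotDeletes {

    /**
     * Returns true if deleting {@code indexToDelete} should mark its backing
     * snapshot as "to delete": the index must have opted in via the
     * delete_searchable_snapshot setting, and no other mounted index may still
     * be using the same snapshot.
     */
    static boolean shouldMarkSnapshotForDeletion(
        String indexToDelete,
        Map<String, String> indexToSnapshotUuid, // mounted index -> backing snapshot UUID
        Set<String> indicesOptedIn               // indices mounted with the setting enabled
    ) {
        if (indicesOptedIn.contains(indexToDelete) == false) {
            return false; // setting not set: the snapshot must survive the index
        }
        final String snapshotUuid = indexToSnapshotUuid.get(indexToDelete);
        // only the last index using the snapshot may trigger its deletion
        return indexToSnapshotUuid.entrySet().stream()
            .noneMatch(e -> e.getKey().equals(indexToDelete) == false
                && e.getValue().equals(snapshotUuid));
    }

    public static void main(String[] args) {
        Map<String, String> mounts = Map.of("idx-a", "snap-1", "idx-b", "snap-1", "idx-c", "snap-2");
        Set<String> optedIn = Set.of("idx-a", "idx-c");
        // idx-a opted in, but idx-b still uses snap-1
        System.out.println(shouldMarkSnapshotForDeletion("idx-a", mounts, optedIn)); // false
        // idx-c is the last index using snap-2
        System.out.println(shouldMarkSnapshotForDeletion("idx-c", mounts, optedIn)); // true
    }
}
```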

Snapshots marked as "to delete" are deleted in the background by the SnapshotsService. On cluster state updates the SnapshotsService retrieves the list of snapshots to delete and triggers the deletion by executing an explicit snapshot delete request (I tried to sneak into the snapshot state machine to directly add the snapshots as SnapshotDeletionInProgress, but this requires a consistent view of the repository that is not available at the time the snapshots are marked as "to delete").

Deletions of snapshots marked as "to delete" are executed per repository, with some limitations to avoid hundreds of snapshots being deleted concurrently.
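That per-repository throttling can be sketched as follows. This is an illustrative sketch under assumed names (PerRepositoryThrottling, selectDeletions, maxConcurrent are all made up), not the real SnapshotsService logic:

```java
// Illustrative sketch of per-repository throttling: only a bounded number of
// pending snapshot deletions per repository is triggered in each round.
// Names are hypothetical; this is not the actual SnapshotsService code.
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class PerRepositoryThrottling {

    /** Select at most {@code maxConcurrent} pending snapshot deletions per repository. */
    static Map<String, List<String>> selectDeletions(
        Map<String, List<String>> pendingPerRepository,
        int maxConcurrent
    ) {
        return pendingPerRepository.entrySet().stream()
            .collect(Collectors.toMap(
                Map.Entry::getKey,
                e -> e.getValue().stream().limit(maxConcurrent).collect(Collectors.toList())
            ));
    }

    public static void main(String[] args) {
        Map<String, List<String>> pending = Map.of(
            "repo-1", List.of("snap-a", "snap-b", "snap-c"),
            "repo-2", List.of("snap-d")
        );
        // at most two deletions per repository are triggered in this round
        selectDeletions(pending, 2).forEach((repo, snaps) ->
            System.out.println(repo + " -> " + snaps));
    }
}
```

Snapshots not selected in one round stay in the pending list and are picked up on a later cluster state update.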

This PR is split into separate commits that should be reviewable on their own.

Next steps will be to make use of this setting in ILM so that it no longer requires a specific "delete searchable snapshot" action but instead relies on the mechanism implemented here to clean up snapshots, and finally to prevent the deletion of snapshots used by mounted indices (this will need BWC support).

@tlrx tlrx added the >enhancement, :Distributed Coordination/Snapshot/Restore, and v8.0.0 labels Jul 21, 2021
@elasticmachine elasticmachine added the Team:Distributed (Obsolete) label Jul 21, 2021
@elasticmachine
Collaborator

Pinging @elastic/es-distributed (Team:Distributed)

@tlrx tlrx added the v7.15.0 label Jul 21, 2021
@tlrx tlrx requested review from original-brownbear and DaveCTurner and removed request for original-brownbear July 21, 2021 10:01
@tlrx
Member Author

tlrx commented Aug 2, 2021

@elasticmachine run elasticsearch-ci/docs

@tlrx tlrx removed the v7.15.0 label Aug 2, 2021
@tlrx
Member Author

tlrx commented Aug 2, 2021

@original-brownbear @DaveCTurner I know this is a big pull request and you are busy with other things, but I'd appreciate your feedback once you find the time :)

Contributor

@DaveCTurner DaveCTurner left a comment

Thanks @tlrx, sorry it's taken a while to get around to this. I left a few comments that I hope will help to simplify it a bit.

Do you think we should hide the snapshots-to-be-deleted from APIs like get-snapshots and get-snapshot-status?

continue; // other index is backed by a different snapshot, skip
}
final String otherRepositoryName = repositoryNameFromIndexSettings(currentState, otherSettings);
if (Objects.equals(repositoryName, otherRepositoryName) == false) {
Contributor

I think this might break in some odd corner cases involving registering the same repository under multiple names (maybe without repository UUIDs). But do we need to check this? By this point we know the snapshot UUID matches, that should be enough to tell us not to delete it.

Member Author

I agree and I removed this check in e1995d7

boolean changed = false;
for (Map.Entry<String, Set<SnapshotId>> snapshotToDelete : snapshotsToDelete.entrySet()) {
RepositoryMetadata repository = repositories.repository(snapshotToDelete.getKey());
if (repository != null) {
Contributor

Can this be null?

Member Author

I don't think it can be null, and actually we don't need to re-read the repository metadata here, so I pushed e1995d7

private final Map<String, Integer> numberOfOnGoingSnapshotsToDeletes = new HashMap<>();

private volatile int maxConcurrentSnapshotsToDeletes;
private volatile TimeValue snapshotsToDeleteRetryInterval;
Contributor

A time-based retry feels wrong to me. We retry when concurrent operations block our delete attempt, which is ok, but the blocking operations are removed by a cluster state update so I think we should make some attempt to detect conflicting operations and retry when applying a cluster state update that appears to have no conflicts rather than by using a timer.

Member Author

Detecting conflicting situations and not triggering snapshot deletions sounds like a good idea, thanks. I refactored the deletion logic in 3e8dc65 to detect conflicts and do nothing, relying on subsequent cluster state updates to allow and trigger snapshot deletions.

Still, I went back and forth on the time-based retry. I agree this should not be the main path, but I think it will be useful in some situations, like snapshot deletions failing because of an expired repository or a network issue when accessing the repository. I made it an infinite retry with a 30s delay, but we could imagine some backoff policy or just a maximum number of retries before giving up.
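The trade-off between a fixed delay and a backoff policy can be sketched side by side. This is a standalone illustration with made-up names and values, not code from this PR:

```java
// Standalone sketch comparing the fixed 30s retry delay mentioned above with a
// capped exponential backoff alternative; names and values are illustrative.
public class RetryDelays {

    /** Fixed approach: retry forever, every 30 seconds. */
    static long fixedDelayMillis(int attempt) {
        return 30_000L;
    }

    /** Alternative: exponential backoff starting at 1s, capped at 5 minutes. */
    static long backoffDelayMillis(int attempt) {
        long delay = 1_000L << Math.min(attempt, 20); // clamp the shift to avoid overflow
        return Math.min(delay, 300_000L);
    }

    public static void main(String[] args) {
        for (int attempt = 0; attempt <= 10; attempt++) {
            System.out.printf("attempt %2d: fixed=%6dms backoff=%6dms%n",
                attempt, fixedDelayMillis(attempt), backoffDelayMillis(attempt));
        }
    }
}
```

With the backoff variant, transient failures (e.g. a brief network issue) retry quickly, while a persistently broken repository only costs one attempt every five minutes instead of every 30 seconds.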

*/
private final Map<String, Integer> numberOfOnGoingSnapshotsToDeletes = new HashMap<>();

private volatile int maxConcurrentSnapshotsToDeletes;
Contributor

We basically coalesce all snapshot deletions on a repository into a single operation (see e.g. the reusedExistingDelete flag) so I think it would be preferable, and simpler, not to have this kind of concurrency.

Member Author

@tlrx tlrx Sep 6, 2021

I agree, thanks for reminding me of this. I've refactored and removed this "concurrency".

try {
logger.debug("[{}] triggering deletion of snapshot [{}]", repository, snapshotId);
final PlainActionFuture<Void> future = PlainActionFuture.newFuture();
deleteSnapshots(new DeleteSnapshotRequest(repository, snapshotId.getName()), future);
Contributor

We're relying on the name of the snapshot here, but I don't think we protect against anyone deleting the snapshot manually and then creating a new one with the same name while this potentially-delayed action is in flight, so we might end up deleting the wrong snapshot here.

Member Author

Now that #76079 is merged I find this easier to implement, and I pushed 4df53d7.

logger.debug("[{}] triggering deletion of snapshot [{}]", repository, snapshotId);
final PlainActionFuture<Void> future = PlainActionFuture.newFuture();
deleteSnapshots(new DeleteSnapshotRequest(repository, snapshotId.getName()), future);
future.actionGet();
Contributor

I think we don't need to block a thread to wait for completion here, we should be able to handle the response async.

Member Author

++ I reworked this.
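For readers unfamiliar with the blocking-versus-async distinction raised here, a minimal sketch using plain CompletableFuture as a stand-in for Elasticsearch's PlainActionFuture and ActionListener (all names in this sketch are illustrative):

```java
// Minimal sketch of the blocking-vs-async distinction discussed above, using
// plain CompletableFuture as a stand-in for Elasticsearch's PlainActionFuture
// and ActionListener; all names here are illustrative.
import java.util.concurrent.CompletableFuture;

public class AsyncDeleteSketch {

    /** Stand-in for the asynchronous snapshot delete call. */
    static CompletableFuture<Void> deleteSnapshotAsync(String repository, String snapshot) {
        return CompletableFuture.runAsync(
            () -> System.out.printf("[%s] deleted snapshot [%s]%n", repository, snapshot));
    }

    /** Blocking style: ties up the calling thread, like future.actionGet(). */
    static void deleteBlocking(String repository, String snapshot) {
        deleteSnapshotAsync(repository, snapshot).join();
    }

    /** Async style: register a completion callback and return immediately. */
    static void deleteAsync(String repository, String snapshot) {
        deleteSnapshotAsync(repository, snapshot).whenComplete((ignored, e) -> {
            if (e != null) {
                System.err.println("delete failed: " + e);
            }
        });
    }

    public static void main(String[] args) {
        deleteBlocking("repo", "snap-1");
        deleteAsync("repo", "snap-2");
    }
}
```

The async style matters in this context because the delete can be queued behind other repository operations, and holding a thread for the whole wait would be wasteful.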

/**
* List of {@link org.elasticsearch.snapshots.SnapshotId} marked as "to delete"
*/
private final List<SnapshotId> snapshotsToDelete;
Contributor

I just realised that if we track the pending deletes here then they'll get lost if someone does a put or delete on this repo, maybe just to adjust its settings or rename it or something. Maybe we should use a separate Metadata.Custom instead.

Member Author

@tlrx tlrx Aug 3, 2021

I thought I took care of propagating the pending deletes in case of a put on the repository, but actually you're right, there's a missing bit I think (see canUpdateInPlace(newRepositoryMetadata, existing) in RepositoriesService).

Putting the pending deletes in a separate Custom also makes sense, but we have to decide how to match the repository when it's re-created (by UUID?) and how long the pending deletes should be kept in the cluster state. I think it would be easier to keep the pending deletes where they are today and prevent the deletion of a repository if it still has pending deletes (like we prevent repo deletion if there are ongoing operations).

Contributor

++ there's no easy answer. Blocking the removal of the repository would worry me a bit: if we introduced some awful bug that made deletes fail then we could be retrying forever without a way to fix it. I guess we'd also want to prevent marking a repo as readonly for similar reasons. Maybe we'd need a ?force parameter on the delete repo API as an escape hatch, just in case.

Member Author

I refactored things a bit and I added some specific checks for read-only repositories so that we don't trigger snapshot deletions on such repositories. Pending deletes are still added to the repository metadata so that they'll be processed once the repository is writeable again.

Member Author

Also, I did not implement the logic to prevent deleting a repository (or making it read-only) when it has pending deletes. I'd like to discuss and address this as a follow-up, if that's OK with you.

@tlrx
Member Author

tlrx commented Sep 6, 2021

@DaveCTurner Thanks for your feedback, and sorry for the time it took me to apply your comments to the code. I've reimplemented the deletion logic, which should be better now but maybe not yet bulletproof.

Do you think we should hide the snapshots-to-be-deleted from APIs like get-snapshots and get-snapshot-status?

I'm tempted not to hide the pending snapshot deletions in these APIs so that they reflect exactly what exists in the repository. Maybe we could add a runtime flag to the response to indicate pending deletions.

@tlrx tlrx requested a review from DaveCTurner September 6, 2021 15:46
@tlrx
Member Author

tlrx commented Sep 14, 2021

ping @DaveCTurner - sorry for the long-standing PR, but if you have the time to take another look I'd be grateful.

Contributor

@henningandersen henningandersen left a comment

I have a number of minor comments, but the main ones are around the identity of the deletion and where to store it.
I wish we could rely on repository uuids, but I suppose we cannot due to bwc?

.custom(RepositoriesMetadata.TYPE, RepositoriesMetadata.EMPTY)
.repository(repositoryName);
if (repositoryMetadata != null && repositoryMetadata.snapshotsToDelete().contains(sourceSnapshotId)) {
throw new ConcurrentSnapshotExecutionException(
Contributor

++, I wonder if we want similar protection in TransportMountSearchableSnapshotAction?

@@ -217,7 +217,7 @@ public ClusterState execute(ClusterState currentState) {
updatedMetadata = repositoryMetadata.withSettings(newRepositoryMetadata.settings());
} else {
ensureRepositoryNotInUse(currentState, request.name());
updatedMetadata = newRepositoryMetadata;
updatedMetadata = newRepositoryMetadata.withSnapshotsToDelete(repositoryMetadata.snapshotsToDelete());
Contributor

This could update the repo reference to point to a completely different repo. I wonder if we should not register the repository UUID together with the snapshot to delete, and then remove the snapshots to delete if the UUID does not match when it is assigned later?
Also relates to David's comment in RepositoryMetadata; keeping the list of snapshots to delete separate from the repo registration sort of makes sense, I think.

Comment on lines +427 to +443
final String suffix = getTestName().toLowerCase(Locale.ROOT);
final String repository = "repository-" + suffix;
createRepository(repository, FsRepository.TYPE, randomRepositorySettings());

final String index = "index-" + suffix;
createAndPopulateIndex(index, Settings.builder().put(INDEX_SOFT_DELETES_SETTING.getKey(), true));

final TotalHits totalHits = internalCluster().client().prepareSearch(index).setTrackTotalHits(true).get().getHits().getTotalHits();

final String snapshot = "snapshot-" + suffix;
createSnapshot(repository, snapshot, List.of(index));
assertAcked(client().admin().indices().prepareDelete(index));

final String mounted = mountSnapshot(repository, snapshot, index, deleteSnapshotIndexSettings(true));
assertHitCount(client().prepareSearch(mounted).setTrackTotalHits(true).get(), totalHits.value);
assertAcked(client().admin().indices().prepareDelete(mounted));

Contributor

Looks like this first part is the same across several tests, perhaps we can refactor out a setup method?

return repository.name();
}
}
}
Contributor

I wonder if we should return null in case there is a repo-uuid on the index but no such repo was found? This works differently from RepositoriesService.indexSettingsMatchRepositoryMetadata

if (snapshotsToDelete.isEmpty() == false) {
RepositoriesMetadata repositories = currentState.metadata().custom(RepositoriesMetadata.TYPE, RepositoriesMetadata.EMPTY);
for (Map.Entry<String, Set<SnapshotId>> snapshotToDelete : snapshotsToDelete.entrySet()) {
repositories = repositories.addSnapshotsToDelete(snapshotToDelete.getKey(), snapshotToDelete.getValue());
Contributor

I think we risk getting an IllegalArgumentException from RepositoriesMetadata.withUpdate when the repo has been deleted while deleting the index.

*/
private void deleteSnapshots(
final Function<SnapshotId, String> mapping,
final DeleteSnapshotRequest request,
Contributor

I would prefer to let this method have the raw parameters rather than use the request, since in the uuid case it is sort of bending the original purpose of DeleteSnapshotRequest.

),
e
);
} else if (e instanceof RepositoryMissingException) {
Contributor

I wonder if we want to retry when/if the repo reappears? Ties in with the discussion on where the list of snapshots to delete lives.

snapshotId
);
} else {
logger.warn(
Contributor

I am semi-torn on this. We log at warn, but retry every 30s. If we think we can recover via retries, perhaps we should be more silent (debug) about it?

logger.trace("snapshot to delete [{}] is being cloned, waiting for operation to complete", snapshotId);
} else if (currentDeletions.contains(snapshotId)) {
logger.trace("snapshot to delete [{}] is already queued", snapshotId);
} else if (onGoingSnapshotsDeletions.add(snapshotId)) {
Contributor

nit: ongoingSnapshotDeletions?

Comment on lines +207 to +208
assertThat(updatedRepos.repository("repo_name").snapshotsToDelete(), hasSize(1));
assertThat(updatedRepos.repository("repo_name").snapshotsToDelete(), hasItem(new SnapshotId("snap_name", "snap_uuid")));
Contributor

Can this be:

assertThat(updatedRepos.repository("repo_name").snapshotsToDelete(), equalTo(List.of(new SnapshotId("snap_name", "snap_uuid"))));

@tlrx
Member Author

tlrx commented Oct 15, 2021

Thanks @DaveCTurner and @henningandersen. I've opened #79156 to see how it looks when snapshots pending deletion are stored as a dedicated custom in the cluster state. With Henning's suggestion to limit the number of pending deletions to keep around, I think it makes more sense and it's easier to deal with situations where the repository is gone.

@arteam arteam added v8.1.0 and removed v8.0.0 labels Jan 12, 2022
@mark-vieira mark-vieira added v8.2.0 and removed v8.1.0 labels Feb 2, 2022
@elasticsearchmachine
Collaborator

Hi @tlrx, I've created a changelog YAML for you.

@elasticsearchmachine elasticsearchmachine changed the base branch from master to main July 22, 2022 23:10
@mark-vieira mark-vieira added v8.5.0 and removed v8.4.0 labels Jul 27, 2022
@tlrx
Member Author

tlrx commented Aug 3, 2022

Closing this one; a better implementation is #79156.

@tlrx tlrx closed this Aug 3, 2022
Labels: :Distributed Coordination/Snapshot/Restore, >enhancement, Team:Distributed (Obsolete), v8.5.0
9 participants