
Make snapshot deletion faster #61513

Closed
piyushdaftary opened this issue Aug 25, 2020 · 4 comments · Fixed by #100316
Assignees
original-brownbear

Labels
`>bug`, `:Distributed Coordination/Snapshot/Restore` (anything directly related to the `_snapshot/*` APIs), `Team:Distributed (Obsolete)` (meta label for the distributed team, obsolete; replaced by Distributed Indexing/Coordination)

Comments

@piyushdaftary
Contributor

piyushdaftary commented Aug 25, 2020

Elasticsearch version (bin/elasticsearch --version): 7.6 onwards

JVM version (java -version): Java 14

OS version (uname -a if on a Unix-like system): CentOS

In Elasticsearch, snapshot deletion is a multithreaded, synchronous master-node operation. The sequence of the delete operation is as follows:

1. The master node receives the snapshot deletion request and registers a listener for snapshot deletion.
2. If the snapshot for which the delete request was received is in progress, stop the shard snapshots.
3. Update the cluster state with a snapshot deletion entry (SnapshotDeletionsInProgress).
4. Fetch the snapshot entry from the S3 repository.
5. Get the list of all child folders under the repository's indices folder path.
6. Update the shard state metadata in the repository for all shards of the snapshot being deleted, and compute the shards to be deleted from the repository.
7. Remove the snapshot from the list of existing snapshots in the repository.
8. Update the new generations:
    1. Update the index shard generations of all updated shard folders to the next generation in the repository.
    2. Write the new generations to index-N in the repository and make index.latest point to this new index-N.
    3. Update the new generations in the cluster state.
9. Run the cleanup operation:
    1. Delete the repository root-level snap-UUID and meta-UUID files of the snapshot being deleted.
    2. Delete stale indices folders from the repository that are not referenced by any snapshot.
    3. List the shard-level files to be deleted from the repository.
    4. Remove the unreferenced shard-level files from the repository (using bulk deletes).
10. Update the cluster state by removing the snapshot entry and call back the listeners.
11. The listeners respond back to the user, indicating that snapshot deletion is complete.

The current implementation of step 9.2 (delete stale indices), cleanupStaleIndices(), is very slow: the snapshot deletion code deletes each stale index from the repository one after another, synchronously. A simplified sketch of this sequential behaviour is shown below.
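
The following is a minimal, self-contained sketch of that sequential behaviour, not the actual Elasticsearch source; the `StaleIndexFolder` interface and its `delete()` method are hypothetical stand-ins for the repository's blob containers:

```java
import java.util.List;

// Simplified model of the current behaviour: each stale index folder is
// deleted on the calling thread, one after another.
public class SequentialStaleIndexCleanup {

    /** Hypothetical stand-in for a blob-store folder holding one stale index. */
    interface StaleIndexFolder {
        long delete(); // blocking delete; returns the number of blobs removed
    }

    static long cleanupStaleIndices(List<StaleIndexFolder> staleIndices) {
        long blobsDeleted = 0;
        for (StaleIndexFolder folder : staleIndices) {
            // Each delete is a blocking round trip to the repository (e.g. an S3 bulk delete),
            // so the total time grows linearly with the number of stale indices.
            blobsDeleted += folder.delete();
        }
        return blobsDeleted;
    }
}
```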

With the current implementation, we measured the time taken to delete a snapshot on a cluster with 3 master nodes of type r5.12xlarge and 50 data nodes of type i3.4xlarge, holding 1601 indices, 8001 shards, and 4.8 TB of data. It takes approximately 31 minutes to delete such a snapshot.

| | Shards # | Indices # | Snapshot Creation Time (avg) | Snapshot Deletion Time (avg) |
|---|---|---|---|---|
| Current implementation | 8001 | 1601 | 8.4 min | 31.1 min |

Current flow diagram of cleanupStaleIndices:

[Flow diagram: Existing_Flow_Master_Delete1]

This stale-index cleanup step can be sped up with either of the following approaches:

Suggested Optimizations

Approach 1:

Instead of making the deletion of stale indices a single-threaded operation, make it multithreaded and delete multiple stale indices in parallel using the SNAPSHOT thread pool's workers. When deletion of all the stale indices is complete, return the DeleteResult as the response of cleanupStaleIndices(). A sketch of this approach is shown below.
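
A minimal sketch of Approach 1, assuming the same hypothetical `StaleIndexFolder` stand-in as above and a plain `ExecutorService` in place of the SNAPSHOT threadpool:

```java
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.atomic.AtomicLong;

// Fan the per-index deletes out over an executor standing in for the SNAPSHOT
// threadpool, then wait until every delete has finished before returning the
// aggregated result to the caller.
public class ParallelStaleIndexCleanup {

    /** Hypothetical stand-in for a blob-store folder holding one stale index. */
    interface StaleIndexFolder {
        long delete(); // blocking delete; returns the number of blobs removed
    }

    static long cleanupStaleIndices(List<StaleIndexFolder> staleIndices, ExecutorService snapshotPool)
            throws InterruptedException {
        AtomicLong blobsDeleted = new AtomicLong();
        CountDownLatch done = new CountDownLatch(staleIndices.size());
        for (StaleIndexFolder folder : staleIndices) {
            snapshotPool.execute(() -> {
                try {
                    blobsDeleted.addAndGet(folder.delete()); // deletes now run concurrently
                } finally {
                    done.countDown(); // count down even if a delete throws
                }
            });
        }
        done.await(); // cleanupStaleIndices() still returns only once all deletes are finished
        return blobsDeleted.get();
    }
}
```

In the real code the completion would more likely be signalled through listeners rather than by blocking, but the essential change is the same: the per-index deletes run on the SNAPSHOT pool's workers instead of a single thread.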

With Approach 1, the time taken to delete a snapshot on a similar cluster (3 masters of type r5.12xlarge, 50 data nodes of type i3.4xlarge, 1601 indices, 8001 shards, 4.8 TB of data) drops to approximately 9.8 minutes.

I monitored system resource utilization (CPU, system memory) on the master node; there was no major change in resource utilization with Approach 1 compared to the current implementation.

Approach 1 optimization vs. current implementation comparison:

| | Shards # | Indices # | Snapshot Creation Time (avg) | Snapshot Deletion Time (avg) |
|---|---|---|---|---|
| Approach 1 optimization | 8001 | 1601 | 6.9 min | 9.86 min |
| Current implementation | 8001 | 1601 | 6.8 min | 31.1 min |

Approach 1 optimization flow diagram of cleanupStaleIndices:

[Flow diagram: Existing_Flow_Master_Delete-Approach-1 (1)]

Approach 2:

Instead of deleting stale indices synchronously, make the method cleanupStaleIndices() fully asynchronous: when the method is invoked to delete the list of stale indices, send back the response immediately and perform the deletion of stale indices in the background (using SNAPSHOT thread pool workers). A sketch of this asynchronous variant is shown after the comparison below.

With Approach 2, the time taken to delete a snapshot on a similar cluster (3 masters of type r5.12xlarge, 50 data nodes of type i3.4xlarge, 1601 indices, 8001 shards, 4.8 TB of data) drops to approximately 8 seconds.

| | Shards # | Indices # | Snapshot Creation Time (avg) | Snapshot Deletion Time (avg) |
|---|---|---|---|---|
| Approach 2 optimization | 8001 | 1601 | 6.8 min | 8 sec |
| Current implementation | 8001 | 1601 | 8.4 min | 31.1 min |

In Approach 2, if the deletion of stale indices fails because of a master node failure, these stale indices will be deleted in the next snapshot deletion, since they are stale and not referenced by any snapshot.

To track the progress of the background stale-index cleanup, a new status could be added to the cluster state (I am open to suggestions on how to track this progress).
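
A minimal sketch of Approach 2, again using the hypothetical `StaleIndexFolder` stand-in and a plain `ExecutorService` in place of the SNAPSHOT threadpool:

```java
import java.util.List;
import java.util.concurrent.ExecutorService;

// Hand the per-index deletes to the SNAPSHOT threadpool and return immediately,
// so the snapshot-delete request can be acknowledged while the stale-index
// cleanup continues in the background.
public class AsyncStaleIndexCleanup {

    /** Hypothetical stand-in for a blob-store folder holding one stale index. */
    interface StaleIndexFolder {
        String indexId();
        void delete(); // blocking delete of the folder's blobs
    }

    static void cleanupStaleIndicesAsync(List<StaleIndexFolder> staleIndices, ExecutorService snapshotPool) {
        for (StaleIndexFolder folder : staleIndices) {
            snapshotPool.execute(() -> {
                try {
                    folder.delete();
                } catch (RuntimeException e) {
                    // Best effort: a folder that fails to delete here (e.g. on master failover)
                    // is still stale and will be picked up by the next snapshot deletion.
                }
            });
        }
        // No waiting here: the caller acknowledges the snapshot deletion right away.
    }
}
```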

**Approach 2 optimization flow diagram of cleanupStaleIndices:**

[Flow diagram: Existing_Flow_Master_Delete-Approach-2 (2)]

I would like feedback from the community on the above two snapshot deletion optimization approaches before raising a PR.

@piyushdaftary piyushdaftary added >bug needs:triage Requires assignment of a team area label labels Aug 25, 2020
@original-brownbear original-brownbear self-assigned this Aug 25, 2020
@original-brownbear original-brownbear added the :Distributed Coordination/Snapshot/Restore Anything directly related to the `_snapshot/*` APIs label Aug 25, 2020
@elasticmachine
Collaborator

Pinging @elastic/es-distributed (:Distributed/Snapshot/Restore)

@elasticmachine elasticmachine added the Team:Distributed (Obsolete) Meta label for distributed team (obsolete). Replaced by Distributed Indexing/Coordination. label Aug 25, 2020
@original-brownbear
Member

Thanks for raising this @piyushdaftary. The fact that we only execute this single-threaded really isn't all that optimal, I agree, and it can lead to some very suboptimal situations, as in your example.
I would also agree that parallelising this across indices on the SNAPSHOT pool is a reasonable fix. We should go with option 1 here and not push the execution into the background (I in fact suggested doing that in the past, and we eventually rejected the idea for various user-experience reasons). If you want, feel free to give option 1 a go :) If not, I'm happy to implement this quickly next week as well; it should be a fairly simple change, I think.

@original-brownbear original-brownbear removed the needs:triage Requires assignment of a team area label label Aug 25, 2020
@piyushdaftary
Contributor Author

Thanks @original-brownbear. I will raise a PR implementing Approach 1.

piyushdaftary added a commit to piyushdaftary/elasticsearch that referenced this issue Nov 3, 2020
AmiStrn added a commit to AmiStrn/elasticsearch that referenced this issue Feb 18, 2021
The delete snapshot task takes longer than expected. A major reason for this is that the (often many) stale indices are deleted iteratively. In this commit we change the deletion to be concurrent using the SNAPSHOT threadpool. Notice that, in order to avoid putting too many delete tasks on the threadpool queue, a methodology similar to `executeOneFileSnapshot()` was used, so that the threadpool can still serve other tasks without too much of a delay.

fixes issue elastic#61513 from Elasticsearch project
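
The throttling idea mentioned in the commit message above can be illustrated with a small, hypothetical sketch (the names and the helper interface are not from the Elasticsearch source): rather than enqueueing one task per stale index, only a bounded number of worker tasks are submitted to the pool, and each worker drains a shared queue.

```java
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;

// Submit only a bounded number of worker tasks; each worker keeps pulling the
// next stale index from a shared queue, leaving room on the pool for other
// snapshot work.
public class ThrottledStaleIndexCleanup {

    /** Hypothetical stand-in for a blob-store folder holding one stale index. */
    interface StaleIndexFolder {
        void delete();
    }

    static void cleanupStaleIndices(ConcurrentLinkedQueue<StaleIndexFolder> staleIndices,
                                    ExecutorService snapshotPool,
                                    int maxConcurrentDeletes) throws InterruptedException {
        CountDownLatch done = new CountDownLatch(maxConcurrentDeletes);
        for (int i = 0; i < maxConcurrentDeletes; i++) {
            snapshotPool.execute(() -> {
                try {
                    StaleIndexFolder folder;
                    while ((folder = staleIndices.poll()) != null) { // each worker drains the queue
                        folder.delete();
                    }
                } finally {
                    done.countDown();
                }
            });
        }
        done.await(); // all workers have finished draining the queue
    }
}
```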
@ls-ivan-kiselev

Ugh, I suffer from this slowness so much right now, thanks for raising it!

I have a snapshot repo to clean up with 3 years of snapshots taken every 2 hours, and so far the cleanup goes at a rate of about 2 snapshots a day.

DaveCTurner added a commit to DaveCTurner/elasticsearch that referenced this issue Oct 5, 2023
After deleting a snapshot today we clean up all the now-dangling indices
sequentially, which can be rather slow. With this commit we parallelize
the work across the whole `SNAPSHOT` pool on the master node.

Closes elastic#61513

Co-authored-by: Piyush Daftary <[email protected]>