Add support for pause/unpause operation in snapshot repository #48493
Comments
Pinging @elastic/es-distributed (:Distributed/Snapshot/Restore) |
Pinging @elastic/es-core-features (:Core/Features/ILM+SLM) |
I would hope that we can implement this in a way that doesn't require blocking all snapshot operations while the key rotation is happening. We are currently looking into allowing parallel operations on the repository (i.e. allowing delete and create to run at the same time), and I'd hope we can fold this into that effort. With the recent work in #46250 and the upcoming work building on top of it, we should be able to run key rotation in parallel with other operations by rewriting the data shard by shard and locking the repo only for a brief period when updating the shard-level metadata.
While a
I think we will have to be smart about this anyway, since we probably want to distribute the key-rotation step: running it on a single (master) node means a lot of encryption and data transfer in and out of that one node. Since we're already looking into parallelising other operations, adding parallelism to this operation as part of that work seems like the way to go here. |
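To make the shard-by-shard idea concrete, here is a minimal sketch, not the actual Elasticsearch implementation. The `Repo` class, the per-shard `generation` counter, and the `reencrypt` callback are all hypothetical names introduced for illustration. The slow re-encryption runs outside the lock; the lock is held only to read a shard's metadata and to swap in the rewritten copy, and a shard is retried if a concurrent operation changed it in between.

```python
import threading
from typing import Callable, Dict, List


class Repo:
    """Hypothetical in-memory stand-in for a snapshot repository."""

    def __init__(self, shards: Dict[str, List[bytes]]):
        self.shards = shards                          # shard id -> metadata blobs
        self.generation = {s: 0 for s in shards}      # bumped on every metadata update
        self.lock = threading.Lock()                  # guards shard-level metadata


def rotate_shard_by_shard(repo: Repo, reencrypt: Callable[[bytes], bytes]) -> None:
    for shard_id in list(repo.shards):
        while True:
            # Brief lock: copy the current metadata and remember its generation.
            with repo.lock:
                seen_generation = repo.generation[shard_id]
                blobs = list(repo.shards[shard_id])

            # Expensive work happens without holding the lock.
            rewritten = [reencrypt(blob) for blob in blobs]

            # Brief lock: publish only if nothing changed in the meantime.
            with repo.lock:
                if repo.generation[shard_id] == seen_generation:
                    repo.shards[shard_id] = rewritten
                    repo.generation[shard_id] += 1
                    break
            # A concurrent operation touched this shard; redo the work for it.
```

The retry loop trades some redundant work for keeping the repository usable while rotation runs. Distributing the loop across nodes is left out of this sketch.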
Thank you @original-brownbear for your comments.
Yes, if we can do it then that would be the preferred way.
I think it is desirable to do the key rotation in parallel and if we can do it then awesome. |
right ... my bad :) Actually now I remember my original idea for this as well from the Google Doc (thanks for refreshing my memory :))
True, but I'd say that's something we can optimize for later. Key rotation is a relatively rare event, and how efficient it is in terms of API call counts might not matter too much. I think it's likely the right trade-off to live with some redundant metadata updates and keep the repository fully functional during key rotation, rather than blocking the repository for potentially hours to save a trivial amount of work. (You could also argue that if you pause deletes to prevent concurrent deletes, you waste the effort of updating meta-blobs for blobs that you'll delete right after the pause anyway.) I think the steps in the Google Doc and in this comment are still a valid approach here. |
Thank you @original-brownbear for your input. I am closing this issue as I do not see a need for this API anymore. Thank you. |
Currently, we do not have a way to pause/unpause a snapshot repository. Such an operation would help in scenarios where we want to explicitly disable a snapshot repository for some time and prevent users from initiating operations like snapshot/restore/delete/repository-cleanup.
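As a rough illustration of the behaviour being asked for, a pause/unpause gate could look like the sketch below. This is not an existing Elasticsearch API; `PausableRepository`, `RepositoryPausedError`, and the operation names are hypothetical.

```python
class RepositoryPausedError(RuntimeError):
    """Raised when an operation is attempted on a paused repository."""


class PausableRepository:
    # Operations that a paused repository should reject (assumed set).
    BLOCKED = frozenset({"snapshot", "restore", "delete", "cleanup"})

    def __init__(self, name: str):
        self.name = name
        self._paused = False

    def pause(self) -> None:
        self._paused = True

    def unpause(self) -> None:
        self._paused = False

    def check(self, operation: str) -> None:
        """Call before starting an operation; raises if the repository is paused."""
        if self._paused and operation in self.BLOCKED:
            raise RepositoryPausedError(
                f"repository [{self.name}] is paused, cannot run [{operation}]"
            )
```

Each snapshot-related code path would call `check(...)` first, so a paused repository rejects new operations instead of queueing them.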
We are working on adding client-side encrypted snapshots in #41910. As part of the key rotation step, the master key will be updated, and then for every snapshot in that repository we need to decrypt the old metadata (related to encrypted blobs) with the old key, re-encrypt it with the new key, and update the store.
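The decrypt-then-re-encrypt step can be sketched as follows. This assumes, purely for illustration, that each metadata entry is a small secret (for example a per-blob data key) wrapped with the master key, and that the store is a plain dictionary. The `wrap`/`unwrap` helpers use a toy HMAC-based keystream so the example stays standard-library only; they are NOT secure and stand in for a real authenticated cipher.

```python
import hashlib
import hmac
import os
from typing import Dict


def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    # Toy keystream (HMAC-SHA256 in counter mode). Illustration only, not secure.
    out = b""
    counter = 0
    while len(out) < length:
        out += hmac.new(key, nonce + counter.to_bytes(4, "big"), hashlib.sha256).digest()
        counter += 1
    return out[:length]


def wrap(master_key: bytes, plaintext: bytes) -> bytes:
    """Encrypt a metadata entry under the given master key."""
    nonce = os.urandom(16)
    stream = _keystream(master_key, nonce, len(plaintext))
    return nonce + bytes(a ^ b for a, b in zip(plaintext, stream))


def unwrap(master_key: bytes, wrapped: bytes) -> bytes:
    """Decrypt a metadata entry that was wrapped with the given master key."""
    nonce, body = wrapped[:16], wrapped[16:]
    stream = _keystream(master_key, nonce, len(body))
    return bytes(a ^ b for a, b in zip(body, stream))


def rotate_metadata(store: Dict[str, bytes], old_key: bytes, new_key: bytes) -> None:
    # Decrypt each entry with the old key, re-encrypt with the new key, update the store.
    for name in list(store):
        store[name] = wrap(new_key, unwrap(old_key, store[name]))
```

Only the metadata is rewritten here; under this assumption the encrypted blobs themselves are untouched, which is what keeps rotation cheap relative to re-encrypting all data.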
While this process is in progress, we anticipate an impact on operations like snapshot, restore, snapshot deletion, and repository cleanup. We might be able to continue taking snapshots by using the newly updated master key, but restore/delete/cleanup operations can still be impacted.
If introduced, this enhancement can affect how SLM works, so further discussion and extra handling may be required.
I am opening this issue to discuss possibilities/options. Thank you.