
Remove Redundant CS Update on Snapshot Finalization #55276

Conversation

@original-brownbear (Member) commented Apr 16, 2020:

This change folds the removal of the in-progress snapshot entry into setting the safe repository generation. Besides removing an unnecessary cluster state update, this also removes a somewhat inconsistent intermediate cluster state in which the safe repository generation points at `RepositoryData` that contains a finished snapshot while that snapshot is still marked in-progress in the cluster state, making it easier to reason about the state machine of upcoming concurrent snapshot operations.

Note: We can do the same for snapshot deletion, where it is a much more interesting change, because most of the work in the delete (deleting all unreferenced blobs) can happen after the new index-N is written and does not require the delete-in-progress entry in the cluster state (assuming the repo is using shard generations). I just wanted to start with the snapshot finalization since it's much easier to review and has no BwC implications.
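Conceptually, the folded update boils down to a single cluster state update task like the following. This is a minimal sketch, not the PR's actual code; `updateRepositoryGeneration` and `removeInProgressEntry` are assumed helper names introduced here for illustration:

```java
// Hypothetical sketch of folding both changes into one cluster state update.
// updateRepositoryGeneration and removeInProgressEntry are assumed helpers,
// not methods from the actual PR.
clusterService.submitStateUpdateTask("set safe repository generation", new ClusterStateUpdateTask() {
    @Override
    public ClusterState execute(ClusterState currentState) {
        // Point the safe generation at the newly written RepositoryData ...
        final ClusterState withNewGen =
            updateRepositoryGeneration(currentState, repoName, newGeneration);
        // ... and drop the finished snapshot from SnapshotsInProgress in the
        // same update, so no published state shows a finalized snapshot as
        // still running.
        return removeInProgressEntry(withNewGen, snapshot);
    }

    @Override
    public void onFailure(String source, Exception e) {
        listener.onFailure(e);
    }
});
```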

@elasticmachine (Collaborator) commented:

Pinging @elastic/es-distributed (:Distributed/Snapshot/Restore)

```diff
@@ -189,8 +189,7 @@ public void clusterChanged(ClusterChangedEvent event) {
             final RepositoriesMetadata repoMeta =
                 event.state().metadata().custom(RepositoriesMetadata.TYPE);
             final RepositoryMetadata metadata = repoMeta.repository("test-repo");
-            if (metadata.generation() == metadata.pendingGeneration()
-                && metadata.generation() > snapshotEntry.repositoryStateId()) {
+            if (metadata.pendingGeneration() > snapshotEntry.repositoryStateId()) {
```
@original-brownbear (Member, Author) commented on this hunk:

The concrete situation that this test was covering is technically gone with this change, but I think it's reasonable to keep testing the very similar situation of master fail-over during the repository side of the finalization here now.

@original-brownbear (Member, Author):

Jenkins run elasticsearch-ci/2 (some ML thing)

@ywelsch (Contributor) left a comment:

I've left one comment about naming. I think that technically this change is ok, but I wonder about the separation of SnapshotsService and BlobStoreRepository (which is only one implementation of Repository), where it becomes less and less clear where the responsibilities for each class lie.

```diff
@@ -1301,10 +1304,11 @@ public boolean isReadOnly() {
      * @param repositoryData RepositoryData to write
      * @param expectedGen    expected repository generation at the start of the operation
      * @param writeShardGens whether to write {@link ShardGenerations} to the new {@link RepositoryData} blob
+     * @param stateFilter    filter for the last cluster state update executed by this method
```
@ywelsch (Contributor):

It's not a filter, rather a transformation function. Perhaps adapt naming?

@original-brownbear (Member, Author):

Renamed :)
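For reference, the renamed parameter might look something like the sketch below. The thread only confirms that a rename happened, so the new name `stateTransformer`, the `Function<ClusterState, ClusterState>` type, and the surrounding `writeIndexGen` signature are all assumptions for illustration:

```java
// Hypothetical shape after the rename; names and the surrounding signature
// are illustrative, not copied from the PR.
/**
 * @param stateTransformer transformation applied to the last cluster state
 *                         update executed by this method
 */
protected void writeIndexGen(RepositoryData repositoryData, long expectedGen, boolean writeShardGens,
                             Function<ClusterState, ClusterState> stateTransformer) {
    // ...
}
```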

@original-brownbear (Member, Author):

> but I wonder about the separation of SnapshotsService and BlobStoreRepository (which is only one implementation of Repository), where it becomes less and less clear where the responsibilities for each class lie.

+1 in theory. In practice, we're already exposing a lot of the specific internals of the blob store repository (e.g. shard generations are handled in snapshots-in-progress, repo generation, ...) so that ship seems to have sailed already. Also, practically speaking all our repos are eventually backed by a blob store repository for writing. Maybe it would make more sense to eventually adjust the repository interface to be more centered around just writing blobs and move all the state handling out of it completely. But for now I don't see a short way of enabling more concurrent operations without this kind of mixing of the code.

@original-brownbear (Member, Author):

Jenkins run elasticsearch-ci/1 (unrelated reindex failure)

@ywelsch (Contributor) left a comment:

LGTM

@original-brownbear (Member, Author):

Thanks Yannick!

@original-brownbear merged commit e9fbfea into elastic:master on Apr 21, 2020
@original-brownbear deleted the atomic-snapshot-finalization branch on April 21, 2020 at 12:03
original-brownbear added a commit to original-brownbear/elasticsearch that referenced this pull request Apr 21, 2020
original-brownbear added a commit that referenced this pull request Apr 21, 2020
original-brownbear added a commit to original-brownbear/elasticsearch that referenced this pull request Apr 21, 2020
Same as elastic#55276 but for snapshot deletes.
This change folds the removal of the snapshot delete state entry into the safe generation step where possible.
This means that for repositories that write shard generations, the time the snapshot delete entry stays in the cluster state is shortened a lot, reduced to the time it takes to update the repository metadata.
In this case it is fully safe to run other snapshot operations after the metadata update.

We cannot do this for repositories that do not write shard generations, so those need to go through a different path and still submit a separate state update task.

Also, this PR fixes a problem with the cooldown period for S3 non-shard-generation repos introduced by elastic#55286. We cannot run the state update outright in the repository because we enforce the cooldown via the listener wrapping. I fixed this by folding the final state update into the listener in this case.
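"Folding the final state update into the listener" could be sketched roughly as follows; `submitRemoveDeletionStateUpdate` is an assumed helper name and the wiring is illustrative only, not the commit's code:

```java
// Illustrative only: because the S3 cooldown is enforced by wrapping the
// listener, the final cluster state update has to run inside the wrapped
// listener rather than directly in the repository.
final ActionListener<RepositoryData> wrappedListener = ActionListener.wrap(
    repositoryData -> {
        // Runs only after the cooldown-enforcing wrapper has fired, so the
        // cooldown period is respected before the delete entry is removed.
        submitRemoveDeletionStateUpdate(repositoryData, listener); // assumed helper
    },
    listener::onFailure
);
```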
original-brownbear added a commit to original-brownbear/elasticsearch that referenced this pull request Apr 24, 2020
We do the delete in three steps here:

1. Get the repository data and find the snapshot ids to delete
2. Put the delete entry in the CS
3. Run the actual delete

The test `testConcurrentSnapshotCreateAndDeleteOther` was failing because, between steps `1.` and `2.`, a full snapshot completed and moved the repository generation ahead by one, which we chose to fail on because we expect the repository generation from step `1.` to still be there in step `3.`.
In the past, using the repository generation from step `1.` made sense as a safety measure, because any rapid increase in the repository generation could have spelled trouble on eventually consistent blob stores. Nowadays it is needless to fail here, though: we can simply rely on the generation we read from the repository in step `3.` to avoid ever passing a broken repository generation to the repository when deleting.

NOTE: This exception was always a possibility but became massively more likely due to the improved/faster snapshot finalization in elastic#55276, so it only showed up now.

Closes elastic#55702
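The fix amounts to something like the following sketch; `deleteSnapshotsFromRepository` is an assumed helper name introduced here for illustration:

```java
// Sketch of the behavior change (assumed names): step 3 no longer insists on
// the generation observed in step 1; it uses whatever generation it reads
// from the repository at delete time.
repository.getRepositoryData(ActionListener.wrap(
    repositoryData -> deleteSnapshotsFromRepository(
        snapshotIds,
        repositoryData.getGenId(), // current generation, even if other snapshots
                                   // completed since the delete entry was added
        listener),
    listener::onFailure
));
```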
@original-brownbear restored the atomic-snapshot-finalization branch on August 6, 2020 at 18:24