Small cleanup in BlobStoreRepository#finalizeSnapshot #99635

DaveCTurner
Contributor

Reorders the operations into their logical order and adds a few TODOs
for gaps that this cleanup exposes.

@DaveCTurner
Contributor Author

After going through this during onboarding I think there's room for improvement...

@elasticsearchmachine added the Team:Distributed (Obsolete) label on Sep 18, 2023
@elasticsearchmachine
Collaborator

Pinging @elastic/es-distributed (Team:Distributed)

.<MetadataWriteResult>andThen((l, existingRepositoryData) -> {
    final int existingSnapshotCount = existingRepositoryData.getSnapshotIds().size();
    if (existingSnapshotCount >= maxSnapshotCount) {
        throw new RepositoryException(
Contributor Author

NB slight change in exception semantics (see the change to BlobStoreSizeLimitIT): previously this exception would be thrown directly to the outer listener, but now it's wrapped in a SnapshotException first. I claim that the new behaviour is preferable: it's useful to report that this specific snapshot failed, as well as the cause.
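
To make the change in semantics concrete, here is a minimal sketch of the wrapping behaviour described above. It is an illustration under stated assumptions, not the code in this PR: the two exception classes are simplified stand-ins rather than the real Elasticsearch `RepositoryException` and `SnapshotException`, and `ensureBelowLimit`, `failureSeenByListener`, the repository name, and the snapshot name are all hypothetical.

```java
// Illustrative sketch only: these are simplified stand-ins, not the real
// Elasticsearch RepositoryException / SnapshotException classes.
class ExceptionWrappingSketch {

    static class RepositoryException extends RuntimeException {
        RepositoryException(String repository, String message) {
            super("[" + repository + "] " + message);
        }
    }

    static class SnapshotException extends RuntimeException {
        SnapshotException(String snapshot, String message, Throwable cause) {
            super("[" + snapshot + "] " + message, cause);
        }
    }

    /** Hypothetical size check, modelled on the snippet under review. */
    static void ensureBelowLimit(int existingSnapshotCount, int maxSnapshotCount) {
        if (existingSnapshotCount >= maxSnapshotCount) {
            throw new RepositoryException("my-repo", "snapshot limit of " + maxSnapshotCount + " reached");
        }
    }

    /** Returns the exception the outer listener would see, or null on success. */
    static Exception failureSeenByListener(int existingSnapshotCount, int maxSnapshotCount) {
        try {
            ensureBelowLimit(existingSnapshotCount, maxSnapshotCount);
            return null;
        } catch (Exception e) {
            // New behaviour: name the failed snapshot and keep the root cause attached.
            return new SnapshotException("my-repo:my-snapshot", "failed to finalize snapshot", e);
        }
    }

    public static void main(String[] args) {
        Exception e = failureSeenByListener(5, 5);
        System.out.println(e.getMessage());
        System.out.println("caused by: " + e.getCause().getMessage());
    }
}
```

A caller that previously matched on the repository-level exception directly would, in this sketch, need to look at `getCause()` instead, which is the kind of adjustment the test change mentioned above reflects.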

Member

@ywangd ywangd left a comment

LGTM

I think the new style of chaining is easier to read than the step based approach.
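
For readers unfamiliar with the chained style, the following is a rough sketch of the idea, assuming the JDK's `CompletableFuture` as a stand-in for the listener type that the PR chains with `andThen`. The step names and return values are hypothetical and only serve to show how each stage consumes the previous stage's result and how a failure short-circuits the rest of the chain.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;

// Illustrative sketch only: CompletableFuture stands in for the listener type used in
// the PR, and every step below is hypothetical.
class ChainingSketch {

    static CompletableFuture<String> finalizeSnapshot(int existingSnapshotCount, int maxSnapshotCount) {
        return CompletableFuture.completedFuture(existingSnapshotCount)
            // Step 1: reject the snapshot if the repository is already full.
            .thenApply(count -> {
                if (count >= maxSnapshotCount) {
                    throw new IllegalStateException("snapshot limit of " + maxSnapshotCount + " reached");
                }
                return count;
            })
            // Step 2: pretend to write the snapshot metadata.
            .thenApply(count -> "metadata-" + (count + 1))
            // Step 3: pretend to publish the new root index generation.
            .thenApply(metadata -> "index-gen[" + metadata + "]");
    }

    public static void main(String[] args) {
        System.out.println(finalizeSnapshot(2, 5).join());
        try {
            finalizeSnapshot(5, 5).join();
        } catch (CompletionException e) {
            // The failure from step 1 skips steps 2 and 3 and surfaces here.
            System.out.println("failed: " + e.getCause().getMessage());
        }
    }
}
```

The operations appear in the order they run, which is the readability benefit being described; a step-based version would declare each step separately and wire them together elsewhere.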

        allMetaListeners.acquire(),
        () -> GLOBAL_METADATA_FORMAT.write(clusterMetadata, blobContainer(), snapshotId.getUUID(), compress)
    )
    // NB failure of writeIndexGen doesn't guarantee the update failed, so we cannot safely clean anything up on failure
Member

For my learning: Could you please expand a bit on what this means? If the index-N file fails to update, there will be no root level reference to the new shard generation and data files? If so, why is it not safe to clean up?

Contributor Author

It's a good question. I opened #99645 to answer this in the Javadocs so that others can find this info in future too. Please take a look, and if you need more detail then let's discuss there.

@DaveCTurner DaveCTurner merged commit 92458fc into elastic:main Sep 19, 2023
@DaveCTurner DaveCTurner deleted the 2023/09/18/BlobStoreRepository-finalizeSnapshot-cleanup branch September 19, 2023 05:24
@DaveCTurner DaveCTurner restored the 2023/09/18/BlobStoreRepository-finalizeSnapshot-cleanup branch June 17, 2024 06:17
Labels
:Distributed Coordination/Snapshot/Restore: Anything directly related to the `_snapshot/*` APIs
>refactoring
Team:Distributed (Obsolete): Meta label for distributed team (obsolete). Replaced by Distributed Indexing/Coordination.
v8.11.0

3 participants