Optimize atomic writes in BlobContainer for Cloud providers #30680

Closed
ywelsch opened this issue May 17, 2018 · 2 comments · Fixed by #31100
Labels
:Distributed Coordination/Snapshot/Restore Anything directly related to the `_snapshot/*` APIs >enhancement

Comments

@ywelsch
Contributor

ywelsch commented May 17, 2018

Certain repository implementations (e.g. S3) provide all-or-nothing semantics when putting files. The snapshot/restore implementation currently has its own way of achieving atomic writes for certain files: it writes to a temporary filename first and then moves the file using an atomic rename (supported by some filesystems). The cloud providers don't offer a rename operation, so our Cloud provider implementations emulate the move by copying the file and then deleting the original. This seems wasteful, as it results in more writes without providing any additional guarantees. I therefore suggest introducing an "atomicWriteBlob" method directly on BlobContainer that each implementation can provide in its own way: FSBlobStore would use the atomic rename trick, while the Cloud providers would just delegate to "writeBlob".
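For readers unfamiliar with the rename trick, here is a minimal sketch of it using plain `java.nio.file`. It is not the Elasticsearch code: the class name `AtomicFsWriteSketch`, the method signature, and the `pending-` prefix for the temporary file are all assumptions made for illustration.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper for illustration only, not the Elasticsearch implementation.
final class AtomicFsWriteSketch {

    // Writes the stream to "<dir>/pending-<blobName>" first, then renames it to
    // "<dir>/<blobName>". The "pending-" prefix is an assumption of this sketch.
    static void writeBlobAtomic(Path dir, String blobName, InputStream in) throws IOException {
        Path tmp = dir.resolve("pending-" + blobName);
        Path target = dir.resolve(blobName);
        try {
            // Without options, Files.copy fails if the temporary file already exists.
            Files.copy(in, tmp);
            // Throws AtomicMoveNotSupportedException if the filesystem cannot rename
            // atomically. Whether an existing target is replaced is
            // implementation specific.
            Files.move(tmp, target, StandardCopyOption.ATOMIC_MOVE);
        } finally {
            // Nothing to delete after a successful move; removes the temporary
            // file if the copy or the move failed.
            Files.deleteIfExists(tmp);
        }
    }
}
```

Readers of the directory see either no blob or the complete blob under its final name, provided the filesystem supports an atomic rename. On an object store with all-or-nothing puts, the equivalent copy followed by delete gives the same visibility guarantee as a single put but costs an extra write and an extra delete.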

@ywelsch ywelsch added >enhancement :Distributed Coordination/Snapshot/Restore Anything directly related to the `_snapshot/*` APIs labels May 17, 2018
@elasticmachine
Collaborator

Pinging @elastic/es-distributed

@tlrx
Member

tlrx commented May 17, 2018

I agree, and the copy+delete operation also adds extra burden in tests.

tlrx added a commit to tlrx/elasticsearch that referenced this issue Jun 5, 2018
This commit adds a new writeBlobAtomic() method to the BlobContainer
interface that can be implemented by repository implementations which
support atomic write operations.

When the repository does not support atomic writes, this method just
delegates the write operation to the usual writeBlob() method.

Related to elastic#30680
tlrx added a commit that referenced this issue Jun 5, 2018
This commit adds a new writeBlobAtomic() method to the BlobContainer
interface that can be implemented by repository implementations which
support atomic write operations.

When the BlobContainer implementation does not provide a specific
implementation of writeBlobAtomic(), then the writeBlob() method is used.

Related to #30680
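
One way to get the fallback described in this commit message is a Java default interface method. The sketch below assumes that approach; `SketchBlobContainer` and `InMemoryBlobContainer` are hypothetical names, and the method signatures are simplified rather than copied from the Elasticsearch `BlobContainer` interface.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Simplified stand-in for BlobContainer; names and signatures are assumptions.
interface SketchBlobContainer {

    void writeBlob(String blobName, InputStream in) throws IOException;

    // Fallback: containers that do not override this method reuse writeBlob().
    default void writeBlobAtomic(String blobName, InputStream in) throws IOException {
        writeBlob(blobName, in);
    }
}

// Hypothetical in-memory container standing in for a store whose put is
// all-or-nothing, so it does not need its own writeBlobAtomic().
final class InMemoryBlobContainer implements SketchBlobContainer {

    final Map<String, byte[]> blobs = new ConcurrentHashMap<>();

    @Override
    public void writeBlob(String blobName, InputStream in) throws IOException {
        // The blob becomes visible only after the whole stream has been read.
        blobs.put(blobName, in.readAllBytes());
    }
}
```

With this shape, only a filesystem-backed container would need to override `writeBlobAtomic()`, while containers backed by an all-or-nothing put inherit the delegation and avoid the copy and delete steps.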
tlrx added a commit that referenced this issue Jun 5, 2018
tlrx added a commit to tlrx/elasticsearch that referenced this issue Jun 5, 2018
tlrx added a commit that referenced this issue Jun 7, 2018
tlrx added a commit that referenced this issue Jun 7, 2018
3 participants