Remove S3 output stream #27280
Conversation
LGTM from the snapshot/restore perspective. Love the simplification. Left one minor observation. Feel free to ignore it if you disagree.
```java
Streams.copy(inputStream, stream);
}

final UploadMethod method = (blobSize <= blobStore.bufferSizeInBytes()) ? this::executeSingleUpload : this::executeMultipartUpload;
```
I feel like a functional interface here doesn't really buy us anything compared to a simple if statement with two calls.
I agree and pushed 4421ee1
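For illustration, the if-based alternative might look roughly like this. This is a sketch only: the class, the `chooseUpload` helper, and its signature are assumptions for illustration, not the actual change pushed in the commit.

```java
// Sketch only: models the single-vs-multipart decision discussed above as a
// plain if statement instead of a functional interface. Names are hypothetical.
public class UploadChooser {

    // Returns which S3 API would be used for a blob of the given size,
    // given the configured buffer size threshold.
    static String chooseUpload(long blobSize, long bufferSizeInBytes) {
        if (blobSize <= bufferSizeInBytes) {
            return "single";     // small enough: one PutObject request
        } else {
            return "multipart";  // larger: AWS Multipart Upload API
        }
    }

    public static void main(String[] args) {
        long threshold = 100L * 1024 * 1024; // e.g. a 100MB buffer size
        System.out.println(chooseUpload(5L * 1024 * 1024, threshold));   // single
        System.out.println(chooseUpload(500L * 1024 * 1024, threshold)); // multipart
    }
}
```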
LGTM. Left one minor nit.
```diff
@@ -87,7 +87,7 @@
  * use of the Multipart API and may result in upload errors. Defaults to the minimum between 100MB and 5% of the heap size.
  */
 static final Setting<ByteSizeValue> BUFFER_SIZE_SETTING = Setting.byteSizeSetting("buffer_size", DEFAULT_BUFFER_SIZE,
-    new ByteSizeValue(5, ByteSizeUnit.MB), new ByteSizeValue(5, ByteSizeUnit.TB));
+    new ByteSizeValue(5, ByteSizeUnit.MB), new ByteSizeValue(5, ByteSizeUnit.GB));
```
Can you extract these as constants and then use them instead of redefining them in S3BlobContainer? Can you also document the limitations of S3 in the Javadoc here?
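For reference, the S3 limits alluded to here could be captured as named constants roughly as below. The values are AWS's documented S3 limits (5MB minimum part size except for the last part, 5GB maximum object size per single PUT or per part, at most 10,000 parts per multipart upload); the class and constant names are hypothetical, not the actual Elasticsearch code.

```java
// Sketch: S3's documented upload limits as named constants. Names are
// hypothetical; values come from AWS's S3 documentation.
public class S3Limits {
    // S3 rejects multipart parts smaller than 5MB (except the last part)
    static final long MIN_PART_SIZE_BYTES = 5L * 1024 * 1024;          // 5MB
    // S3 rejects single PUTs and multipart parts larger than 5GB
    static final long MAX_PART_SIZE_BYTES = 5L * 1024 * 1024 * 1024;   // 5GB
    // A multipart upload can have at most 10,000 parts
    static final int MAX_NUMBER_OF_PARTS = 10_000;

    public static void main(String[] args) {
        System.out.println(MIN_PART_SIZE_BYTES); // 5242880
        System.out.println(MAX_PART_SIZE_BYTES); // 5368709120
    }
}
```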
I pushed 4aae0ce
Looks good. Thanks :)
4aae0ce to cc3c690
Now the blob size information is available before writing anything, the repository implementation can know upfront what will be the more suitable API to upload the blob to S3. This commit removes the DefaultS3OutputStream and S3OutputStream classes and moves the implementation of the upload logic directly in the S3BlobContainer. related #26993 closes #26969
* es/master: (24 commits)
  - Reduce synchronization on field data cache
  - add json-processor support for non-map json types (#27335)
  - Properly format IndexGraveyard deletion date as date (#27362)
  - Upgrade AWS SDK Jackson Databind to 2.6.7.1
  - Stop responding to ping requests before master abdication (#27329)
  - [Test] Fix POI version in packaging tests
  - Allow affix settings to specify dependencies (#27161)
  - Tests: Improve size regex in documentation test (#26879)
  - reword comment
  - Remove unnecessary logger creation for doc values field data
  - [Geo] Decouple geojson parse logic from ShapeBuilders
  - [DOCS] Fixed link to docker content
  - Plugins: Add versionless alias to all security policy codebase properties (#26756)
  - [Test] #27342 Fix SearchRequests#testValidate
  - [DOCS] Move X-Pack-specific Docker content (#27333)
  - Fail queries with scroll that explicitely set request_cache (#27342)
  - [Test] Fix S3BlobStoreContainerTests.testNumberOfMultiparts()
  - Set minimum_master_nodes to all nodes for REST tests (#27344)
  - [Tests] Relax allowed delta in extended_stats aggregation (#27171)
  - Remove S3 output stream (#27280)
  - ...
This change has been backported to 5.6.5, 6.0.1, 6.1.0 and master.
The `S3OutputStream` class was added in elasticsearch 1.4 in order to take advantage of the AWS Multipart Upload API. At that time, the Snapshot/Restore API didn't communicate the size of the blobs to be written to the repository implementation: the logic was to open an `OutputStream` and start writing bytes to it. With this logic, we decided to buffer the bytes in memory until we knew which API to use, a single upload request or multipart upload requests. This is why the buffer and its associated logic were implemented as an `OutputStream`.

Now that the blob size information is available before anything is written, the repository implementation can know upfront which API is more suitable for uploading the blob to S3.

This pull request removes the `DefaultS3OutputStream` and `S3OutputStream` classes and moves the implementation of the upload logic directly into `S3BlobContainer`. I think it's easier to understand and easier to maintain. It also avoids the internal buffering by passing the `InputStream` directly to the S3 client, which takes care of buffering (up to 16MB) and retrying the requests. It removes pressure on memory (some allocations outside TLAB drop from 2.7GB to 57MB in my tests, while total snapshot time drops by 35%), especially on nodes with small heaps like 1GB-4GB.

Note: this has been tested together with #27278.
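As a rough illustration of the split that multipart upload logic like this has to perform, a blob is divided into a number of full-size parts plus a possibly smaller final part. The sketch below is an assumption-laden illustration (class and method names are hypothetical), not the actual `S3BlobContainer` code.

```java
// Sketch only: computes how a blob of totalSize bytes would be split into
// multipart upload parts of partSize bytes. Names are hypothetical.
public class Multiparts {

    // Returns {numberOfParts, sizeOfLastPart} for the given blob and part size.
    static long[] numberOfParts(long totalSize, long partSize) {
        long fullParts = totalSize / partSize;
        long remaining = totalSize % partSize;
        if (remaining == 0) {
            return new long[] { fullParts, partSize }; // last part is full-size
        }
        return new long[] { fullParts + 1, remaining }; // smaller trailing part
    }

    public static void main(String[] args) {
        long[] r = numberOfParts(25, 10);
        System.out.println(r[0] + " parts, last part " + r[1] + " bytes"); // 3 parts, last part 5 bytes
    }
}
```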