[Feature Proposal] Enhancement of Repository Plugin #6354
Comments
Tagging @elfisher @muralikpbhat @reta @mch2 @dreamer-89 @andrross @Bukhtawar @sachinpkale @itiyamas for feedback. Please do tag others who can review this.
This proposal looks like a good idea to me. I suggest putting together a quick proof-of-concept of the interface changes you have in mind, as I suspect it'll be easier for folks to provide feedback on a more concrete proposal. It would also be great to have at least rough implementations from multiple object stores to ensure the API has the right abstractions (though what you're proposing does look pretty generic).
This proposal looks great. My rule of thumb is: if we are making the product better, it's a go. From what I understand, we are improving the upload (or download) to the remote store using multipart [1]. It absolutely makes sense, as it improves performance and reliability dramatically. The trickier part is to convert the single InputStream into multiple streams.

Also, as an optimization, multipart upload/download only makes sense for large files (>100 MB recommended for S3), so we should gate this and make sure we don't overkill for small files.

Overall it makes sense. I second @andrross's idea: let's implement this for one blob store, and we can converge on the right set of abstractions.

[1] https://docs.aws.amazon.com/AmazonS3/latest/userguide/mpuoverview.html
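For illustration, the size gate suggested above could be as simple as the following sketch (class and method names are hypothetical; the threshold is the S3 recommendation from [1]):

```java
// Minimal sketch of the size gate described above; names are illustrative.
// The 100 MB threshold follows the S3 recommendation cited in [1].
final class TransferSizeGate {
    static final long MULTIPART_THRESHOLD_BYTES = 100L * 1024 * 1024;

    static boolean shouldUseMultipart(long blobSize) {
        // Small blobs take the simple single-stream path; large ones go multipart.
        return blobSize >= MULTIPART_THRESHOLD_BYTES;
    }
}
```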
@saratvemulapalli Your overall summary is correct: we use multiple streams emitted from a file for the transfer instead of a single stream. There are a few corrections in the detailed view, though.
We will add details in the design doc.
@vikasvb90 thank you. That makes sense.
I think the proposal makes sense, but I would love to see the API sketched out, as @andrross suggested. The repository plugin is pretty generic in nature; there are feature disparities between S3 / Azure / HDFS / GCS / OCS / whatever comes next, and these need to be baked into the API concisely.
This is a great proposal!
|
The limiting factor, as I understand it, is that the interface passes in a single InputStream for one file. Given that an InputStream only gives sequential access to a file, the only way to do concurrent multipart uploads without changing the interface would be to buffer the parts into memory from that InputStream. That could be done for a proof-of-concept, but I'm not sure how valuable that would be given that the memory requirements would likely be unacceptable in the general case. I think a proof-of-concept including the interface changes would be pretty straightforward.
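To make the memory concern concrete, here is a rough sketch of that buffering workaround under the current single-stream contract (uploadPart is a hypothetical stand-in for a vendor SDK call):

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;

// Naive workaround: read each part fully into memory so uploads can run
// concurrently. With N parts in flight of size P, this holds N * P bytes on
// the heap, which is the memory concern raised above.
final class BufferedMultipartUpload {
    static List<Future<?>> upload(InputStream in, long contentLength, int partSize,
                                  ExecutorService executor) throws IOException {
        List<Future<?>> pending = new ArrayList<>();
        int partNumber = 0;
        long remaining = contentLength;
        while (remaining > 0) {
            int toRead = (int) Math.min(partSize, remaining);
            byte[] part = in.readNBytes(toRead); // serial read forced by the single stream
            remaining -= part.length;
            final int number = partNumber++;
            pending.add(executor.submit(() -> uploadPart(number, part))); // hypothetical
        }
        return pending; // caller waits on futures and handles failures
    }

    // Hypothetical stand-in for a vendor-specific part upload.
    private static Void uploadPart(int partNumber, byte[] bytes) {
        return null;
    }
}
```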
|
Even with a single InputStream …
@dblock I am not sure if it is also the case for other remote stores, but for S3 it will surely not work. This means that we need one stream per part, each referring to a particular portion of the file and responsible for uploading only that portion.
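For context, one independent stream per part can be produced from a local file with plain JDK I/O; a rough sketch (the helper name is illustrative):

```java
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.channels.Channels;
import java.nio.channels.SeekableByteChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Sketch: an independent stream over [offset, offset + length) of a file.
// Each part gets its own channel, so parts can be consumed in parallel.
final class PartStreams {
    static InputStream partStream(Path file, long offset, long length) throws IOException {
        SeekableByteChannel channel = Files.newByteChannel(file, StandardOpenOption.READ);
        channel.position(offset);
        InputStream unbounded = Channels.newInputStream(channel);
        return new FilterInputStream(unbounded) {
            private long remaining = length; // cap reads so this stream covers only its part

            @Override
            public int read() throws IOException {
                if (remaining <= 0) return -1;
                int b = super.read();
                if (b != -1) remaining--;
                return b;
            }

            @Override
            public int read(byte[] buf, int off, int len) throws IOException {
                if (remaining <= 0) return -1;
                int n = super.read(buf, off, (int) Math.min(len, remaining));
                if (n > 0) remaining -= n;
                return n;
            }
        };
    }
}
```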
The purpose of this issue is to gather community feedback on the proposal to enhance the repository plugin.
The OpenSearch repository plugin today provides transfer capabilities to remote stores via streams. The plugin allows users to store index data in off-cluster external repositories such as Amazon S3, Google Cloud Storage, or a shared filesystem, in addition to the local disk of the OpenSearch cluster. By using the repository plugin, OpenSearch users can take advantage of the Snapshot feature to back up and restore data, protecting against data loss, enabling disaster recovery, and creating replicas of their data for testing and development purposes. With remote-backed storage, users now have the ability to protect against data loss by automatically creating continuous backups of all index transactions and sending them to remote storage. OpenSearch users can achieve request-level durability using remote-backed storage.
Problem Statement
The OpenSearch repository plugin today provides interfaces such as writeBlob to facilitate the transfer of a file using a single InputStream. This means that the file referenced by the InputStream must be processed serially: the underlying plugin transfers buffered content of the file one chunk at a time, and only after the first buffer is successfully processed is the next buffer of content read and transferred. Parallel processing of multiple parts of a file is therefore not possible, so use cases such as downloading or uploading multiple parts of a file in parallel cannot be supported.
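For reference, the existing single-stream contract looks roughly like this (simplified sketch, not the exact signature):

```java
import java.io.IOException;
import java.io.InputStream;

// Simplified sketch of today's contract: a single InputStream per blob forces
// the file's bytes to be consumed sequentially.
interface BlobContainerSketch {
    void writeBlob(String blobName, InputStream inputStream, long blobSize, boolean failIfAlreadyExists)
        throws IOException;
}
```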
The S3 repository plugin, for instance, provides support for multipart upload, but due to the single-InputStream restriction of the base plugin, each part is uploaded serially, even though S3 supports parallel upload of the individual parts of a file.
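As an illustration of what per-part streams would unlock, here is a hedged sketch using the AWS SDK for Java v1 multipart APIs (the openPartStream and partSizeFor helpers are hypothetical placeholders):

```java
import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.model.CompleteMultipartUploadRequest;
import com.amazonaws.services.s3.model.InitiateMultipartUploadRequest;
import com.amazonaws.services.s3.model.InitiateMultipartUploadResult;
import com.amazonaws.services.s3.model.PartETag;
import com.amazonaws.services.s3.model.UploadPartRequest;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;

// Sketch: once each part has its own stream, UploadPart calls can run
// concurrently instead of serially.
final class ParallelS3Upload {
    static void upload(AmazonS3 s3, String bucket, String key, int numberOfParts,
                       ExecutorService executor) throws ExecutionException, InterruptedException {
        InitiateMultipartUploadResult init =
            s3.initiateMultipartUpload(new InitiateMultipartUploadRequest(bucket, key));
        List<Future<PartETag>> futures = new ArrayList<>();
        for (int partNumber = 1; partNumber <= numberOfParts; partNumber++) {
            final int n = partNumber;
            futures.add(executor.submit(() -> {
                UploadPartRequest request = new UploadPartRequest()
                    .withBucketName(bucket)
                    .withKey(key)
                    .withUploadId(init.getUploadId())
                    .withPartNumber(n)
                    .withInputStream(openPartStream(n)) // hypothetical per-part stream
                    .withPartSize(partSizeFor(n));      // hypothetical size helper
                return s3.uploadPart(request).getPartETag();
            }));
        }
        List<PartETag> etags = new ArrayList<>();
        for (Future<PartETag> f : futures) {
            etags.add(f.get()); // propagate part failures before completing
        }
        s3.completeMultipartUpload(
            new CompleteMultipartUploadRequest(bucket, key, init.getUploadId(), etags));
    }

    private static java.io.InputStream openPartStream(int partNumber) {
        throw new UnsupportedOperationException("hypothetical helper");
    }

    private static long partSizeFor(int partNumber) {
        throw new UnsupportedOperationException("hypothetical helper");
    }
}
```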
Proposed Solution
We propose to enhance the existing repository plugin to support multiple stream suppliers, so that underlying vendor plugins can optionally provide multi-stream implementations for remote transfers. Provisioning multiple streams, rather than abstracting the transfer at the file level, gives core OpenSearch code control to pre-process buffered content with multiple stream wrappers after the content is read and before the transfer takes place. Using stream suppliers instead of concrete streams further allows stream creation to be deferred until the remote transfer is started. The following are some of the abstractions we propose to provide the required support:
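As a rough illustration, stream-supplier abstractions along these lines might look like the following (all names are hypothetical, not the final API):

```java
import java.io.IOException;
import java.io.InputStream;

// Hypothetical sketch only. A supplier defers opening a part's stream until
// the remote transfer actually starts.
@FunctionalInterface
interface StreamSupplier {
    /** Opens a stream over [position, position + length) of the underlying file. */
    InputStream supply(long position, long length) throws IOException;
}

// Describes a multi-part transfer: part boundaries plus lazy stream creation.
// Core code can wrap each part stream (e.g. for checksums or throttling)
// after content is read and before the transfer takes place.
final class StreamContext {
    private final StreamSupplier streamSupplier;
    private final long partSize;
    private final long contentLength;

    StreamContext(StreamSupplier streamSupplier, long partSize, long contentLength) {
        this.streamSupplier = streamSupplier;
        this.partSize = partSize;
        this.contentLength = contentLength;
    }

    int numberOfParts() {
        return (int) ((contentLength + partSize - 1) / partSize); // ceiling division
    }

    /** Opens part {@code partNumber}'s stream only when the transfer begins. */
    InputStream openPartStream(int partNumber) throws IOException {
        long position = (long) partNumber * partSize;
        long length = Math.min(partSize, contentLength - position);
        return streamSupplier.supply(position, length);
    }
}
```

A vendor plugin that supports parallel part transfer could then fan out one worker per part via openPartStream, while a vendor without that capability could still consume the parts sequentially.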
Credits - @vikasvb90, @ashking94