Add Ability to Write a BytesReference to BlobContainer #66501

Merged


original-brownbear
Member

Except when writing actual segment files to the blob store,
we always write a `BytesReference` rather than a stream.
Having only the stream API available forces needless copies
on us. This PR fixes the straightforward needless copying for
the HDFS and FS repos. We could do similar fixes for
GCS and Azure as well and thus significantly reduce the peak
memory use of these writes, on master nodes in particular.
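
To make the copying argument concrete, here is a minimal sketch of the idea, not the code from this PR. All names in it (`SketchBlobContainer`, `SketchFsBlobContainer`, `BlobWriteSketch`) are hypothetical, and a plain `ByteBuffer` stands in for `BytesReference`. It assumes a container that offers a stream-based write plus a bytes-based overload, which falls back to the stream path by default and which a file-system implementation overrides to write the in-memory bytes directly.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Arrays;

// Hypothetical, simplified stand-in for a blob container (not the Elasticsearch interface).
interface SketchBlobContainer {
    // Stream-based write: bytes are pulled through an intermediate transfer buffer.
    void writeBlob(String name, InputStream in, long length) throws IOException;

    // Bytes-based write. The default falls back to the stream API, so stores
    // without a specialised implementation keep working, at the cost of a copy.
    default void writeBlob(String name, ByteBuffer bytes) throws IOException {
        byte[] copy = new byte[bytes.remaining()];
        bytes.duplicate().get(copy); // the needless copy a direct write avoids
        writeBlob(name, new ByteArrayInputStream(copy), copy.length);
    }
}

// Hypothetical file-system implementation that writes in-memory bytes directly.
final class SketchFsBlobContainer implements SketchBlobContainer {
    private final Path dir;

    SketchFsBlobContainer(Path dir) {
        this.dir = dir;
    }

    @Override
    public void writeBlob(String name, InputStream in, long length) throws IOException {
        try (OutputStream out = Files.newOutputStream(
                dir.resolve(name), StandardOpenOption.CREATE_NEW, StandardOpenOption.WRITE)) {
            byte[] buffer = new byte[8192]; // intermediate buffer required by the stream path
            int read;
            while ((read = in.read(buffer)) != -1) {
                out.write(buffer, 0, read);
            }
        }
    }

    @Override
    public void writeBlob(String name, ByteBuffer bytes) throws IOException {
        try (FileChannel channel = FileChannel.open(
                dir.resolve(name), StandardOpenOption.CREATE_NEW, StandardOpenOption.WRITE)) {
            ByteBuffer view = bytes.duplicate(); // shares the content, leaves the caller's position alone
            while (view.hasRemaining()) {
                channel.write(view); // hands the existing bytes to the channel without an array copy
            }
        }
    }
}

final class BlobWriteSketch {
    // Writes data through the bytes-based overload and reads it back.
    static byte[] roundTrip(byte[] data) {
        try {
            Path dir = Files.createTempDirectory("blob-sketch");
            SketchBlobContainer container = new SketchFsBlobContainer(dir);
            container.writeBlob("meta", ByteBuffer.wrap(data));
            byte[] back = Files.readAllBytes(dir.resolve("meta"));
            Files.delete(dir.resolve("meta"));
            Files.delete(dir);
            return back;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] data = "snapshot metadata".getBytes(StandardCharsets.UTF_8);
        System.out.println(Arrays.equals(data, roundTrip(data)));
    }
}
```

Keeping a stream-based default on the interface means a store that has not been optimised yet (GCS and Azure in the description above) would still behave as before, while each store can opt in to the cheaper path separately.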

@fcofdez fcofdez self-requested a review December 17, 2020 10:36
@original-brownbear
Member Author

Jenkins run elasticsearch-ci/1 (unrelated + known)

@original-brownbear
Member Author

@fcofdez hold off on the review for a sec (or ignore HDFS for now), I made a small mistake there with the write flags and it fails tests now. I'll fix it once I'm back from my lunch break :)

@original-brownbear
Member Author

@fcofdez fixed my random oversight :) should be good to review now.

Contributor

@fcofdez fcofdez left a comment

Thanks for the change! LGTM

@original-brownbear
Member Author

Thanks Francisco!

@original-brownbear original-brownbear merged commit 3819fcb into elastic:master Dec 17, 2020
@original-brownbear original-brownbear deleted the write-blob-bytesreference branch December 17, 2020 16:42
original-brownbear added a commit to original-brownbear/elasticsearch that referenced this pull request Dec 17, 2020
original-brownbear added a commit that referenced this pull request Dec 18, 2020
@original-brownbear original-brownbear restored the write-blob-bytesreference branch January 4, 2021 01:10