Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BufWriter should buffer up to its capacity #72919

Closed
Lucretiel opened this issue Jun 2, 2020 · 4 comments
Closed

BufWriter should buffer up to its capacity #72919

Lucretiel opened this issue Jun 2, 2020 · 4 comments
Labels
C-enhancement Category: An issue proposing an enhancement or a PR with one. T-libs Relevant to the library team, which will review and decide on the PR/issue.

Comments

@Lucretiel
Copy link
Contributor

if buf.len() >= self.buf.capacity() {

Suppose you had a BufWriter with capacity 4. If you write b"1234" to it, that will be flushed directly to the underlying stream, but if you write b"12" and then b"34", those will be buffered. The unit tests confirm this behavior, but it seems surprising to me.

@jonas-schievink jonas-schievink added C-enhancement Category: An issue proposing an enhancement or a PR with one. T-libs Relevant to the library team, which will review and decide on the PR/issue. labels Jun 2, 2020
@Amanieu
Copy link
Member

Amanieu commented Jun 27, 2020

I don't think we make any guarantees on the exact buffering behavior other than "some buffering happens". In this particular case, copying the data to the buffer is pointless since we'll need to flush the buffer on the next write call anyways: it doesn't help to reduce the number of underlying write calls.

@m-ou-se
Copy link
Member

m-ou-se commented Sep 24, 2020

I think the current implementation makes sense. I'd say the goal of a BufWriter to reduce the number of write calls on the underlying stream. Buffering a write call of the full buffer capacity only serves the purpose of using the buffer as much as possible, as the number of write calls (and their sizes) remain the same. That doesn't seem useful to me.

@m-ou-se
Copy link
Member

m-ou-se commented Sep 24, 2020

Maybe instead, writing "12" and then "34" into a BufWriter with a capacity of 4 should flush right after the "34", instead of waiting until the next call. Feels slightly more consistent.

@Lucretiel
Copy link
Contributor Author

I'm actually working on a PR that does something similar to that. The issue you run into, at least for write, is that you have to report either success or an error, and the latter case implies that 0 bytes were written. So if you buffer "34", you have to report Ok(2), which means you don't want to try any additional fallible io operations. However, the story is a bit different for write_all, and I think there's still an opportunity for optimization where, you split incoming slices in order to make them fit in the tail of the local buffer, to reduce the overall number of writes.

In any case, I've been convinced about the reasoning for this, so this issue is satisfied.

bors added a commit to rust-lang-ci/rust that referenced this issue Dec 9, 2020
…ertj

Use is_write_vectored to optimize the write_vectored implementation for BufWriter

In case when the underlying writer does not have an efficient implementation `write_vectored`, the present implementation of
`write_vectored` for `BufWriter` may still forward vectored writes directly to the writer depending on the total length of the data. This misses the advantage of buffering, as the actually written slice may be small.

Provide an alternative code path for the non-vectored case, where the slices passed to `BufWriter` are coalesced in the buffer before being flushed to the underlying writer with plain `write` calls. The buffer is only bypassed if an individual slice's length is at least as large as the buffer.

Remove a FIXME comment referring to rust-lang#72919 as the issue has been closed with an explanation provided.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-enhancement Category: An issue proposing an enhancement or a PR with one. T-libs Relevant to the library team, which will review and decide on the PR/issue.
Projects
None yet
Development

No branches or pull requests

4 participants