Skip to content

Commit

Permalink
Rollup merge of #130670 - the8472:read-to-end-heuristics, r=ChrisDenton
Browse files Browse the repository at this point in the history
delay uncapping the max_read_size in File::read_to_end

In #130600 (comment) I realized that we're likely still passing too-large buffers to the OS, at least once at the end.

Previous issues and PRs:
* #110650
* #110655
* #118222

r? ChrisDenton
  • Loading branch information
GuillaumeGomez authored Sep 22, 2024
2 parents cbf2396 + ca1a2a6 commit 82b4177
Showing 1 changed file with 13 additions and 3 deletions.
16 changes: 13 additions & 3 deletions library/std/src/io/mod.rs
Original file line number Diff line number Diff line change
Expand Up @@ -398,8 +398,7 @@ where
// - avoid passing large buffers to readers that always initialize the free capacity if they perform short reads (#23815, #23820)
// - pass large buffers to readers that do not initialize the spare capacity. this can amortize per-call overheads
// - and finally pass not-too-small and not-too-large buffers to Windows read APIs because they manage to suffer from both problems
// at the same time, i.e. small reads suffer from syscall overhead, all reads incur initialization cost
// proportional to buffer size (#110650)
// at the same time, i.e. small reads suffer from syscall overhead, all reads incur costs proportional to buffer size (#110650)
//
pub(crate) fn default_read_to_end<R: Read + ?Sized>(
r: &mut R,
Expand Down Expand Up @@ -444,6 +443,8 @@ pub(crate) fn default_read_to_end<R: Read + ?Sized>(
}
}

let mut consecutive_short_reads = 0;

loop {
if buf.len() == buf.capacity() && buf.capacity() == start_cap {
// The buffer might be an exact fit. Let's read into a probe buffer
Expand Down Expand Up @@ -489,6 +490,12 @@ pub(crate) fn default_read_to_end<R: Read + ?Sized>(
return Ok(buf.len() - start_len);
}

if bytes_read < buf_len {
consecutive_short_reads += 1;
} else {
consecutive_short_reads = 0;
}

// store how much was initialized but not filled
initialized = unfilled_but_initialized;

Expand All @@ -503,7 +510,10 @@ pub(crate) fn default_read_to_end<R: Read + ?Sized>(
// The reader is returning short reads but it doesn't call ensure_init().
// In that case we no longer need to restrict read sizes to avoid
// initialization costs.
if !was_fully_initialized {
// When reading from disk we usually don't get any short reads except at EOF.
// So we wait for at least 2 short reads before uncapping the read buffer;
// this helps with the Windows issue.
if !was_fully_initialized && consecutive_short_reads > 1 {
max_read_size = usize::MAX;
}

Expand Down

0 comments on commit 82b4177

Please sign in to comment.