Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update documentation for std::fs::read_to_string to reflect performan… #130610

Closed
wants to merge 1 commit into from

Conversation

fiveseven-lambda
Copy link

…ce improvements on Windows

Closes #130600.

This pull request updates the documentation for std::fs::read_to_string to clarify its performance advantages on Windows compared to the combination of std::fs::File::open and std::io::Read::read_to_string.

In PR #110655, optimizations were introduced to improve the performance of std::fs::read_to_string on Windows. However, the current documentation does not reflect these improvements, potentially misleading users into thinking that using File::open and Read::read_to_string would have the same performance characteristics.

The updated documentation now explicitly mentions that std::fs::read_to_string offers better performance on Windows than manually opening the file and using std::io::Read::read_to_string.

r? workingjubilee

@rustbot
Copy link
Collaborator

rustbot commented Sep 20, 2024

Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @workingjubilee (or someone else) some time within the next two weeks.

Please see the contribution instructions for more information. Namely, in order to ensure the minimum review times lag, PR authors and assigned reviewers should ensure that the review label (S-waiting-on-review and S-waiting-on-author) stays updated, invoking these commands when appropriate:

  • @rustbot author: the review is finished, PR author should check the comments and take action accordingly
  • @rustbot review: the author is ready for a review, this PR will be queued again in the reviewer's queue

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-libs Relevant to the library team, which will review and decide on the PR/issue. labels Sep 20, 2024
@the8472
Copy link
Member

the8472 commented Sep 20, 2024

potentially misleading users into thinking that using File::open and Read::read_to_string would have the same performance characteristics.

Considering the buffers use exponential growing the allocation behavior should hardly make a difference. I don't think it's worth pointing this out in the documentation (guaranteeing any implementation details) unless the advantage is tremendous.

Do you have any numbers that show that it's worth steering users in this direction and tying our hands regarding future implementation changes?

See also #130600 (comment)

@workingjubilee
Copy link
Member

Considering the buffers use exponential growing the allocation behavior should hardly make a difference.

? Er, do they? If you use with_capacity you get an allocation of that size if you allocate a sufficiently large allocation, because there's no benefit to fitting a large allocation of multiple megabytes to the next exponential size category.

@the8472
Copy link
Member

the8472 commented Sep 20, 2024

It's O(1) vs. O(log(n)) allocations, which sometimes makes a difference, but when you're doing IO then I'd expect the IO overhead to dwarf the allocation overhead.

The changes were motivated not by allocation behavior but by repeated initialization costs in the windows kernel (#110650).

I could be wrong. Perhaps the default allocator on windows is slow enough that it's noticeable. That's why I asked for numbers

@fiveseven-lambda
Copy link
Author

I see... as you say, it doesn't seem like the performance difference is as significant as I initially thought, especially since IO tends to be the bottleneck in such cases. I raised this issue just because I was surprised by the source code and the issue #110650 when I searched about read_to_string to use in my own project. So I don’t have specific numbers...

@fiveseven-lambda
Copy link
Author

@ducktherapy (who raised issue #110650) had posted a related question on reddit and StackOverflow.
https://stackoverflow.com/questions/76055481/why-does-fileread-to-end-get-slower-the-larger-the-buffer-capacity
They mentioned observing "a huge (50x+ on this machine) performance difference" with the cost increasing linearly.

@the8472
Copy link
Member

the8472 commented Sep 21, 2024

That was before the fix.

Comment on lines 273 to +275
/// This is a convenience function for using [`File::open`] and [`read_to_string`]
/// with fewer imports and without an intermediate variable.
/// with fewer imports, without an intermediate variable, and with improved performance
/// by referencing the file's metadata to preallocate buffer size.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm still game to accept this but it's a "should have better performance" or "may be faster and more efficient" rather than "is definitely going to have better performance for (specific implementation detail)" because of OS-specific differences.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To clarify, which part are you referring to by saying "may be faster and more efficient"?

Cf. #130600 (comment)

The only optimization that fs::read_to_string has is that it can use the size from the file metadata directly and doesn't need to retrieve the current seek position of the file handle.

@workingjubilee
Copy link
Member

@rustbot author

@rustbot rustbot added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Sep 26, 2024
@alex-semenyuk
Copy link
Member

@fiveseven-lambda
From wg-triage. Is it ready for review?

@fiveseven-lambda
Copy link
Author

I intended to close the PR but forgot to do so. I’ll go ahead and close it now. I am sorry for any inconvenience caused.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. T-libs Relevant to the library team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Clarification on performance improvements in std::fs::read_to_string for Windows
5 participants