Avoid expensive GetFileInformationByHandleEx syscall if possible #49541
One more idea for optimization that could be useful for small files: when the user opens the file for reading and is about to perform the first |
Backing up for a moment, what happens if we don't use the Length at all? e.g. this code: runtime/src/libraries/System.Private.CoreLib/src/System/IO/LegacyFileStreamStrategy.Windows.cs Lines 837 to 852 in bcd07c5
tries to limit a read to fit within the file's size... what happens if we don't do that? |
In case the buffer is bigger than what is left in the file, we slice the buffer: runtime/src/libraries/System.Private.CoreLib/src/System/IO/LegacyFileStreamStrategy.Windows.cs Lines 844 to 847 in bcd07c5
and then we can safely move the file offset by runtime/src/libraries/System.Private.CoreLib/src/System/IO/LegacyFileStreamStrategy.Windows.cs Line 868 in bcd07c5
If we don't perform this check, we might set the offset to a value that is greater than the file length:
FileStream fs = new FileStream($path, $readAndWrite);
fs.SetLength(100);
byte[] bytes = new byte[200];
await fs.ReadAsync(bytes, 0, 200);
// if the above sets Position to 100, we are safe.
// What happens if it sets it to 200? Is the next write going to create a sparse file?
await fs.WriteAsync(bytes, 0, 200);
We could of course update the position once we receive the number of retrieved bytes: runtime/src/libraries/System.Private.CoreLib/src/System/IO/FileStreamCompletionSource.Win32.cs Line 127 in bcd07c5
But that would work well only if users await each operation before issuing the next one. |
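A minimal sketch of "advance the offset by the number of bytes actually read", assuming a hypothetical helper named ReadAndAdvanceAsync that works on any Stream. It is not the FileStream implementation, only an illustration of why the update is straightforward when the caller awaits each operation:

```csharp
using System;
using System.IO;
using System.Threading.Tasks;

// Hypothetical helper for illustration only.
public static class PositionAfterRead
{
    // Reads at an explicit offset and returns the offset to use for the next operation.
    // The offset advances by the number of bytes actually read, which may be smaller
    // than buffer.Length (for example when the read reaches the end of the file).
    public static async Task<long> ReadAndAdvanceAsync(Stream stream, long offset, byte[] buffer)
    {
        stream.Position = offset;
        int bytesRead = await stream.ReadAsync(buffer, 0, buffer.Length);
        return offset + bytesRead;
    }
}
```

With a 100-byte stream and a 200-byte buffer, the returned offset is 100 rather than 200, so a following write starts at the end of the existing data. This only holds when operations are awaited one at a time; concurrent callers would not know the starting offset of the next operation until the previous one completes.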
Right. It seems then this is an attempt to support concurrent operations on the FileStream? And if so, in the face of such concurrent operations does the right thing actually happen? What happens if there's an asynchronous error during the read, for example? |
Yes! And I am not sure if OS will always return cc @JeremyKuhne |
Before we throw the exception we try to reset the file offset to the current offset: runtime/src/libraries/System.Private.CoreLib/src/System/IO/LegacyFileStreamStrategy.Windows.cs Lines 899 to 902 in bcd07c5
Which I suspect would be the most recent value that we have passed to
FileStream fs = new FileStream($path, $readAndWrite);
fs.SetLength(3);
byte[] bytes1 = new byte[1], bytes2 = new byte[1], bytes3 = new byte[1];
Task[] reads = new Task[]
{
    fs.ReadAsync(bytes1, 0, 1),
    fs.ReadAsync(bytes2, 0, 1),
    fs.ReadAsync(bytes3, 0, 1)
};
// if reads[1] fails the fs.Position is set to ? |
That code handles the case where the operation fails synchronously. |
Good point. So the actual error handling happens here: runtime/src/libraries/System.Private.CoreLib/src/System/IO/FileStreamCompletionSource.Win32.cs Lines 188 to 190 in bcd07c5
and does not modify the offset? |
That appears to be the case. It's also not clear what should happen in that case. The whole notion of trying to allow for concurrent operations while also tracking position is fairly flawed, in particular if the operation might fail or return less than was requested. |
I totally agree. But what options do we have?
Personally, I would go with (3) as it would simplify the implementation a lot and it would be always correct (update offset after read|write async has finished with the actual number of bytes). Please keep in mind that I have very little experience with introducing breaking changes to .NET. @stephentoub @jozkee @carlossanlop what do you think? |
What specifically breaks if we just stop clipping read sizes to file length? A developer issuing concurrent reads not only has the workaround of falling back to legacy, they also can just do the clipping themselves if it matters, yes? |
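For reference, caller-side clipping could look roughly like the sketch below. ReadClipping.ClipCount is a hypothetical helper name, and the caller is assumed to know the file length and the current position already:

```csharp
using System;

// Hypothetical helper for illustration only.
public static class ReadClipping
{
    // Returns how many bytes a read of `requested` bytes at `position` can
    // return at most, given a known file length. Never returns a negative value.
    public static int ClipCount(long fileLength, long position, int requested)
    {
        long remaining = Math.Max(0, fileLength - position);
        return (int)Math.Min(requested, remaining);
    }
}
```

A caller issuing concurrent reads could pass the clipped count to ReadAsync and advance its own offset by that amount, instead of relying on the stream to do it.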
If we don't perform this check, we might set the offset to a value that is greater than the file length:
FileStream fs = new FileStream($path, $readAndWrite);
fs.SetLength(100);
byte[] bytes = new byte[200];
await fs.ReadAsync(bytes, 0, 200);
// if the above sets Position to 100, we are safe.
// What happens if it sets it to 200? Is the next write going to create a sparse file?
await fs.WriteAsync(bytes, 0, 200);
@jozkee have you tried to see what happens?
Yes, but I expect that most of the users don't perform any clipping as |
Yes. But it wouldn't negatively affect them. It would only affect them if they then turned around and expected to be able to concurrently issue a write at the position reported before the read completed. I really wonder if we're making this too hard on ourselves. |
To solve #16354 we are going to track the file offset in memory and avoid expensive Seek calls (draft 91423fa):
runtime/src/libraries/System.Private.CoreLib/src/System/IO/LegacyFileStreamStrategy.Windows.cs Line 868 in bcd07c5
runtime/src/libraries/System.Private.CoreLib/src/System/IO/LegacyFileStreamStrategy.Windows.cs Line 1073 in bcd07c5
This will allow for removing another expensive SetLength call and solve #25905 (draft a7ca4cb):
runtime/src/libraries/System.Private.CoreLib/src/System/IO/LegacyFileStreamStrategy.Windows.cs Line 1062 in bcd07c5
To reach optimal syscall usage for async File IO and get 100% async IO on Windows we just need to get rid of calls to GetFileInformationByHandleEx (called by FileStream.Length, which looks very innocent in the source code ;) ):
runtime/src/libraries/System.Private.CoreLib/src/System/IO/LegacyFileStreamStrategy.Windows.cs Line 837 in bcd07c5
From my initial observations, it seems that we should be able to cache the Length. The reasoning is that by default files are opened with the FileShare.Read property, which means "allow other processes to just read from the file":
runtime/src/libraries/System.Private.CoreLib/src/System/IO/FileStream.cs Line 15 in bcd07c5
This should allow us to cache the Length (as long as the user does not specify FileShare.Write in an explicit way, which is very rare) and invalidate it on our own when:
- Position = x or Seek(x) that extends or shrinks the file
- SetLength(x) that extends or shrinks the file
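A minimal sketch of the caching idea, assuming a hypothetical CachedFileLength helper that is not part of the runtime. The queryLength delegate stands in for the GetFileInformationByHandleEx-backed lookup, and sharedForWrite models the case where the file was opened with FileShare.Write, so caching is unsafe:

```csharp
using System;

// Hypothetical helper for illustration only; not the runtime's implementation.
public sealed class CachedFileLength
{
    private readonly Func<long> _queryLength; // stands in for the expensive syscall
    private readonly bool _canCache;          // false when other writers may change the file
    private long _cached = -1;                // -1 means "nothing cached"

    // Number of times the expensive lookup was actually performed.
    public int QueryCount { get; private set; }

    public CachedFileLength(Func<long> queryLength, bool sharedForWrite)
    {
        _queryLength = queryLength;
        _canCache = !sharedForWrite;
    }

    public long Get()
    {
        if (_canCache && _cached >= 0)
            return _cached;

        QueryCount++;
        long length = _queryLength();
        if (_canCache)
            _cached = length;
        return length;
    }

    // Call after any operation on this stream that may change the file length.
    public void Invalidate() => _cached = -1;
}
```

In this sketch, repeated Get() calls between invalidations cost one lookup in total, while a stream opened with write sharing falls back to querying every time.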