System.IO.Stream Backlog #58216
Comments
Tagging subscribers to this area: @dotnet/area-system-io
Issue Details
System.IO.Stream has been at the center of data processing and flow in .NET since its inception. Even with newer models being introduced related to streams, like pipelines and channels, it continues to be a key component of many important workloads. We've incrementally improved Stream over the years, e.g. with the introduction of new memory/span-based APIs in .NET Core 2.1, but we haven't revisited it to take a holistic look at what's missing for modern workloads and round out the story. We should do so for .NET 7. We should start this effort by investigating:
We already know of some gaps / common requests to be considered:
|
Something different from https://docs.microsoft.com/en-us/dotnet/api/system.io.unmanagedmemorystream? |
Yes I realized that after I posted. Don't remember why I had to do it manually :p |
@stephentoub could these changes also help simplify my stream class, located in this repository? Currently there are a ton of properties which I am forced to override (sometimes I have to keep them the same as the base). I also got tired of manually writing the Dispose functions, so I made it possible for people to source generate their Dispose functions (related to implementing the Dispose interface). However, I think I should eventually let them optionally generate a finalizer as well (with the |
This is the "Making it easier to create custom streams" bullet. |
I agree with everything described above. What I needed many times:
|
Is the plan to also include a non/less allocating stream like the following proposal: Maybe that is covered in the "Better Pipelines integration" bullet? |
There's no "plan" currently, other than to investigate what should be done to formulate the plan of what we want to implement in 7. |
Ok, I understand. I think it would be good to also include the issue mentioned above in the investigation. Thanks! |
For the read helpers, I'm wondering if we can add something like ReadAsync(Memory<Byte> buffer, CancellationToken token, int minBytes); where
With that, if we get EOF before minBytes, we may throw IOException. |
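The minBytes idea above could be sketched as an extension method. This is only an illustration under stated assumptions: the name `ReadMinimumAsync`, the class `StreamReadSketch`, and the parameter order are hypothetical and are not the API proposed in this thread. It relies only on the existing `Stream.ReadAsync(Memory<byte>, CancellationToken)` overload.

```csharp
using System;
using System.IO;
using System.Threading;
using System.Threading.Tasks;

public static class StreamReadSketch
{
    // Hypothetical helper: keeps reading until at least minBytes have been
    // placed into buffer, and throws IOException if EOF arrives first.
    public static async ValueTask<int> ReadMinimumAsync(
        this Stream stream,
        Memory<byte> buffer,
        int minBytes,
        CancellationToken token = default)
    {
        if (stream is null) throw new ArgumentNullException(nameof(stream));
        if (minBytes < 0 || minBytes > buffer.Length)
            throw new ArgumentOutOfRangeException(nameof(minBytes));

        int total = 0;
        while (total < minBytes)
        {
            // A single read may legally return fewer bytes than requested.
            int read = await stream.ReadAsync(buffer.Slice(total), token).ConfigureAwait(false);
            if (read == 0)
                throw new IOException($"End of stream after {total} bytes; expected at least {minBytes}.");
            total += read;
        }

        // May exceed minBytes when the stream had more data ready.
        return total;
    }
}
```

A caller that needs a fixed-size header would pass `minBytes` equal to `buffer.Length`; a caller that only needs "enough to parse a prefix" can pass a smaller value and accept whatever extra bytes arrive.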
Stream combinators: There could be a
var rangeStream = new RangeStream(innerStream, skip: 2, take: 10);
It would support both seekable and non-seekable inner streams. Write support could be there for seekable ones (allowing to overwrite a range). A concat stream could concatenate streams. But it could also support a more general form of "segment". A segment could be a stream, but also a
public abstract class ConcatStreamSegment : IDisposable
{
    public abstract void Dispose();
    public abstract long? Length { get; }
    public abstract int Read(Span<byte> buffer);
    //Plus additional members for async.
}
A segment would be read-once forward-only. Having memory-based segments would also provide a way to create a stream from a memory object. And you could mix, for example to prefix a normal stream with a memory-based header. |
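To make the range idea above concrete, here is a minimal read-only sketch. It is not the `RangeStream` proposed in the comment: the class name `RangeStreamSketch` is hypothetical, it covers only the forward-only (non-seekable) read case, it omits write support and async overrides, and it assumes the caller keeps ownership of the inner stream.

```csharp
using System;
using System.IO;

// Hypothetical sketch: exposes bytes [skip, skip + take) of an inner stream.
public sealed class RangeStreamSketch : Stream
{
    private readonly Stream _inner;
    private long _toSkip;     // bytes still to discard before the range starts
    private long _remaining;  // bytes of the range not yet returned

    public RangeStreamSketch(Stream inner, long skip, long take)
    {
        _inner = inner ?? throw new ArgumentNullException(nameof(inner));
        if (skip < 0) throw new ArgumentOutOfRangeException(nameof(skip));
        if (take < 0) throw new ArgumentOutOfRangeException(nameof(take));
        _toSkip = skip;
        _remaining = take;
    }

    public override bool CanRead => true;
    public override bool CanSeek => false;
    public override bool CanWrite => false;
    public override long Length => throw new NotSupportedException();
    public override long Position
    {
        get => throw new NotSupportedException();
        set => throw new NotSupportedException();
    }

    public override int Read(byte[] buffer, int offset, int count)
    {
        SkipPrefix();
        if (_remaining == 0 || count == 0) return 0;

        // Never ask the inner stream for more than the range still allows.
        int toRead = (int)Math.Min(count, _remaining);
        int read = _inner.Read(buffer, offset, toRead);
        _remaining -= read;
        return read;
    }

    // Discards the prefix by reading, so it also works on non-seekable streams.
    private void SkipPrefix()
    {
        if (_toSkip == 0) return;
        byte[] scratch = new byte[(int)Math.Min(_toSkip, 4096)];
        while (_toSkip > 0)
        {
            int read = _inner.Read(scratch, 0, (int)Math.Min(scratch.Length, _toSkip));
            if (read == 0)
            {
                // Inner stream ended inside the prefix: the range is empty.
                _toSkip = 0;
                _remaining = 0;
                return;
            }
            _toSkip -= read;
        }
    }

    public override void Flush() { }
    public override long Seek(long offset, SeekOrigin origin) => throw new NotSupportedException();
    public override void SetLength(long value) => throw new NotSupportedException();
    public override void Write(byte[] buffer, int offset, int count) => throw new NotSupportedException();
}
```

Even this reduced version has to override ten members, which illustrates why the thread also asks for an easier way to author custom streams. A seekable variant could replace the skip loop with `Seek` on the inner stream.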
Honestly, now that I think about it, I think the base Stream and Stream-related types should be replaced with span-based APIs, mainly because those are cleaner, easier to set up and use, and easier to maintain. Pipelines and Channels could also replace Streams in cases where span-based APIs might not fit well. Perhaps the base Stream class could be obsoleted and, in the future, changed to an internal abstract class for the public stream classes (or they could be reworked to be pipeline- or channel-based without renaming or changing behavior). The reasoning is that streams are great, but there is complexity when you need to override a ton of members to do things a certain way (take a look at ZlibStream for example). I get that many use or like streams, but many others like me prefer span-based APIs (if possible), pipelines, or channels, avoiding streams entirely where we can to save time on possible issues in the future. So while there is no intention of removing all of the Stream-based classes from the runtime, perhaps they could be considered to be reworked into calling into And this goes for MemoryStream: I prefer the more performant version (System.Memory<T>), provided I do not need to worry about not knowing the sizes of the things it is used for. |
@AraHaan: I'm not sure what your API suggestion for this would look like. For a lot of things I have also created my own sort of But for me there is one caveat. The stream currently needs to support long as Position and Length. The current |
Twitter survey results
Numbers in
Custom streams:
MemoryStream:
Extensions:
Readers:
My comments
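// Split a large input stream into files of at most MAX_SIZE bytes each.
// SubStream is not an existing BCL type; it is assumed here to take (inner, offset, length).
long bytesRemaining = largeInputStreamSize;
while (bytesRemaining > 0)
{
    long bytesToCopy = Math.Min(MAX_SIZE, bytesRemaining);
    // Expose only the next chunk of the input as its own stream.
    using Stream src = new SubStream(largeInputStream, largeInputStreamSize - bytesRemaining, bytesToCopy);
    using Stream dst = File.Create(GetPath());
    src.CopyTo(dst);
    bytesRemaining -= bytesToCopy;
}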
Suggested actions for .NET 7:
|
@adamsitnik - Could there be a series of new Also a new "Modern" inheritance base class of
Could we also reconcile the role of |
@adamsitnik Regarding Here's a (simplified) snippet of my code illustrating the use case:
using var aes = DataCenter.CreateCipher(options.Key, options.IV);
using var decryptor = aes.CreateDecryptor();
await using (var cryptoStream = new CryptoStream(stream, decryptor, CryptoStreamMode.Read, true))
{
var size = await new DataCenterBinaryReader(cryptoStream).ReadUInt32Async();
await using (var zlibStream = new ZLibStream(cryptoStream, CompressionMode.Decompress, true))
{
var reader = new DataCenterBinaryReader(zlibStream);
await _header.ReadAsync(reader);
await _keys.ReadAsync(reader);
await _attributes.ReadAsync(reader);
await _nodes.ReadAsync(reader);
await _values.ReadAsync(reader);
await _names.ReadAsync(reader);
await _footer.ReadAsync(reader);
if (reader.Progress != size)
throw new InvalidDataException();
}
} |
While developing an IPC program, I am using a shared-memory based stream, which is written from one side and read from the other. While the data is backed by memory, it's necessary to occasionally free memory that has already been read and will never be read again. Because I organize data in frames, this actually happens every time the reader side finishes reading a frame. There seems to be no standard way to do this in I notice that there has been a proposal to wrap non-seekable streams. If it is added, I hope something like this could be added to it, either as new methods on |
Summary
System.IO.Stream has been at the center of data processing and flow in .NET since its inception. Even with newer models being introduced related to streams, like pipelines and channels, it continues to be a key component of many important workloads. We've incrementally improved Stream over the years, e.g. with the introduction of new memory/span-based APIs in .NET Core 2.1, but we haven't revisited it to take a holistic look at what's missing for modern workloads and round out the story.
Investigations
The following investigations can be pursued to inform our designs and approach to the backlog below:
Backlog (roughly in priority order)
The first theme is providing some read-all helpers. These helpers will help address a common scenario that emerged after a .NET 6 breaking change: Partial and zero-byte reads in DeflateStream, GZipStream, and CryptoStream.
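The scenario behind this theme is that a single `Read` call may return fewer bytes than requested, so callers need a loop. A minimal sketch of such a loop follows; the names `ReadAllSketch` and `FillBuffer` are hypothetical and are not the helpers this backlog proposes. Unlike a throwing variant, this one reports a short count at end of stream and leaves the decision to the caller.

```csharp
using System;
using System.IO;

public static class ReadAllSketch
{
    // Hypothetical helper: reads until the buffer is full or the stream ends,
    // and returns the number of bytes actually read.
    public static int FillBuffer(Stream stream, Span<byte> buffer)
    {
        if (stream is null) throw new ArgumentNullException(nameof(stream));

        int total = 0;
        while (total < buffer.Length)
        {
            // Each call may return a partial result; 0 means end of stream.
            int read = stream.Read(buffer.Slice(total));
            if (read == 0) break;
            total += read;
        }
        return total;
    }
}
```

Code that previously assumed one `Read` call on DeflateStream, GZipStream, or CryptoStream fills the whole buffer can route through a loop like this and then compare the return value with the expected length.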
The second theme is to provide improved composability of Stream functionality to reduce the number of scenarios where a custom Stream must be implemented; this will be done through wrappers and/or combinators.
Beyond those themes, there are other new APIs that could be added for common scenarios.