[API Proposal]: make it easier to create custom Stream, TextReader, and TextWriter implementations #64958
Comments
Tagging subscribers to this area: @dotnet/area-system-io
Issue Details
API Proposal
My proposal would be to introduce 3 new classes, StreamBase, TextReaderBase, and TextWriterBase:
namespace System.IO
{
    public abstract class StreamBase : Stream
    {
        protected abstract int ReadInternal(Span<byte> span);
        protected abstract void WriteInternal(ReadOnlySpan<byte> span);
        // Overrides abstract Read(byte[], int, int), Write(byte[], int, int) to call ReadInternal/WriteInternal
        // Overrides abstract Seek(long, SeekOrigin) to leverage Position and Length
        // Overrides Length and SetLength(long) to throw NotSupportedException by default on the premise that
        // relatively few streams support these "in the wild" (is this accurate?)
        // Overrides virtual methods from base class so that:
        // * All sync methods flow through ReadInternal/WriteInternal
        // * All non-Memory async methods flow through the Memory async methods
        // * The Memory async methods call ReadInternal/WriteInternal/Flush synchronously and return a completed Task
        // Need a better name for this. The idea here is to allow framework implementations that were falling back to the
        // "call sync method from a background thread" pattern for compat. Only needed if (a) such implementations
        // exist and (b) there is interest in them leveraging StreamBase.
        protected virtual bool UseThreadPoolForAsyncOverSyncMethods => false;
    }
    public abstract class TextReaderBase : TextReader
    {
        // ignoring the details for now; can revisit if there is interest in the overall idea
    }
    public abstract class TextWriterBase : TextWriter
    {
        // ignoring the details for now; can revisit if there is interest in the overall idea
    }
}
API Usage
class MyStream : StreamBase
{
    public override bool CanRead => true;
    public override bool CanWrite => true;
    public override bool CanSeek => false;
    public override long Position { get => throw new NotImplementedException(); set => throw new NotImplementedException(); }
    public override void Flush() { /* sync flush impl */ }
    protected override int ReadInternal(Span<byte> span) { /* sync read impl */ }
    protected override void WriteInternal(ReadOnlySpan<byte> span) { /* sync write impl */ }
    // optional
    public override Task FlushAsync(CancellationToken cancellationToken) { /* async flush impl */ }
    public override ValueTask<int> ReadAsync(Memory<byte> buffer, CancellationToken cancellationToken = default) { /* async read impl */ }
    public override ValueTask WriteAsync(ReadOnlyMemory<byte> buffer, CancellationToken cancellationToken = default) { /* async write impl */ }
}
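To make the routing in the StreamBase proposal concrete, here is a minimal sketch of how such a base class might forward calls. It is not the proposed implementation: `StreamBaseSketch` and `BufferStream` are hypothetical names invented for this illustration, the sketch assumes .NET 6 or later, and it leaves out the `UseThreadPoolForAsyncOverSyncMethods` switch and the `BeginRead`/`BeginWrite` paths entirely.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for the proposed StreamBase; not a real framework type.
public abstract class StreamBaseSketch : Stream
{
    protected abstract int ReadInternal(Span<byte> span);
    protected abstract void WriteInternal(ReadOnlySpan<byte> span);

    // All sync overloads funnel into ReadInternal/WriteInternal.
    public override int Read(byte[] buffer, int offset, int count) => ReadInternal(buffer.AsSpan(offset, count));
    public override int Read(Span<byte> buffer) => ReadInternal(buffer);
    public override int ReadByte()
    {
        Span<byte> one = stackalloc byte[1]; // avoids a per-call array allocation
        return ReadInternal(one) == 0 ? -1 : one[0];
    }
    public override void Write(byte[] buffer, int offset, int count) => WriteInternal(buffer.AsSpan(offset, count));
    public override void Write(ReadOnlySpan<byte> buffer) => WriteInternal(buffer);
    public override void WriteByte(byte value)
    {
        Span<byte> one = stackalloc byte[1];
        one[0] = value;
        WriteInternal(one);
    }

    // Memory-based async methods run the sync core inline and return completed tasks.
    public override ValueTask<int> ReadAsync(Memory<byte> buffer, CancellationToken cancellationToken = default)
        => cancellationToken.IsCancellationRequested
            ? new ValueTask<int>(Task.FromCanceled<int>(cancellationToken))
            : new ValueTask<int>(ReadInternal(buffer.Span));
    public override ValueTask WriteAsync(ReadOnlyMemory<byte> buffer, CancellationToken cancellationToken = default)
    {
        if (cancellationToken.IsCancellationRequested) return new ValueTask(Task.FromCanceled(cancellationToken));
        WriteInternal(buffer.Span);
        return default;
    }
    public override Task FlushAsync(CancellationToken cancellationToken)
    {
        if (cancellationToken.IsCancellationRequested) return Task.FromCanceled(cancellationToken);
        Flush();
        return Task.CompletedTask;
    }

    // Array-based async methods flow through the Memory-based ones.
    public override Task<int> ReadAsync(byte[] buffer, int offset, int count, CancellationToken cancellationToken)
        => ReadAsync(new Memory<byte>(buffer, offset, count), cancellationToken).AsTask();
    public override Task WriteAsync(byte[] buffer, int offset, int count, CancellationToken cancellationToken)
        => WriteAsync(new ReadOnlyMemory<byte>(buffer, offset, count), cancellationToken).AsTask();

    // Default Seek built on Position and (for SeekOrigin.End) Length.
    public override long Seek(long offset, SeekOrigin origin)
    {
        if (!CanSeek) throw new NotSupportedException();
        long target = origin switch
        {
            SeekOrigin.Begin => offset,
            SeekOrigin.Current => checked(Position + offset),
            SeekOrigin.End => checked(Length + offset),
            _ => throw new ArgumentOutOfRangeException(nameof(origin)),
        };
        Position = target;
        return target;
    }
    public override long Length => throw new NotSupportedException();
    public override void SetLength(long value) => throw new NotSupportedException();
}

// Hypothetical derived stream backed by an in-memory list, used only to exercise the sketch.
public sealed class BufferStream : StreamBaseSketch
{
    private readonly List<byte> _data = new List<byte>();
    private int _position;
    public override bool CanRead => true;
    public override bool CanWrite => true;
    public override bool CanSeek => true;
    public override long Length => _data.Count;
    public override long Position
    {
        get => _position;
        set => _position = (value >= 0 && value <= _data.Count) ? (int)value : throw new ArgumentOutOfRangeException(nameof(value));
    }
    public override void Flush() { }
    protected override int ReadInternal(Span<byte> span)
    {
        int n = Math.Min(span.Length, _data.Count - _position);
        for (int i = 0; i < n; i++) span[i] = _data[_position + i];
        _position += n;
        return n;
    }
    protected override void WriteInternal(ReadOnlySpan<byte> span)
    {
        foreach (byte b in span)
        {
            if (_position < _data.Count) _data[_position] = b; else _data.Add(b);
            _position++;
        }
    }
}
```

With this shape the derived class only writes the two span-based methods plus the capability properties, and every public read or write entry point, sync or async, reaches them without an intermediate copy. Exceptions from the sync core surface synchronously from the async methods here, which a real design would probably want to wrap in a faulted task.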
Related: #58216
@jozkee they seem pretty similar to me. What do you see as the key differences? Could be interesting to hash those out here and see if we can arrive at one design proposal.
On the main proposal, one difference is the Template Method Pattern, which lets the existing methods perform the typical argument validation. In the alternative proposals, you propose that the old methods funnel into the newer methods, but only if a certain opt-in flag is specified, which I think is a good idea, but it is less enforceable. On my issue, the alternative design is still about a new Stream class with new abstract members and the idea of having versioning-like semantics as part of the class name, i.e. NewStream2 : NewStream : Stream. Maybe we can factor out the main proposals and keep the alternative proposals separate.
Would it make sense to close this issue in favor of @jozkee's new proposal?
Background and motivation
Deriving confidently from Stream, TextReader, and TextWriter today is fairly difficult and it is easy to shoot yourself in the foot performance-wise. There are many derivations of these classes throughout the framework and in supporting libraries; not all of them do the "right" thing all the time (e.g. dotnet/aspnetcore#38210).
For example, if we derive from Stream we are presented with the following abstract methods to override:
First off, it seems strange that we need to implement Seek; why can't that have a default implementation based on Position and (for SeekOrigin.End) Length?
Implementing Read(byte[], int, int) and/or Write(byte[], int, int) is pretty straightforward, but we're immediately left with some performance "traps" that will make our stream perform less well than it could.
- ReadByte/WriteByte allocate a new byte array on every call
- Read(Span<byte>)/Write(ReadOnlySpan<byte>) which seem like the "right" thing to call these days actually copy to or from a rented buffer and call the byte[] versions. Same thing with the async versions that use Memory<byte>.
- ReadAsync/WriteAsync just runs the synchronous method in another thread. It also goes through the old BeginRead/BeginWrite methods which are complicated; not sure if they have more overhead than a task-based async implementation would.
- FlushAsync() also just calls the sync method in another thread.
ReadAsync and/or WriteAsync. If we override just the byte[], int, int versions then the Memory versions still copy and any old code that actually calls BeginRead/BeginWrite will still follow the sync-in-another-thread path.
TextWriter/TextReader have mostly all the same issues, but are even more complex because they have more methods to work with. The fact that these route all methods through the single-char methods Write(char) and Read() feels strange, but maybe there is a good reason for this.
I think I understand why things are the way they are (in my understanding it is to preserve compat for people who have overridden these types in the past), but it would be great if we could do better. If this were done in a way that the framework could make use of it, then that would potentially eliminate a lot of fairly redundant code across lots of classes and libraries.
API Proposal
My proposal would be to introduce 3 new classes StreamBase, TextReaderBase, and TextWriterBase to provide "modern" alternatives to the existing base classes. These classes would derive from the existing classes but route between their methods in a different way to simplify derivation:
API Usage
Alternative Designs
An alternate path would be to re-route the calls between methods and provide more virtual overrides in the existing base classes. For compat, this would be opt-in behavior via a constructor parameter or virtual property. The upside is that we avoid introducing new types. Another upside is that framework classes could use the new behavior easily and leverage the toggle to swap back to the old behavior whenever they are derived from. The downside is that it may be difficult for consumers to understand how to leverage the new functionality.
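One way to picture the opt-in alternative is a base class whose routing direction depends on a virtual property. The sketch below is only an assumption about how that could look: `RoutingDemoBase` stands in for the existing base class (it is not `Stream`), and `UseSpanRouting`, `LegacyOnes`, and `ModernOnes` are hypothetical names. In legacy mode the span overload copies through a rented array into the array overload; in opt-in mode the array overload forwards to the span overload with no copy.

```csharp
using System;
using System.Buffers;

// Hypothetical stand-in for an existing base class that gains an opt-in toggle.
public abstract class RoutingDemoBase
{
    // Hypothetical opt-in switch; defaults to the legacy routing for compat.
    protected virtual bool UseSpanRouting => false;

    public virtual int Read(byte[] buffer, int offset, int count)
    {
        // Opt-in mode: the array overload forwards to the span overload.
        if (UseSpanRouting) return Read(buffer.AsSpan(offset, count));
        throw new NotSupportedException("Legacy mode expects Read(byte[], int, int) to be overridden.");
    }

    public virtual int Read(Span<byte> buffer)
    {
        if (UseSpanRouting)
            throw new NotSupportedException("Opt-in mode expects Read(Span<byte>) to be overridden.");

        // Legacy mode: rent an array, call the array overload, copy back.
        byte[] rented = ArrayPool<byte>.Shared.Rent(buffer.Length);
        try
        {
            int n = Read(rented, 0, buffer.Length);
            new ReadOnlySpan<byte>(rented, 0, n).CopyTo(buffer);
            return n;
        }
        finally
        {
            ArrayPool<byte>.Shared.Return(rented);
        }
    }
}

// Existing-style derivation: overrides only the array overload.
public sealed class LegacyOnes : RoutingDemoBase
{
    public override int Read(byte[] buffer, int offset, int count)
    {
        Array.Fill(buffer, (byte)1, offset, count);
        return count;
    }
}

// New-style derivation: opts in and overrides only the span overload.
public sealed class ModernOnes : RoutingDemoBase
{
    protected override bool UseSpanRouting => true;

    public override int Read(Span<byte> buffer)
    {
        buffer.Fill(1);
        return buffer.Length;
    }
}
```

Both derivations behave the same from the caller's side, which is the compat argument for this path; the cost is that a derived type has to know which overload goes with which toggle value, which is the discoverability downside noted above.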
Some other considerations for the general design:
Stream authors should always consider this and make it required instead.
A problem that remains in the current design is that if you do want seek or timeout behavior you have to know about and implement a specific subset of the optional virtual methods. We could unify those into a single method like so:
Risks
Fragmentation might increase complexity/confusion instead of decreasing it, although docs could help with this.