Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HttpClient (using SocketsHttpHandler) may cause unobserved IOException that cannot be caught #104324

Closed
egraff opened this issue Jul 2, 2024 · 3 comments · Fixed by #104335
Closed
Assignees
Milestone

Comments

@egraff
Copy link

egraff commented Jul 2, 2024

Description

We've seen a crash from one of our production applications due to an IOException being emitted to the application's TaskScheduler.UnobservedTaskException event handler. The IOException wraps a SocketException, and is related to the use of an HttpClient to do a GET request. The application is using a vanilla HttpClient (which internally will use a SocketsHttpHandler which in turn uses an HttpConnectionPool), and the call to HttpClient.GetAsync() is protected by a try-catch that catches all exceptions. The unobserved exception came from an internal _readAheadTask that is owned by System.Net.Http.HttpConnection, and the exception did not propagate up the stack back to the caller of HttpClient.GetAsync().

The call stack looks like this:

System.IO.IOException: Unable to read data from the transport connection: An existing connection was forcibly closed by the remote host..
 ---> System.Net.Sockets.SocketException (10054): An existing connection was forcibly closed by the remote host.
   at System.Net.Sockets.Socket.AwaitableSocketAsyncEventArgs.CreateException(SocketError error, Boolean forAsyncThrow)
   at System.Net.Sockets.NetworkStream.ReadAsync(Memory`1 buffer, CancellationToken cancellationToken)
   at System.Net.Http.HttpConnection.PrepareForReuse(Boolean async)
   at System.Net.Http.HttpConnectionPool.TryGetPooledHttp11Connection(HttpRequestMessage request, Boolean async, HttpConnection& connection, HttpConnectionWaiter`1& waiter)
   at System.Net.Http.HttpConnectionPool.SendWithVersionDetectionAndRetryAsync(HttpRequestMessage request, Boolean async, Boolean doRequestAuth, CancellationToken cancellationToken)
   at System.Runtime.CompilerServices.AsyncMethodBuilderCore.Start[TStateMachine](TStateMachine& stateMachine)
   at System.Net.Http.HttpConnectionPool.SendAsync(HttpRequestMessage request, Boolean async, Boolean doRequestAuth, CancellationToken cancellationToken)
   at System.Net.Http.HttpConnectionPoolManager.SendAsyncCore(HttpRequestMessage request, Uri proxyUri, Boolean async, Boolean doRequestAuth, Boolean isProxyConnect, CancellationToken cancellationToken)
   at System.Net.Http.HttpConnectionPoolManager.SendAsync(HttpRequestMessage request, Boolean async, Boolean doRequestAuth, CancellationToken cancellationToken)
   at System.Net.Http.HttpConnectionHandler.SendAsync(HttpRequestMessage request, Boolean async, CancellationToken cancellationToken)
   at System.Net.Http.HttpMessageHandlerStage.SendAsync(HttpRequestMessage request, CancellationToken cancellationToken)
   at System.Net.Http.RedirectHandler.SendAsync(HttpRequestMessage request, Boolean async, CancellationToken cancellationToken)
   at System.Runtime.CompilerServices.AsyncMethodBuilderCore.Start[TStateMachine](TStateMachine& stateMachine)
   at System.Net.Http.RedirectHandler.SendAsync(HttpRequestMessage request, Boolean async, CancellationToken cancellationToken)
   at System.Net.Http.SocketsHttpHandler.SendAsync(HttpRequestMessage request, CancellationToken cancellationToken)
   at System.Net.Http.HttpClient.<SendAsync>g__Core|83_0(HttpRequestMessage request, HttpCompletionOption completionOption, CancellationTokenSource cts, Boolean disposeCts, CancellationTokenSource pendingRequestsCts, CancellationToken originalCancellationToken)
   at System.Runtime.CompilerServices.AsyncMethodBuilderCore.Start[TStateMachine](TStateMachine& stateMachine)
   at System.Net.Http.HttpClient.GetAsync(Uri requestUri)

Note that the _readAheadTask is created by HttpConnection.PrepareForReuse(Boolean async), where there is a sync-async disassociation (PrepareForReuse() is synchronous, but NetworkStream.ReadAsync() is async, and it is the returned ValueTask from NetworkStream.ReadAsync() that is assigned to _readAheadTask). The problem is that for this particular issue and code path, _readAheadTask was not awaited or checked for completion in a way that would observe any (possible) exception, so that exception was later emitted to TaskScheduler.UnobservedTaskException after the ValueTask was garbage collected.

Reproduction Steps

I have tried to create a stress test to reproduce the issue, but have so far been unsuccessful, because it is very difficult to recreate socket/transport problems when dealing with real sockets, and it is not possible to simulate as the socket cannot be mocked out. The exception is likely caused by either a transient network problem, or a connection timing out in just the right way or at just the right time to cause an exception when NetworkStream.ReadAsync() is called by PrepareForReuse().

However, I do believe I see a problem with the implementation of PrepareForReuse() that could explain why we end up with an unobserved exception. This is described further down, under "other information".

Expected behavior

The exception should be observed by the internal implementation. Either by logging and forgetting the task exceptions – there is already another code path in HttpConnection that does this, see

if (_readAheadTask != default)
{
Debug.Assert(_readAheadTaskStatus == ReadAheadTask_CompletionReserved);
LogExceptions(_readAheadTask.AsTask());
}

– or by propagating the exception back to the caller (depending on whether the exception occurs in context of a stack frame where it is appropriate and/or possible to do this).

There should be no unobserved exceptions.

Actual behavior

An unobserved internal socket exception might be emitted to TaskScheduler.UnobservedTaskException when garbage collected, which may cause an application to crash.

Regression?

No response

Known Workarounds

No known workarounds.

Configuration

  • Seen on .NET 8.0.3
  • The application was running on Windows Server 2019.
  • The code was compiled for x64

Other information

After some investigation, this is my guess as to what is causing the problem:

The method TryGetPooledHttp11Connection() in HttpConnectionPool contains this code:

if (!connection.PrepareForReuse(async))
{
if (NetEventSource.Log.IsEnabled()) connection.Trace("Found invalid HTTP/1.1 connection in pool.");
connection.Dispose();
continue;
}

where the relevant bit of PrepareForReuse() from HttpConnection is this:

Debug.Assert(_readAheadTaskStatus == ReadAheadTask_NotStarted);
_readAheadTaskStatus = ReadAheadTask_CompletionReserved;
// Perform an async read on the stream, since we're going to need to read from it
// anyway, and in doing so we can avoid the extra syscall.
try
{
#pragma warning disable CA2012 // we're very careful to ensure the ValueTask is only consumed once, even though it's stored into a field
_readAheadTask = _stream.ReadAsync(_readBuffer.AvailableMemory);
#pragma warning restore CA2012
return !_readAheadTask.IsCompleted;
}
catch (Exception error)
{
// If reading throws, eat the error and don't reuse the connection.
if (NetEventSource.Log.IsEnabled()) Trace($"Error performing read ahead: {error}");
return false;
}

Note that if PrepareForReuse() returns false, then the HttpConnection is disposed (and note also that there is no code in HttpConnection.Dispose() that would check the _readAheadTask for any exceptions). And for this code path, PrepareForReuse() would return false if _readAheadTask.IsCompleted is true. For the happy path, this is not a problem, but note that Task.IsCompleted may imply Task.IsFaulted. So if the call to NetworkStream.ReadAsync() completes synchronously and returns a Task that is in a faulted state, then the associated exception would be unobserved.

At the moment I cannot prove that this is the exact scenario that has happened in my case, but it seems likely that is could be. If it is possible (and I do believe it is possible) for NetworkStream.ReadAsync() to synchronously return a task that is in a faulted state (because of e.g. an IOException), then it seems the code in HttpConnectionPool and HttpConnection does not properly take this into account.

As for why I think it is possible for ReadAsync() to return a task that is already in a faulted state, consider this code:

ValueTask.FromException<int>(CreateException(error));

@dotnet-policy-service dotnet-policy-service bot added the untriaged New issue has not been triaged by the area owner label Jul 2, 2024
Copy link
Contributor

Tagging subscribers to this area: @dotnet/ncl
See info in area-owners.md if you want to be subscribed.

@wfurt
Copy link
Member

wfurt commented Jul 2, 2024

probably related to #103257, cc @MihaZupan
If you can reproduce it could you try 9.0 daily build (or next preview) @egraff ?

@MihaZupan MihaZupan added the bug label Jul 2, 2024
@MihaZupan MihaZupan added this to the 9.0.0 milestone Jul 2, 2024
@MihaZupan MihaZupan removed the untriaged New issue has not been triaged by the area owner label Jul 2, 2024
@MihaZupan
Copy link
Member

We had a similar issue before 8.0, where we could leak unobserved task exceptions if the _readAheadTask was started via the connection pool scavenging (#61256).
That particular issue was fixed by #80214, but while I was focused on the scavenging part, we now have a similar issue for the case where the _readAheadTask was started via PrepareForReuse (normally it'd just be a boring EOF, but it could be an exception, in which case we should observe it).

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants