-
Notifications
You must be signed in to change notification settings - Fork 4.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
HTTP2: Client and server disagree on active stream count #1586
Comments
From @jkotalik on Saturday, August 24, 2019 7:45:51 PM So dotnet/aspnetcore#12704 was supposed to fix this issue. There may be a case we aren't covering. |
From @Tratcher on Saturday, August 24, 2019 8:10:53 PM Indeed. dotnet/corefx#12704 was focused on the server terminating the request, but the new repro is focused on client aborts. A quick look through the code shows we did account for client RSTs, but maybe we missed something else. |
From @JamesNK on Monday, August 26, 2019 11:07:11 PM This is a bug but not a critical one. 3.1 fix? |
From @Tratcher on Wednesday, August 28, 2019 10:49:15 PM Theory: The client has a race where it can cancel a stream, decrement its count, but not send a RST_STREAM. The server still considers the stream open and you exceed the stream limit. Repro. I set the connection limit to 1, send two requests, and cancel one of them (the delay=100 request).
AppContext.SetSwitch("System.Net.Http.SocketsHttpHandler.Http2UnencryptedSupport", true);
using var client = new HttpClient
{
BaseAddress = new Uri("http://localhost:5005"),
DefaultRequestVersion = HttpVersion.Version20
};
for (int i = 0; i < 1000; i++)
{
// Send a second one that will cancel
var r2Task = client.GetAsync("?delay=100", new CancellationTokenSource(50).Token);
// Send one request normally
var r1Task = client.GetAsync("?delay=0");
try
{
var r1 = await r1Task;
r1.EnsureSuccessStatusCode();
}
catch (Exception ex)
{
Console.WriteLine("R1 fail: " + ex.Message);
}
try
{
var r2 = await r2Task;
r2.EnsureSuccessStatusCode();
}
catch (OperationCanceledException)
{
Console.WriteLine("R2 canceled");
}
catch (Exception ex)
{
Console.WriteLine("R2 fail: " + ex);
}
} app.Run(async context =>
{
var delay = context.Request.Query["delay"].ToString();
if (!string.IsNullOrEmpty(delay))
{
await Task.Delay(int.Parse(delay));
}
await context.Response.WriteAsync($"Hello World! {context.Request.Protocol} {delay}");
}); |
@Tratcher In your repro, when you say "I set the connection limit to 1" what does this mean? I'm assuming you are configuring Kestrel to set MAX_CONCURRENT_STREAMS to 1, correct? How do you do this? |
Yeah, it looks like there is a race here, introduced by the fix for dotnet/corefx#40180. When the request is cancelled, and the request body is already complete (as is the case here, since it's a GET), we are now decrementing the stream count before we send the RST_STREAM. This means there's a window where if another request is waiting on stream count, it can start sending before the RST_STREAM is sent. See here: https://github.com/dotnet/corefx/blob/master/src/System.Net.Http/src/System/Net/Http/SocketsHttpHandler/Http2Stream.cs#L377 We need to defer the call to Complete until after we send the RST_STREAM. |
There are two other places we call SendReset as well, aside from the one I linked to above. All of these have the same issue and will need to be fixed. Basically, whenever we call SendReset, we should avoid calling Complete before we do so, and then call it after the SendReset instead. |
Yes, the streams per connection limit james linked to. |
FWIW this reproduces immediately once I constrain kestrel MaxStreamsPerConnections in the corefx stress suite. |
@geoffkizer Will you be able to get a fix into master and then release/3.0? The fix for dotnet/corefx#40180 already went into release/3.0. So, if that fix regresses something then we need to fix the fix. |
Sure, I will put together a fix for it. @JamesNK Can you send me the original repro? |
|
@JamesNK Thanks. How long does it usually take to repro? |
I just ran it and it repoed almost instantly.
|
When I try to run the repro, I get: Unhandled exception. System.Net.Http.HttpRequestException: The SSL connection could not be established, see inner exception. |
New documentation Hot Off the Press just for your problem: |
It looks like the client is already doing that, but I'm still getting an SSL error: HttpClientHandler handler = new HttpClientHandler();
handler.ServerCertificateCustomValidationCallback = delegate { return true; };
var httpClient = new HttpClient(handler)
{
BaseAddress = new Uri(url),
}; |
Hmmm, I don't know then. Just to get this sample working, did you try trusting the ASP.NET Core certificate? Is there a bug in HttpClient around ignoring invalid certificates in HTTP/2? |
Not that I'm aware of. Our tests do this.
How do I do that? |
It's in the documentation I linked. |
Hmm, where do you see that code? The only place I see HttpClient created is in GetClientFactory, Program.cs, line 259:
|
Yeah, you are right. I had an old editor window open and got confused and looked at the wrong code for GrpcGreeterClient. Apologies for the confusion... |
I can now get it to fail in about 5-10 seconds, but what I'm seeing is different that what you describe above: Unhandled exception. System.Net.Http.HttpRequestException: An error occurred while sending the request. |
That's what you get when too many requests are cancelled but the app hasn't had a chance to finish cleaning them up yet. We have another thread discussing that. Did you try the repro code I gave above? It avoided that issue. https://github.com/dotnet/corefx/issues/40663#issuecomment-525953428 |
@geoffkizer are you actively working on it? |
I have the fix ready. Still trying to validate it using stress and confirm that we're no longer seeing this. I'll aim to get a PR out relatively soon. |
In case it's useful to you for this issue, I had one of the ASP.NET Core tests (as of dotnet/aspnetcore@5669e5e) fail locally with this error:
|
Moving to 6.0 and unassigning. |
Update 2: Update: --- Original post --- I still have this error also with
I'm using |
From @JamesNK on Saturday, August 24, 2019 2:24:08 AM
Issue found during HTTP2 stress testing. One HttpClient is sending multiple requests in parallel, then canceling them via RST_STREAM sent to the server. For an unknown reason the server and client disagree on the active stream count, and the sever eventually sends RST_STREAM (REFUSED_STREAM) and GOAWAY (STREAM_CLOSED). This issue eventually leads to dotnet/aspnetcore#13405.
Client and server should agree on the active stream count and the client should wait to send new requests, without error.
Wireshark log: https://www.dropbox.com/s/5ndspth4t0qjyj6/clientreset-error.zip?dl=0
Exception on the client:
Repo is currently not public. Contact @JamesNK or @Tratcher for the repo source code and instruction.
Copied from original issue: dotnet/aspnetcore#13406
The text was updated successfully, but these errors were encountered: