-
Hi all, I will preface this by saying I am by no means an expert in load testing or .NET web server configuration, and I doubt this is an issue with YARP specifically. Hoping you can shed some light on the situation and guide me in the right direction, please.

Issue

We are performing load testing to ensure our YARP deployment can handle the request throughput we require. For our load test, we have YARP running on .NET 7 in a Docker container, proxying to a dummy .NET 7 Kestrel web server with a single GET endpoint that delays for 1 second before returning the current datetime.

Under low-load scenarios, we see that adding YARP into the request pipeline adds up to a few tens of milliseconds of latency, so the overall request takes just over 1 second. We have added an HttpClientTelemetryConsumer and a ForwarderTelemetryConsumer and are happy to see logs being made at each stage of the request (represented by the dots on the root span).

However, under a certain load, response times begin to climb towards 3 seconds, so approximately ~2 seconds of latency is added by the proxy (or perhaps more precisely, with the proxy in the request pipeline). What is interesting is that the last log we see indicates OnRequestStart is being called in HttpClientTelemetryConsumer (confusingly, within the first …). This query measures the duration of the outgoing HTTP client call from YARP - the bottom graph depicts the climb from 1s to ~2.6s.

What we've tried

We are using AWS and have experimented with scaling either horizontally or vertically. We've tested with 12 servers with 0.25 vCPU and 1GB of memory and seen no issues. We then tested with 6 servers with 2 vCPU and 4GB of memory, resulting in the ~2 seconds of additional latency I mentioned earlier.

These results seem like a clue to me: despite upping the hardware by quite a bit, there is some resource that is more plentiful with 12 boxes. It also appears that the proxy's CPU and memory utilisation are OK - maybe there are bottlenecks I haven't seen in our AWS metrics, such as running out of CPU threads or hitting some connection limit. We think we have ruled the dummy web server out by: …
At this stage, our next step will be to try and get some HTTP metrics out. We want to inspect the queue length, as we noticed there is a limit of 1024 requests, and the first log we don't see is for … Any advice on how to rule YARP in or out, or how to configure YARP to solve this, would be greatly appreciated! Thank you in advance.
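For reference, the dummy backend is essentially just this (a rough sketch; the endpoint path is a placeholder rather than our real one):

```csharp
var builder = WebApplication.CreateBuilder(args);
var app = builder.Build();

// Single GET endpoint that waits 1 second and then returns the current datetime,
// simulating a slow upstream dependency.
app.MapGet("/time", async () =>
{
    await Task.Delay(TimeSpan.FromSeconds(1));
    return Results.Ok(DateTime.UtcNow);
});

app.Run();
```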
-
I should also add that I don't think we have any rate limiting applied to our routes (no rate limiting appears to be the default), though I am struggling to explicitly disable it and test.
I've raised an issue here.
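For reference, what I've been trying (based on my reading of the YARP rate-limiting docs - I haven't confirmed it behaves as expected yet) is setting the route's RateLimiterPolicy to the special "Disable" value:

```csharp
new RouteConfig
{
    RouteId = "exampleRoute",
    ClusterId = "exampleCluster",
    // Special value that should opt this route out of rate limiting entirely,
    // per my reading of the YARP rate-limiting docs.
    RateLimiterPolicy = "Disable",
    Match = new RouteMatch
    {
        Hosts = new[] { "exampleloadtestdomain.example.com" }
    }
}
```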
-
What does your YARP configuration look like - are you using the configuration file / direct forwarding, any custom client configuration, etc.? You mentioned seeing the …
-
It looks like we are using HTTP/2. I don't know much about this but it sounds like HTTP/2 is more performant.
Thanks, that is good to know.
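(As an aside, if we end up wanting to compare against HTTP/1.1, my understanding is that the outgoing version can be pinned per cluster via ForwarderRequestConfig - a rough sketch I haven't tried yet, reusing the same ActivityTimeout as below:)

```csharp
HttpRequest = new ForwarderRequestConfig
{
    ActivityTimeout = TimeSpan.FromSeconds(900),
    // Pin the outgoing request to HTTP/1.1 so we can compare behaviour against HTTP/2.
    Version = HttpVersion.Version11,
    VersionPolicy = HttpVersionPolicy.RequestVersionExact
}
```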
Sure, our config looks like this.

Cluster config - I've trimmed it down a bit to share here, but we actually have 5 routes and 5 clusters (that only differ in name and target). Only one route & cluster is being invoked during the test:

public static IReverseProxyBuilder LoadProxyConfig(this IReverseProxyBuilder builder)
{
    var clusterId = "exampleCluster";
    builder.LoadFromMemory(new[]
        {
            new RouteConfig
            {
                RouteId = "exampleRoute",
                ClusterId = clusterId,
                Match = new RouteMatch
                {
                    Hosts = new [] { "exampleloadtestdomain.example.com" },
                }
            }
        },
        new[]
        {
            new ClusterConfig
            {
                ClusterId = clusterId,
                HttpRequest = new ForwarderRequestConfig() { ActivityTimeout = TimeSpan.FromSeconds(900) }, // long timeout to match one proxied system's timeout
                Destinations = new Dictionary<string, DestinationConfig>(StringComparer.OrdinalIgnoreCase)
                {
                    { "destination1", new DestinationConfig() { Address = "www.example.com" } }
                }
            }
        });

    return builder;
}

and the CustomForwarderHttpClientFactory implementation:

/**
* ref: https://microsoft.github.io/reverse-proxy/articles/http-client-config.html#custom-iforwarderhttpclientfactory
*/
public class CustomForwarderHttpClientFactory : IForwarderHttpClientFactory
{
    public HttpMessageInvoker CreateClient(ForwarderHttpClientContext context)
    {
        var handler = new SocketsHttpHandler
        {
            UseProxy = false,
            AllowAutoRedirect = false,
            AutomaticDecompression = DecompressionMethods.None,
            UseCookies = false,
            ActivityHeadersPropagator = new ReverseProxyPropagator(DistributedContextPropagator.Current),
            ResponseHeaderEncodingSelector = (_, _) => Encoding.UTF8
        };
        var invoker = new HttpMessageInvoker(handler, disposeHandler: true);
        return invoker;
    }
}
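For completeness, this is roughly how we'd expect the two pieces to be wired up, following the docs linked above - a simplified sketch rather than our exact startup code:

```csharp
var builder = WebApplication.CreateBuilder(args);

// Use our custom factory instead of YARP's default HttpMessageInvoker factory.
builder.Services.AddSingleton<IForwarderHttpClientFactory, CustomForwarderHttpClientFactory>();

// Load the in-memory route/cluster config shown above.
builder.Services.AddReverseProxy().LoadProxyConfig();

var app = builder.Build();
app.MapReverseProxy();
app.Run();
```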
Debugging locally I can see that … As for the …, I have actually found a couple of requests that do have logs at the end (but not the start...)! So something is definitely off, with data missing from our OpenTelemetry back end for some reason. Anyway, this might be useful to look at!

PerRequestMetrics: {
"StartTime": "2023-10-17T02:19:22.0820696Z",
"RouteInvokeOffset": 0.1638,
"ProxyStartOffset": 0.1685,
"HttpRequestStartOffset": 1.1,
"HttpConnectionEstablishedOffset": 0,
"HttpRequestLeftQueueOffset": 1109.6377,
"HttpRequestHeadersStartOffset": 1110.1027,
"HttpRequestHeadersStopOffset": 1110.2028,
"HttpRequestContentStartOffset": 0,
"HttpRequestContentStopOffset": 0,
"HttpResponseHeadersStartOffset": 1110.22,
"HttpResponseHeadersStopOffset": 2114.195,
"HttpResponseContentStopOffset": 2114.2544,
"HttpRequestStopOffset": 2114.211,
"ProxyStopOffset": 2114.26,
"Error": 0,
"RequestBodyLength": 0,
"ResponseBodyLength": 69,
"RequestContentIops": 0,
"ResponseContentIops": 2,
"DestinationId": "destination1",
"ClusterId": "adminCluster",
"RouteId": "adminRoute"
}

Request 2 - this one has a slightly different distribution of where waits might be happening?:

PerRequestMetrics: {
"StartTime": "2023-10-17T02:19:00.6066264Z",
"RouteInvokeOffset": 0.5633,
"ProxyStartOffset": 0.5675,
"HttpRequestStartOffset": 269.0689,
"HttpConnectionEstablishedOffset": 0,
"HttpRequestLeftQueueOffset": 1085.0881,
"HttpRequestHeadersStartOffset": 1085.097,
"HttpRequestHeadersStopOffset": 1089.8484,
"HttpRequestContentStartOffset": 0,
"HttpRequestContentStopOffset": 0,
"HttpResponseHeadersStartOffset": 1089.8533,
"HttpResponseHeadersStopOffset": 2093.7458,
"HttpResponseContentStopOffset": 2574.9463,
"HttpRequestStopOffset": 2473.7656,
"ProxyStopOffset": 2574.9622,
"Error": 0,
"RequestBodyLength": 0,
"ResponseBodyLength": 69,
"RequestContentIops": 0,
"ResponseContentIops": 2,
"DestinationId": "destination1",
"ClusterId": "adminCluster",
"RouteId": "adminRoute"
}
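If I'm reading these offsets correctly (all values are milliseconds from StartTime), most of the extra latency is spent queued before the request is actually sent. A quick sanity check on Request 1's numbers:

```csharp
// Hypothetical sanity check using the offsets copied from Request 1's JSON above (in ms).
double httpRequestStart = 1.1;
double httpRequestLeftQueue = 1109.6377;
double responseHeadersStart = 1110.22;
double responseHeadersStop = 2114.195;

double queuedForConnection = httpRequestLeftQueue - httpRequestStart;  // ~1108 ms waiting for a connection/stream
double waitingOnBackend = responseHeadersStop - responseHeadersStart;  // ~1004 ms, i.e. the backend's deliberate 1 s delay

Console.WriteLine($"Queued: {queuedForConnection:F0} ms, backend: {waitingOnBackend:F0} ms");
```

Request 2 shows the same shape, with roughly 816 ms of queuing (1085.1 - 269.1). So the added latency looks like outbound queuing rather than the proxying itself being slow.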
I will have a go at digging these out, as well as enabling DEBUG logging to see what comes up. Thanks for the examples and suggestions, Miha - it is greatly appreciated.
-
This post has been really enlightening to me: dotnet/runtime#35088
-
OK, I think I am onto something here...

To recap, we have 6 proxies proxying to 12 mock servers via an AWS Application Load Balancer. Since we are seeing request queuing, we know that MAX_CONCURRENT_STREAMS is being reached. It seems like the default would be 100, but I couldn't find a better reference than this, unfortunately. I could see that Kestrel's Limits.Http2.MaxStreamsPerConnection defaults to 100, but I think that is inbound.

We have 12 mock servers, so what gives? Shouldn't we theoretically have 6 (proxies) * 100 (streams) * 12 (mock targets) = 7,200 concurrent requests without queuing? Well, because our back end target is just an AWS Application Load Balancer, it is considered the same server, despite having 12 servers behind the scenes. Each of our proxies has 1 connection and is maxing out the 100 streams within it. The …

I am yet to dig into why we are only achieving 1300 requests/second despite a target concurrency of 3000/second. Will respond when I know more. I'm also yet to understand the implications of enabling this setting, and whether it is even relevant/required for production, where we will have several target load balancers rather than just the one in this fabricated scenario.

Thank you! Feeling quite relieved here and nearly certain this isn't a YARP issue.
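For anyone following along, the change we're planning to test is simply enabling multiple HTTP/2 connections on the SocketsHttpHandler in our custom factory, so the handler can open extra connections to the (single) load balancer origin once it hits MAX_CONCURRENT_STREAMS. A sketch of the updated factory - not yet validated under load:

```csharp
public class CustomForwarderHttpClientFactory : IForwarderHttpClientFactory
{
    public HttpMessageInvoker CreateClient(ForwarderHttpClientContext context)
    {
        var handler = new SocketsHttpHandler
        {
            UseProxy = false,
            AllowAutoRedirect = false,
            AutomaticDecompression = DecompressionMethods.None,
            UseCookies = false,
            // Allow more than one HTTP/2 connection per origin, so requests don't
            // queue behind the server's MAX_CONCURRENT_STREAMS limit (commonly 100).
            EnableMultipleHttp2Connections = true,
            ActivityHeadersPropagator = new ReverseProxyPropagator(DistributedContextPropagator.Current),
            ResponseHeaderEncodingSelector = (_, _) => Encoding.UTF8
        };

        return new HttpMessageInvoker(handler, disposeHandler: true);
    }
}
```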