Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Efficient telemetry for requests that are rejected at Kestrel layer. #33949

Closed
avparuch opened this issue Jun 29, 2021 · 9 comments · Fixed by #34783
Closed

Efficient telemetry for requests that are rejected at Kestrel layer. #33949

avparuch opened this issue Jun 29, 2021 · 9 comments · Fixed by #34783
Assignees
Labels
area-networking Includes servers, yarp, json patch, bedrock, websockets, http client factory, and http abstractions feature-kestrel partner Partner ask
Milestone

Comments

@avparuch
Copy link
Contributor

avparuch commented Jun 29, 2021

When running on Kestrel, requests typically fall into two buckets:

  1. Requests that pass through Kestrel and reach the application middleware.
  2. Requests that are rejected within Kestrel for various reasons and never reach the application layer.

This is a feature request to enable efficient telemetry/logging for the 2nd category to increase observability into such failures.

Let's take a look at IIS/Http.Sys's approach to such failures:

In the case of Http.Sys, from https://learning.oreilly.com/library/view/internet-information-services/9780735624412/ch15s06.html, "HTTPERR log records all errors that are not handed off to a valid worker process, typically responses to clients, connection time-outs, and orphaned requests." In other words, requests that are not serviced by the application/rejected at the Http.Sys layer are logged.

Schema of a typical Http.Sys error log looks like this : https://docs.microsoft.com/en-us/troubleshoot/aspnet/error-logging-http-apis#format-of-the-http-api-error-logs

Similarly, in IIS, it is possible to configure FailedRequestTracing to log requests that fail in IIS layer and never reach the application middleware : https://www.jeremymorgan.com/blog/iis-devops/failed-request-tracing-iis/

A concrete example of where such logging is useful:

When Kestrel receives an authority form request where the request start line and host header are mismatched, Kestrel throws the following exception and returns a 400.0 (Stacktrace shows "google.com because that's the Host header I sent, point is valid for any mismatch). This request does not enter the application middleware and there is no observability for this error (unless Trace logging is turned on).

Microsoft.AspNetCore.Server.Kestrel.Core.BadHttpRequestException: Invalid Host header: 'google.com'
   at Microsoft.AspNetCore.Server.Kestrel.Core.KestrelBadHttpRequestException.Throw(RequestRejectionReason reason, String detail)
   at Microsoft.AspNetCore.Server.Kestrel.Core.Internal.Http.Http1Connection.ValidateNonOriginHostHeader(String hostText)
   at Microsoft.AspNetCore.Server.Kestrel.Core.Internal.Http.Http1Connection.EnsureHostHeaderExists()
   at Microsoft.AspNetCore.Server.Kestrel.Core.Internal.Http.Http1Connection.TryParseRequest(ReadResult result, Boolean& endConnection)
   at Microsoft.AspNetCore.Server.Kestrel.Core.Internal.Http.HttpProtocol.ProcessRequests[TContext](IHttpApplication`1 application)
   at Microsoft.AspNetCore.Server.Kestrel.Core.Internal.Http.HttpProtocol.ProcessRequestsAsync[TContext](IHttpApplication`1 application)

Specifically, this particular error is coming from this line : https://github.com/dotnet/aspnetcore/blob/be5ba495b4872fa055385463da655ae4c4318898/src/Servers/Kestrel/Core/src/Internal/Http/Http1Connection.cs#L606-607

Repro:

  1. Download ASP.NET Core sample app and fire up Kestrel in Visual Studio.

  2. Use any client (I used wrk2), send a authority form request to Kestrel with mismatched HostHeader and request start line as shown below.
    snip_400

  3. Your client will observe a 400.0 Status Code, but on the Server side, since the request does not enter the application middleware, there is no logging of this error.

The above is just one example, there are other examples (please let me know if you'd want me to provide more examples) where requests are terminated/rejected within Kestrel code and not handed off to application middleware.

The ask is to have functionality, where requests that are not serviced by the application middleware are logged efficiently. This telemetry will be hugely beneficial to critical large-scale distributed services.

@joaocpaiva
Copy link

This feature would be useful for multiple services. Having visibility in the logs pipeline to requests that are rejected by Kestrel, is very important for telemetry, debuggability and diagnostics.

While migrating to Kestrel, we found a few regressions (e.g. headers unsupported encoding) that were not detected ahead of time because there were no signs of them in the logs.

@halter73
Copy link
Member

halter73 commented Jun 30, 2021

Do you want an EventSource event(s), specific log event(s) (or category) or something else? Would we want events for anything that looks like requests but don't get to middleware for whatever reason (e.g. invalid host header, unexpected character when parsing, etc...)? Do you want something specific for just invalid host headers?

We do currently have the ConnectionBadRequest log for all invalid requests on an otherwise healthy connection (Note: Middleware might still have run for a something logged as ConnectionBadRequest if there was an invalid request body. For anything earlier, middleware is not run.), and RequestProcessingError for I/O errors during request processing. Starting in .NET 6, both will be under the new Microsoft.AspNetCore.Server.Kestrel.BadRequests logger category.

We could consider trying to formalize Kestrel's internal RequestRejectionReasons as part of this but I'm hesitant given we've gotten by without doing so this far. A lot of the reasons are specific to a certain versions of HTTP and not something you'd ever expect to see from non-buggy HTTP clients. We've found that logging a message describing the problem is sufficient for most use cases. I don't know why any app-level logic would care about things like whether the client sent a BadChunkSuffix or BadChunkSizeData. This might be interesting for diagnostic purposes, but not much else.

In any case, what details would you want for these events? The less specific the event, the harder it will be to always provide details, but maybe some of the details could be optional.

@avparuch
Copy link
Contributor Author

avparuch commented Jun 30, 2021

@halter73 EventSource event or a callback would be great. Regarding the schema for these error events, the Http.sys error log schema is a pretty good starting point. https://docs.microsoft.com/en-us/troubleshoot/aspnet/error-logging-http-apis#format-of-the-http-api-error-logs

Any requests that are not handed off to the worker process (for various reasons) are all logged in the above schema. Most of the schema in the above link is self-explanatory, except for "Reason phrase". (possible values for "Reason phrase" are listed below, (not exhaustive)) :

Verb
Connection_Dropped
Timer_ConnectionIdle
BadRequest
NotFound
Timer_EntityBody
Timer_MinBytesPerSecond
URL
Header
Client_Reset
Connection_Abandoned_By_ReqQueue
Hostname
RequestLength
Forbidden
Timer_HeaderWait
URL_Length
LengthRequired
Version_N/S
Timer_SslRenegotiation
N/A
Request_Cancelled
N/I
Disabled
Number
Connection_Dropped_List_Full
QueueFull

Link to reasons and the explanation : https://docs.microsoft.com/en-us/troubleshoot/aspnet/error-logging-http-apis#kinds-of-errors-that-the-http-api-logs

The Http.Sys error schema does not translate as-is to Kestrel obviously, but I believe the below schema would contain what I'm looking for in a error log for requests/connections that are not handed off to the application middleware. As you mentioned, for some of the requests (connections), these could be well be optional/empty.

TimeStamp, ClientIP, Client Port, ServerIP, Server Port, Host Header, Verb, Protocol, Uri, StatusCode, Reason.

@shirhatti
Copy link
Contributor

Based on our offline discussion:

Let's look at emitting new events from KestrelEventSource for failure conditions. We'll introduce a new DTO for logging that'll have the fields based on @avparuch's suggestion

TimeStamp, ClientIP, Client Port, ServerIP, Server Port, Server Name (from SNI), Host Header, Verb, Protocol, Uri, StatusCode, Reason.

  • Reason phrase is probably just the stringified version of the RequestRejectionReason enum
  • Fields in the DTO may be null (or have a default/sentinel value).
  • We'll re-emit all the fields we emit in the ConnectionStart event.
  • We may have duplication between Events raised here and those observed by middleware. We're not making any efforts to de-dupe.

Open questions:

  • Is adding an extra parameter to the WriteEvent call considered breaking? We'd rely on this behavior to add more info to logs in the future
  • How would the W3CLogger consume this? Via a custom EventListener?

Given that this doesn't require new API surface, we can consider doing this in 6.0 after the preview7 release.

@halter73
Copy link
Member

halter73 commented Jul 1, 2021

What do we want to name the event? RequestRejected?

  • We may have duplication between Events raised here and those observed by middleware. We're not making any efforts to de-dupe.

I think this is okay, but I also think that only raising these invents if we don't invoke middleware is okay. The only difference is whether we raise the event for request body errors. Only raising the event for requests that are rejected before reaching middleware might make it easier to consume from something like W3CLogger.

@JamesNK
Copy link
Member

JamesNK commented Jul 1, 2021

What does DTO mean here? EventSource only supports primitive types. We will just have a list of primitive type arguments for the event source method.

Reason phrase is probably just the stringified version of the RequestRejectionReason enum

As long as it has enough information to communicate to the log reader about why the request was rejected. Might need a field for additional detail (which I'm sure will be ripe for abuse over the years from now 😄).

@JamesNK
Copy link
Member

JamesNK commented Jul 1, 2021

If a request makes it to middleware, does that mean it is accepted? So would that mean this event would only fire for requests rejected before middleware?

Also, if a request makes it to middleware then it is simple enough to write some middleware that catches any exceptions and logs them. An event like this isn't as important once the request has made it that far.

@Tratcher
Copy link
Member

Tratcher commented Jul 1, 2021

What do we want to name the event? RequestRejected?

Or more general BadRequest?

If a request makes it to middleware, does that mean it is accepted? So would that mean this event would only fire for requests rejected before middleware?

Also, if a request makes it to middleware then it is simple enough to write some middleware that catches any exceptions and logs them. An event like this isn't as important once the request has made it that far.

A request can make it to middleware and still hit a few bad request conditions like bad chunked formatting in the body. Yes those should be reported to the application as an Exception from Stream.ReadAsync, though intercepting that at all points in the app might require wrapping the stream.

@BrennanConroy
Copy link
Member

Triage: We can add a rejected reason phrase with failure details to a new feature and then pass the feature through a DiagnosticSource

@BrennanConroy BrennanConroy added this to the 6.0-rc1 milestone Jul 7, 2021
@ghost ghost locked as resolved and limited conversation to collaborators Sep 9, 2021
@amcasey amcasey added area-networking Includes servers, yarp, json patch, bedrock, websockets, http client factory, and http abstractions and removed area-runtime labels Jun 2, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-networking Includes servers, yarp, json patch, bedrock, websockets, http client factory, and http abstractions feature-kestrel partner Partner ask
Projects
None yet
Development

Successfully merging a pull request may close this issue.

9 participants