Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add request metrics to System.Net.Http #85447

Closed
wants to merge 4 commits into from
Closed
Show file tree
Hide file tree
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
4 changes: 4 additions & 0 deletions src/libraries/System.Net.Http/ref/System.Net.Http.cs
Original file line number Diff line number Diff line change
Expand Up @@ -214,6 +214,9 @@ public partial class HttpMessageInvoker : System.IDisposable
public HttpMessageInvoker(System.Net.Http.HttpMessageHandler handler) { }
public HttpMessageInvoker(System.Net.Http.HttpMessageHandler handler, bool disposeHandler) { }
public void Dispose() { }
#pragma warning disable CS3003 // Type is not CLS-compliant
public System.Diagnostics.Metrics.Meter Meter { get { throw null; } set { } }
#pragma warning restore CS3003 // Type is not CLS-compliant
protected virtual void Dispose(bool disposing) { }
[System.Runtime.Versioning.UnsupportedOSPlatformAttribute("browser")]
public virtual System.Net.Http.HttpResponseMessage Send(System.Net.Http.HttpRequestMessage request, System.Threading.CancellationToken cancellationToken) { throw null; }
Expand Down Expand Up @@ -263,6 +266,7 @@ public HttpRequestMessage(System.Net.Http.HttpMethod method, System.Uri? request
[System.ObsoleteAttribute("HttpRequestMessage.Properties has been deprecated. Use Options instead.")]
public System.Collections.Generic.IDictionary<string, object?> Properties { get { throw null; } }
public HttpRequestOptions Options { get { throw null; } }
public ICollection<KeyValuePair<string, object?>> MetricsTags { get { throw null; } }
public System.Uri? RequestUri { get { throw null; } set { } }
public System.Version Version { get { throw null; } set { } }
public System.Net.Http.HttpVersionPolicy VersionPolicy { get { throw null; } set { } }
Expand Down
1 change: 1 addition & 0 deletions src/libraries/System.Net.Http/src/System.Net.Http.csproj
Original file line number Diff line number Diff line change
Expand Up @@ -53,6 +53,7 @@
<Compile Include="System\Net\Http\HttpMessageHandler.cs" />
<Compile Include="System\Net\Http\HttpMessageInvoker.cs" />
<Compile Include="System\Net\Http\HttpMethod.cs" />
<Compile Include="System\Net\Http\HttpMetrics.cs" />
<Compile Include="System\Net\Http\HttpParseResult.cs" />
<Compile Include="System\Net\Http\HttpProtocolException.cs" />
<Compile Include="System\Net\Http\HttpRequestException.cs" />
Expand Down
Original file line number Diff line number Diff line change
@@ -1,6 +1,9 @@
// Licensed to the .NET Foundation under one or more agreements.
// The .NET Foundation licenses this file to you under the MIT license.

using System.Diagnostics;
using System.Diagnostics.CodeAnalysis;
using System.Diagnostics.Metrics;
using System.IO;
using System.Net.Http.Headers;
using System.Runtime.Versioning;
Expand All @@ -14,6 +17,26 @@ public class HttpMessageInvoker : IDisposable
private volatile bool _disposed;
private readonly bool _disposeHandler;
private readonly HttpMessageHandler _handler;
private Meter? _meter;
private HttpMetrics? _metrics;

#pragma warning disable CS3003 // Type is not CLS-compliant
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be able to use [CLSCompliant(false)] instead of these pragmas

public Meter Meter
{
// TODO: Should the Meter and HttpMetrics be static and shared by default?
get => _meter ??= new Meter("System.Net.Http");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should make sure only one instance is ever created in a race.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree the property isn't thread-safe, but how much do we care about thread safety? It should only be set when the message invoker is created.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The way I read it this field may be null when you start sending requests, and the SendAsync will lazy-init it, no?

If you try sending multiple requests at the same time, you will end up creating more than one instance with the ??=.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd go farther, we should use a static so there is only one instance (in the non-DI case). Creating a new instance for every client adds a bunch of unnecessary overhead.

set
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tarekgh @JamesNK - I mentioned in an earlier email but it may have gotten missed, what do you guys think about consuming an IMeterFactory instead of a Meter? Then the Meter can be produced:

   _meter = _factory == null ? new Meter("System.Net.Http") : factory.CreateMeter("System.Net.Http");

This eliminates the need for the caller to understand what meter name is expected.

This implies that IMeterFactory gets defined in System.Diagnostics.DiagnosticSource.dll which I don't think would be an issue but I might be overlooking something.

Copy link
Member Author

@JamesNK JamesNK May 2, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was expecting IMeterFactory to be in Microsoft.Extensions.Metrics and so not available here. That's part of why I chose this approach.

The other reason for exposing the meter is it makes consuming the metrics for just one client quite simple:

var invoker = new HttpMessageInvoker();
var currentRequestsRecorder = new InstrumentRecorder<double>(invoker.Meter, "request-duration");

But that only works if each client/invoker has its own Meter and counters copy, which might not be desirable.

Copy link
Member

@noahfalk noahfalk May 2, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was expecting IMeterFactory to be in Microsoft.Extensions.Metrics and so not available here

Initially I imagined it there too. Do you feel like it is important for IMeterFactory to not be in System.Diagostics.Metrics?

The other reason for exposing the meter is it makes consuming the metrics for just one client quite simple:

If users were injecting a factory instead the pattern wouldn't be much different:

var IMeterFactory factory = ...
var invoker = new HttpMessageInvoker();
invoker.Factory = factory; // separately, maybe we should be using a constructor overload rather than a property setter?
var currentRequestsRecorder = new InstrumentRecorder<double>(factory, "System.Net.Http", "request-duration");

If the app developer wasn't injecting the factory and using the default Meter then I wouldn't expect them to write tests for that as it would effectively just be testing runtime behavior and not their own code.

Copy link
Member

@antonfirsov antonfirsov May 2, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It can also be a Func<Meter> on HttpMessageInvoker, so then we don't have to worry about the dependencies, and IMeterFactory (if an interface is needed at all) can live in Microsoft.Extensions.* where the DI-friendly things go usually.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Logically IMeterFactory naturally belongs to the Microsoft.extensions more because we'll have another extension method to register this interface to the DI. But technically nothing can stop us having it in DianosticSource library. I am thinking in the future when exposing more stuff like MeterBuilder or anything else, at that time it will look strange having IMeterFactory in one place and other related interfaces in other place.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It can also be a Func<Meter> on HttpMessageInvoker

Func<string,Meter> is the signature we'd need for a similar effect as IMeterFactory, but thats no obstacle. My other motivation for IMeterFactory instead of something like Func is so that .NET types are following the same guidance we are planning to give to everyone else. Its not written up in our official docs yet, but the basic pattern is:

  1. Define a parameter on your constructor of type IMeterFactory
  2. Use the factory to create the Meter.

While other patterns are certainly possible (such as passing in a Meter or passing in a Func<string,Meter>), many of them aren't as easy to use with DI. I don't expect folks will be creating HttpMessageInvokers from a DI container so its fair to argue it doesn't matter if we pick a pattern that is DI-friendly. I think the counter-argument is that if the DI-friendly pattern becomes a convention then devs will anticipate that pattern regardless whether the type is created from a DI-container and APIs that don't follow the pattern feel unexpected and awkward. For example most people expect that if your type logs via ILogger then your constructor has an ILogger constructor parameter. Writing a class that has a method SetLogger(ILogger) or a constructor parameter Func<ILogger> functionally would work too, but it looks unusual because it doesn't follow the convention.

Logically IMeterFactory naturally belongs to the Microsoft.extensions

Tarek and I chatted earlier. I agree that it felt intuitive for me as well initially for IMeterFactory to be in the Microsoft.Extensions namespace, but I don't think its the only reasonable association. The general guidance with DI is that the types being created from the DI container should not have any tight coupling to the DI container (for example don't have your constructor use IServiceProvider as input argument). This allows using those types without DI because a caller could always manually instantiate the same arguments. This perspective suggests that any of the types that show up in the constructor parameters, such as IMeterFactory, should not be treated as tightly coupled to DI. If the type isn't coupled to DI the fact that it would land in Microsoft.Extensions when all the other Meter types are in System.Diagnostics didn't feel so intuitive any more.

Ultimately I assume its rare for .NET developers to manually create an HttpMessageInvoker so if folks think its not helpful to follow the IMeterFactory pattern on this one I'll be fine with it. If the pattern is also going to show up on HttpClient I might push a little harder due to more exposure, but still I expect the main usage for injecting a Meter will come from HttpClientFactory internally rather than .NET devs manually injecting one.

Copy link
Member

@antonfirsov antonfirsov May 3, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ultimately I assume its rare for .NET developers to manually create an HttpMessageInvoker [...]. If the pattern is also going to show up on HttpClient [...]

HttpMessageInvoker is the base class for HttpClient so everything that goes into HttpMessageInvoker goes into HttpClient. I'm treating them as synonyms in this discussion.

Define a parameter on your constructor of type IMeterFactory

The pattern with HttpClient is to consume configuration/dependencies through properties; the only parameter being passed to the constructor is the HttpMessageHandler. It would look very weird if new ctr. overloads would appear taking an IMeterFactory, especially if most users don't need it. (In general, System.Net.Http wasn't designed to be DI-friendly, so making a occasional intrusive decisions following a different design approach would create too much disharmony IMO.)

I expect the main usage for injecting a Meter will come from HttpClientFactory internally rather than .NET devs manually injecting one

In that case, wouldn't it be straightforward to make HttpClientFactory's responsibility to wrap IMeterFactory into a Func<string, Meter> when initializing HttpClient?

If the type isn't coupled to DI the fact that it would land in Microsoft.Extensions when all the other Meter types are in System.Diagnostics didn't feel so intuitive any more.

Even if it's not tightly coupled to DI, the question is if there are any other reasons to make it land in System.Diagnostics other than making DI use-cases easier? Also: are there any other System.* libraries that may end up consuming IMeterFactory?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ultimately I assume its rare for .NET developers to manually create an HttpMessageInvoker so if folks think its not helpful to follow the IMeterFactory pattern on this one I'll be fine with it. If the pattern is also going to show up on HttpClient I might push a little harder due to more exposure, but still I expect the main usage for injecting a Meter will come from HttpClientFactory internally rather than .NET devs manually injecting one.

As @antonfirsov mentioned, HttpMessageInvoker = HttpClient. The client inherits from the invoker. The design needs to work for both.

Manually creating HttpClient is very common. I expect customizing the meter on a client will be much rarer. The primary case for using it will be Microsoft.Extensions.Http.

The way I imagined this hanging together is Microsoft.Extensions.Http has a dependency on Microsoft.Extensions.Metrics. IMeterFactory is injected into HttpClientFactory and sets a meter on the created HttpClient from the factory.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

HttpMessageInvoker is the base class for HttpClient

Ah! I didn't catch that relationship. Thanks for the important info.

The pattern with HttpClient is to consume configuration/dependencies through properties

OK. If HttpClient already has that pattern then I guess we have clashing patterns, but it probably makes sense for HttpClient's pattern to take priority.

In that case, wouldn't it be straightforward to make HttpClientFactory's responsibility to wrap IMeterFactory into a Func<string, Meter> when initializing HttpClient

I agree it doesn't take a lot of code to do that wrapping, but assume it was already the plan that IMeterFactory was in System.Diagnostics.DiagnosticSource.dll - would there be any reason to wrap it in the first place?

are there any other System.* libraries that may end up consuming IMeterFactory?

I don't have any concrete plan to point to, but other forms of IO like files or pipes seems like potential candidates for metrics in the System.* namespaces. Its also possible we'd want System.Runtime metrics reported via a factory-produced Meter.

the question is if there are any other reasons to make it land in System.Diagnostics other than making DI use-cases easier

I don't think we need more reasons to meet the bar, but these are the pros/cons as I see it:

  • [Pro]: More libraries will have IMeterFactory accessible to them if they aren't able to take a dependency on Microsoft.Extensions. Even in libraries where there isn't a strict rule against such dependencies, library authors are still likely to prefer fewer dependencies.
  • [Pro]: Assuming we also have IMeterFactory in the System.Diagnostics.Metrics namespace then consumers only need one namespace in their code rather than two.
  • [Con?]: There is the idea that any type used to make the DI scenario easier (IMeterFactory) belongs in M.E.something.dll. However aside from the idea feeling natural, I haven't seen any concrete benefit to organizing it that way in this case.

When I look at the balance scales I see it as small upside and effectively no downside.

{
// TODO: Check that HttpMessageInvoker hasn't been started.
ArgumentNullException.ThrowIfNull(value);
if (value.Name != "System.Net.Http")
{
throw new ArgumentException("Meter name must be 'System.Net.Http'.");
}
_meter = value;
}
}
#pragma warning restore CS3003 // Type is not CLS-compliant

public HttpMessageInvoker(HttpMessageHandler handler)
: this(handler, true)
Expand All @@ -30,16 +53,27 @@ public HttpMessageInvoker(HttpMessageHandler handler, bool disposeHandler)
_disposeHandler = disposeHandler;
}

[MemberNotNull(nameof(_metrics))]
private void EnsureMetrics()
{
_metrics ??= new HttpMetrics(Meter);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar to Meter, this should guard against using different instances in a race

}

[UnsupportedOSPlatformAttribute("browser")]
public virtual HttpResponseMessage Send(HttpRequestMessage request, CancellationToken cancellationToken)
{
ArgumentNullException.ThrowIfNull(request);

ObjectDisposedException.ThrowIf(_disposed, this);

EnsureMetrics();

if (ShouldSendWithTelemetry(request))
{
long startTimestamp = Stopwatch.GetTimestamp();

HttpTelemetry.Log.RequestStart(request);
_metrics.RequestStart(request);

HttpResponseMessage? response = null;
try
Expand All @@ -55,6 +89,7 @@ public virtual HttpResponseMessage Send(HttpRequestMessage request, Cancellation
finally
{
HttpTelemetry.Log.RequestStop(response);
_metrics.RequestStop(request, response, startTimestamp, Stopwatch.GetTimestamp());
}
}
else
Expand All @@ -69,16 +104,21 @@ public virtual Task<HttpResponseMessage> SendAsync(HttpRequestMessage request, C

ObjectDisposedException.ThrowIf(_disposed, this);

EnsureMetrics();

if (ShouldSendWithTelemetry(request))
{
return SendAsyncWithTelemetry(_handler, request, cancellationToken);
return SendAsyncWithTelemetry(_handler, request, _metrics, cancellationToken);
}

return _handler.SendAsync(request, cancellationToken);

static async Task<HttpResponseMessage> SendAsyncWithTelemetry(HttpMessageHandler handler, HttpRequestMessage request, CancellationToken cancellationToken)
static async Task<HttpResponseMessage> SendAsyncWithTelemetry(HttpMessageHandler handler, HttpRequestMessage request, HttpMetrics metrics, CancellationToken cancellationToken)
{
long startTimestamp = Stopwatch.GetTimestamp();

HttpTelemetry.Log.RequestStart(request);
metrics.RequestStart(request);
Comment on lines 121 to +122
Copy link
Member

@antonfirsov antonfirsov Apr 27, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With this, we will face all the overhead of both HttpTelemetry and Metrics if any of them is enabled.

I would do a second check for HttpTelemetry.Log.IsEnabled(), and fine-grained checks of <instrument>.Enabled within HttpMetrics methods.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we make sure these checks exist in HttpTelemetry as well?
That is, change places like RequestStart/Stop in HttpTelemetry to

Interlocked.Increment(ref _startedRequests);

if (IsEnabled(...))
{
    RequestStart(
        request.RequestUri.Scheme,
        request.RequestUri.IdnHost...);
}

The counter fields should continue to be updated, but we don't have to read the Uri properties/serialize the event if the EventSource isn't enabled.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that this logic does not apply to HttpClient.
If you want to make that work, you need to also instrument StartSend and FinishSend in HttpClient.


HttpResponseMessage? response = null;
try
Expand All @@ -94,12 +134,13 @@ static async Task<HttpResponseMessage> SendAsyncWithTelemetry(HttpMessageHandler
finally
{
HttpTelemetry.Log.RequestStop(response);
metrics.RequestStop(request, response, startTimestamp, Stopwatch.GetTimestamp());
}
}
}

private static bool ShouldSendWithTelemetry(HttpRequestMessage request) =>
HttpTelemetry.Log.IsEnabled() &&
private bool ShouldSendWithTelemetry(HttpRequestMessage request) =>
(HttpTelemetry.Log.IsEnabled() || _metrics!.RequestCountersEnabled()) &&
!request.WasSentByHttpClient() &&
request.RequestUri is Uri requestUri &&
requestUri.IsAbsoluteUri;
Expand All @@ -124,7 +165,6 @@ protected virtual void Dispose(bool disposing)
if (disposing && !_disposed)
{
_disposed = true;

if (_disposeHandler)
{
_handler.Dispose();
Expand Down
85 changes: 85 additions & 0 deletions src/libraries/System.Net.Http/src/System/Net/Http/HttpMetrics.cs
Original file line number Diff line number Diff line change
@@ -0,0 +1,85 @@
// Licensed to the .NET Foundation under one or more agreements.
// The .NET Foundation licenses this file to you under the MIT license.

using System.Diagnostics;
using System.Diagnostics.Metrics;

namespace System.Net.Http
{
internal sealed class HttpMetrics
{
private readonly Meter _meter;
private readonly UpDownCounter<long> _currentRequests;
private readonly Histogram<double> _requestsDuration;

public HttpMetrics(Meter meter)
{
_meter = meter;

_currentRequests = _meter.CreateUpDownCounter<long>(
"current-requests",
description: "The duration of outbound HTTP requests.");
JamesNK marked this conversation as resolved.
Show resolved Hide resolved

_requestsDuration = _meter.CreateHistogram<double>(
"request-duration",
unit: "s",
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If unit is specified here then changing the unit will mean we will have no option but to create a new instrument as the requirement for us is to use ms. I know currently the unit here is just an indication but we never know how the backend or some exporter might end up using it for.

Copy link
Member

@noahfalk noahfalk May 5, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the issue specifically that the unit was specified? For example if the code didn't specify a unit but still captured measurements in seconds would that solve the problem? My understanding is that it would not.

I think emitting the measurement in seconds (and marking the unit that way) is the right thing for .NET to do because it appears to be the overall approach industry at large is standardizing on. However I also get that not all consumers are aligned behind a single standard and in particular we have MS 1P customers who want ms measurements for compat with their existing practice. I believe we need to invest in techniques that let us present different views of the data to different customer segments. A couple ideas how that might happen:

  1. Creating a new instrument with different naming that proxies a scaled version of the measurements and only exporting data from the new instrument.
  2. Ignoring this instrument and continuing to collect a separate measurement via a different handler or DiagnosticSource
  3. Doing work in OTel support changing the units on an existing instrument before it is reported, perhaps as part of a View
  4. Doing a similar feature as (3) but in the runtime further building on the config and pipeline work

(1) or (2) seem like they might be our shortest-term options but (3) and (4) are probably where I'd hope we end up eventually.

description: "Number of outbound HTTP requests that are currently active on the client.");
}

public void RequestStart(HttpRequestMessage request)
{
#pragma warning disable SA1129 // Do not use default value type constructor
var tags = new TagList();
#pragma warning restore SA1129 // Do not use default value type constructor
Comment on lines +39 to +41
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
#pragma warning disable SA1129 // Do not use default value type constructor
var tags = new TagList();
#pragma warning restore SA1129 // Do not use default value type constructor
TagList tags = default;

Does this work to avoid the warning?

InitializeCommonTags(ref tags, request);
_currentRequests.Add(1, tags);
}

public void RequestStop(HttpRequestMessage request, HttpResponseMessage? response, long startTimestamp, long currentTimestamp)
{
#pragma warning disable SA1129 // Do not use default value type constructor
var tags = new TagList();
#pragma warning restore SA1129 // Do not use default value type constructor
InitializeCommonTags(ref tags, request);
Copy link
Member

@antonfirsov antonfirsov Apr 27, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because of redirects you may log different tags here than at RequestStart, we need to snapshot the initial values.

Copy link
Member Author

@JamesNK JamesNK Apr 29, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Properties on HttpRequestMessage are updated with a response redirect?

In that case, what request values should be used?

Options:

  1. Don't use the initial request values anywhere. That means we don't include tags on current-requests and have a dimensionless counter. The request-duration counter uses the HttpRequestMessage state based on when SendAsync returns.
  2. Use initial request values with current-requests and request-duration. Ignore request message state after SendAsync returns. Maybe have an extra counter to check if the URL changes between sending the request and receiving the response to record redirect counts.
  3. A mix. Initial values on current-requests and potentially updated values on request-duration.

No perfect choice. I like option 2, but curious about what client networking experts think and what current diagnostics does.

Copy link
Member

@antonfirsov antonfirsov May 1, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Properties on HttpRequestMessage are updated with a response redirect?

Yes, the request is being reused to save allocations, but I think this is a fundamental problem, regardless of our implementation.

what current diagnostics does.

I don't think this concern exists in case of existing counters. The only ones where we have something similar to tags are the <protocol>-connections-current-total and the <protocol>-requests-queue-duration counters. In both cases the protocol version is clear at the moment we increment the counter.

Use initial request values with current-requests and request-duration. Ignore request message state after SendAsync returns.

Wouldn't this be unexpected to users? For example by using HttpVersionPolicy.RequestVersionOrLower/RequestVersionOrHigher and negotiating lower/higher versions, a user would not log the actual HTTP versions being used.

No perfect choice.

Isn't it also an option to record multiple requests on redirects? (Even though it means that the implementation would be bound to SocketsHttpHandler and other handlers would need to implement the metrics separately.)

I have opened open-telemetry/opentelemetry-specification#3462, hoping that OTel folks can help clarifying this.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this concern exists in case of existing counters.

Yeah existing counters don't have request-specific info on them, but I was thinking about what is in logs and the activity.

Copy link
Member

@MihaZupan MihaZupan May 2, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with Anton that relying on the request message's version can be quite misleading.
YARP for example uses a default version of 2.0 with a RequestVersionOrLower policy.
If we only use the request version, the user would only see 2.0 counters even if they're actually only talking 1.1 to the server.

The unfortunate thing is that we'll only know what version is actually used when we try sending the request to the wire internally, or from a higher-level perspective, once you get back the response message object.

It may make sense to consider moving this logic further down into SocketsHttpHandler, especially if you will also want to track when the response content has been read completely.

Copy link
Member Author

@JamesNK JamesNK May 4, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because the version isn't known when a request starts, it's not included as a tag on current-requests. And request-duration uses HttpResponseMessage.Version for this information. This PR already has that behavior.

_currentRequests.Add(-1, tags);

if (response != null)
{
tags.Add("status-code", (int)response.StatusCode); // Boxing?
Copy link
Member

@antonfirsov antonfirsov Apr 27, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TagList looks like a boxing factory. I wounder if this concern was considered during the design of #56936. If it's concerning enough we may consider caching the objects, but that would come with its' own overhead.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One option is having a cache of boxed commonly used status codes.

tags.Add("protocol", $"HTTP/{response.Version}"); // Hacky
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hacky because it relies on formatting implementation? Shall we do something like this to avoid the formatting allocation?

static string GetProtocolName6(Version httpVersion) => (httpVersion.Major, httpVersion.Minor) switch
{
    (1, 1) => "HTTP/1.1",
    (2, 0) => "HTTP/2.0",
    (3, 0) => "HTTP/3.0",
    _ => "unknown"
};

Copy link
Member Author

@JamesNK JamesNK Apr 29, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hacky because it allocates, some values aren't right*, and it's simple to improve such as your snippet.

*HTTP/2 would be better than HTTP/2.0.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should avoid allocation. We wouldn't want our services to pay the allocation cost for something which our library will anyways drop and not emit. Protocol is one of them

}
if (request.HasTags)
{
foreach (var customTag in request.MetricsTags)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar to HttpRequest properties not being stable, these tags might also not be stable across the duration of the request. Your EnrichmentHandler test is an example that deliberately changes them.

I think one scenario we need to sort out is the case where the caller to HttpClient is some 3rd party library that isn't specifying tags but the app developer does want to get all their traffic classified. Currently R9 is handling this case by having the app developer create a mapping from outbound URIs -> other metadata, then they register a handler in the HttpClientFactory to execute that mapping and generate the tagged telemetry. If R9 is going to switch over to use these built-in metrics rather than continuing to add their own then they will need some way to dynamically inject those tags. If the injection mechanism will continue to be a handler (similar to your EnrichmentHandler), then perhaps we need to move the metric collection point back to the DiagnosticsHandler where it is now so that other handlers have a chance to update those tags before they are captured. (I know I suggested doing it at the front earlier, I'm not sure my idea was a good one now that I consider this scenario)

@dpk83 - you probably want to take a look at this if you haven't already

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another question, will the same metrics support be added for net framework or is it limited to net core only?

Copy link
Member Author

@JamesNK JamesNK May 4, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand the concern here. R9 has a DelegatingHandler that is responsible for recording to the metric. The handler also executes enrichers to build up a collection of custom tags.

The difference in this prototype is recording to the metric and gathering custom tags happens in different places. The handler writes them onto the HttpRequestMessage, and they are read off the message later when needed.

My main concern with this approach is making sure there is the opportunity for an enrichment handler to add tags after SendAsync has returned. Right now the metrics are recorded in the HttpMessageInvoker. That means the code after SendAsync in each delegating handler runs before the counter is recorded. Metrics at this layer works, but it doesn't record separate HTTP requests if a Polly handler is added for retries. The solution is to move metrics to DiaganosticsHandler. That handler is always inside a retry handler, so each HTTP request is recorded, but we lose the ability to do enrichment after SendAsync returns. DiaganosticsHandler is added after any user-injected handlers (such as enrichment), so metrics are recorded before code after SendAsync runs.

A solution that supports both goals could be to wait to record the HTTP duration counter value when an HTTP request has completed both of these actions:

  1. The HttpRequestMessage exits the HTTP pipeline
  2. The HttpResponseMessage finishes reading the response

It would be slightly odd that DiaganosticsHandler "starts" metrics for an HTTP request (e.g. starts the request duration timer, increments current request count), and exiting HttpMessageInvoker "finishes" it (e.g. records request duration with enriched tags and decreases current request count).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for confusion, upon re-reading the code I see that I misunderstood it earlier and my comment was based on that misunderstanding. I was incorrectly worried that custom tags would cause the current-requests increment operation to be recorded with different tags than the decrement which would unbalance it. Now I see that custom tags are only used for the duration measurement and my concern was unfounded.

While I do still care about the scenario I described above with 3rd party libraries, I think this design would be able to handle it.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Few points:

  • Quite often caller doesn't have access to the HttpRequestMessage, third party library is a very good example of this scenario. In complex organizations where there are hundreds and thousands of services a common libraries are used quite a lot and this option just doesn't work. We devised IDownstreamDependencyMetadata for that very purpose.

  • We are now supporting both DelegatingHandler based auto collection of metrics as well as DiagnosticsSource based approach because DelegatingHandler wasn't sufficient for a lot of cases.

  • In order to consume this metrics directly emitted from .NET and provide the transformation and the dependency metadata looks like we will still continue to use DelegatingHandler and DiagnosticsSource so I don't see how we can take consume and build onto the metrics this component is going to emit without incurring additional perf overhead.

  • Is this support is going to be ported back for .NET Framework?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The solution is to move metrics to DiaganosticsHandler

Note that DiagnosticsHandler is now strongly coupled with distributed tracing, and disabling distributed tracing either by by setting setting DOTNET_SYSTEM_NET_HTTP_ENABLEACTIVITYPROPAGATION=0 or by setting socketsHttpHandler.ActivityHeadersPropagator = null would completely disable it.

Wouldn't it be better to implement metrics in a separate MetricsHandler that would live either before or after DiagnosticsHandler in the chain?

Copy link
Member Author

@JamesNK JamesNK May 4, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't know about that switch. Good to know.

If a separate handler is better, then do it. The difference we're considering is the same: counters at the HttpMessageInvoker layer vs counters at an inner handler.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see how we can take consume and build onto the metrics this component is going to emit without incurring additional perf overhead.

What perf overhead are you worried about @dpk83?

While I don't think R9 strictly has to use this counter for correct behavior, from the consumer's perspective I'd really hope to describe the R9 stuff to 3P devs as an enhancement and not another orthogonal telemetry mechanism. I am leaving out here 1P back-compat where we probably want a different schema. For example I don't think it would be good if 3P devs see duplicate measurements. That would mean that R9 either needs to add tags to this instrument or suppress the instrument and replicate it, including the new enrichment tags. Making sure an alternate implementation stays in-sync feels possible, but error-prone and more work intensive.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another question, will the same metrics support be added for net framework or is it limited to net core only?

I think this got answered during our meeting earlier, but reproducing it here - the new metrics would only be present in .NET 8 onwards. This code doesn't flow downlevel either to framework or to older versions of .NET Core.

{
tags.Add(customTag);
}
}
var duration = Stopwatch.GetElapsedTime(startTimestamp, currentTimestamp);
_requestsDuration.Record(duration.TotalSeconds, tags);
JamesNK marked this conversation as resolved.
Show resolved Hide resolved
}

private static void InitializeCommonTags(ref TagList tags, HttpRequestMessage request)
{
if (request.RequestUri is { } requestUri && requestUri.IsAbsoluteUri)
{
if (requestUri.Scheme is not null)
{
tags.Add("scheme", requestUri.Scheme);
}
if (requestUri.Host is not null)
{
tags.Add("host", requestUri.Host);
}
Comment on lines +88 to +95
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if (requestUri.Scheme is not null)
{
tags.Add("scheme", requestUri.Scheme);
}
if (requestUri.Host is not null)
{
tags.Add("host", requestUri.Host);
}
tags.Add("scheme", requestUri.Scheme);
tags.Add("host", requestUri.HostNameType == UriHostNameType.IPv6 ? $"[{requestUri.IdnHost}]" : requestUri.IdnHost);

Do you want this to be the IP even if we know the host (the header is present)?

Copy link
Member Author

@JamesNK JamesNK May 4, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know. OpenTelemetry has the host tag - called net.peer.name - in its semantic conventions. We should check with them for recommended behavior.

https://opentelemetry.io/docs/reference/specification/metrics/semantic_conventions/http-metrics/#metric-httpclientduration

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is your thought around using IdnHost over Host? I'm not well versed on the topic but from what I read in the docs the difference is that IdnHost escapes some characters with Punycode. I would expect handlers of Metric data to respect the full unicode character set and the strings are likely to get exposed to humans via dashboards and queries where escape sequences are more awkward so Host seemed more appropriate. Looking at R9 they've been using Host as the tag value so we've got some real world validation that Host satisfies the goal.

Do you want this to be the IP?

I think want the name whenever it has been provided (which I assume is what Host does).

@dpk83 - You may have considered this in the past, any thoughts on this one?

// Add port tag when not the default value for the current scheme
if (!requestUri.IsDefaultPort)
{
tags.Add("port", requestUri.Port);
}
}
tags.Add("method", request.Method.Method);
}

internal bool RequestCountersEnabled() => _currentRequests.Enabled || _requestsDuration.Enabled;
}
}
Original file line number Diff line number Diff line change
Expand Up @@ -31,6 +31,7 @@ public class HttpRequestMessage : IDisposable
private HttpVersionPolicy _versionPolicy;
private HttpContent? _content;
private HttpRequestOptions? _options;
private List<KeyValuePair<string, object?>>? _metricsTags;

public Version Version
{
Expand Down Expand Up @@ -116,6 +117,14 @@ public Uri? RequestUri
/// </summary>
public HttpRequestOptions Options => _options ??= new HttpRequestOptions();

// TODO: Should tags be defined in a new collection or stashed in HttpRequestOptions?
// If tags are stashed in HttpRequestOptions then there should probably be extension methods to interact with them.
// If tags are stored in a collection then a custom collection type might be wanted.
// An Add(string name, object? value) method for convenience would be nice.
public ICollection<KeyValuePair<string, object?>> MetricsTags => _metricsTags ??= new List<KeyValuePair<string, object?>>();

internal bool HasTags => _metricsTags != null;

public HttpRequestMessage()
: this(HttpMethod.Get, (Uri?)null)
{
Expand Down
Loading