Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Histogram Record Method Benchmark #2754

Merged
merged 13 commits into from
Feb 4, 2022
5 changes: 4 additions & 1 deletion src/OpenTelemetry/CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -2,13 +2,16 @@

## Unreleased

* More efficient histogram bucket finding algorithm.
([#2754](https://github.com/open-telemetry/opentelemetry-dotnet/pull/2754))

## 1.2.0-rc1

Released 2021-Nov-29

* Prevent accessing activity Id before sampler runs in case of legacy
activities.
([2659](https://github.com/open-telemetry/opentelemetry-dotnet/pull/2659))
([#2659](https://github.com/open-telemetry/opentelemetry-dotnet/pull/2659))

* Added `ReadOnlyTagCollection` and expose `Tags` on `MetricPoint` instead of
`Keys`+`Values`
Expand Down
14 changes: 10 additions & 4 deletions src/OpenTelemetry/Metrics/MetricPoint.cs
Original file line number Diff line number Diff line change
Expand Up @@ -247,12 +247,18 @@ internal void Update(double number)
case AggregationType.Histogram:
{
int i;
for (i = 0; i < this.histogramBuckets.ExplicitBounds.Length; i++)
if (double.IsNaN(number))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting... 🤔 makes me wonder if NaN should be an error case that we just log instead of recording.

Copy link
Contributor Author

@mic-max mic-max Dec 28, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ya, I'm not sure. I was trying to fix the Prometheus exporter test for PrometheusSerializerTests.HistogramNaN since I was trying to compare by using the == operator and also double.NaN returns False for it appears all uses of <, >, ==, etc.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea, doesn't make sense to me that we should increment the largest bucket... why not the smallest? or even a random one?

Interestingly, I checked the Java SDK, and it turns out they too increment the last bucket. Wonder if this is just accidental because I don't see anything in the spec.

Though I looked at that prometheus test and noted that NaN is a valid value in the prometheus format.

Maybe something @reyang could shed some light on.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The SDK Specification gives certain flexibility to the SDK implementations for perf consideration.

Prometheus Exporter spec (which is still Experimental) hasn't covered NaN histogram yet, further clarifications are needed.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If perf is what drives our decision then it would make sense to try and eliminate the if (double.IsNaN(number)) check.

If we settle on using Array.BinarySearch then ~Array.BinarySearch(buckets, double.NaN) equals 0. Seems it would be slightly more efficient to increment the first bucket in this case.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Though, the bummer of aggregating a NaN value is that it not only messes up one of the bucket counts, it also clobbers the sum.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If perf is what drives our decision then it would make sense to try and eliminate the if (double.IsNaN(number)) check.

+1

This is also called out in the spec:

It is unspecified how the SDK should handle the input limits. The SDK authors MAY leverage/follow the language runtime behavior for better performance, rather than perform a check on each value coming from the API.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Though, the bummer of aggregating a NaN value is that it not only messes up one of the bucket counts, it also clobbers the sum.

In addition to NaN, other values could result in NaN as well, for example (note that currently histogram does not support negative value):

histogram.Record(double.PositiveInfinity);
histogram.Record(double.NegativeInfinity);

{
// Upper bound is inclusive
if (number <= this.histogramBuckets.ExplicitBounds[i])
i = this.histogramBuckets.ExplicitBounds.Length;
}
else
{
// If number is not a histogram bound, then the result will be zero
mic-max marked this conversation as resolved.
Show resolved Hide resolved
// The bitwise complement of the returned value is the insertion index
i = Array.BinarySearch(this.histogramBuckets.ExplicitBounds, number);
Copy link
Member

@reyang reyang Jan 3, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For histogram with small number of bounds, this extra invocation could turn out to be slower.

We need to have a perf benchmark to show the numbers (especially for low buckets).

Refer to the prom-client and see how low/high buckets are handled differently:

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll write a self-contained benchmark that compares the existing linear search implementation with the binary search one and run it with a range of boundary counts.

I agree that linear is likely faster for a small number of boundaries, but eventually the binary search will outperform it. Then I can a boundary number where anything less uses linear, and anything more uses binary.

I'll generate the random numbers so that each index gets an equal amount of finds.
Ex: If the sorted array is [ 20, 30, 40 ] then the random numbers will be generated like random.next(10, 50)
An array of all the results from performing 10,000 checks should looks similar to:

Value Count
0 2500
1 2500
2 2500
3 2500

if (i < 0)
{
break;
i = ~i;
}
}

Expand Down
8 changes: 6 additions & 2 deletions test/OpenTelemetry.Tests.Stress.Metrics/Program.cs
Original file line number Diff line number Diff line change
Expand Up @@ -27,6 +27,7 @@ public partial class Program
private const int ArraySize = 10;
private static readonly Meter TestMeter = new Meter(Utils.GetCurrentMethodName());
private static readonly Counter<long> TestCounter = TestMeter.CreateCounter<long>("TestCounter");
private static readonly Histogram<double> TestHistogram = TestMeter.CreateHistogram<double>("TestHistogram");
private static readonly string[] DimensionValues = new string[ArraySize];
private static readonly ThreadLocal<Random> ThreadLocalRandom = new ThreadLocal<Random>(() => new Random());

Expand All @@ -37,6 +38,7 @@ public static void Main()
DimensionValues[i] = $"DimValue{i}";
}

var boundaries = new double[] { 0, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 110, 120, 130, 140, 150, 160, 170, 180, 190, 200, 210, 220, 230, 240, 250, 260, 270, 280, 290, 300, 310, 320, 330, 340, 350, 360, 370, 380, 390, 400, 410, 420, 430, 440, 450, 460, 470, 480, 490, 500 };
using var meterProvider = Sdk.CreateMeterProviderBuilder()
.AddMeter(TestMeter.Name)
.AddPrometheusExporter(options =>
Expand All @@ -45,6 +47,7 @@ public static void Main()
options.HttpListenerPrefixes = new string[] { $"http://localhost:9185/" };
options.ScrapeResponseCacheDurationMilliseconds = 0;
})
.AddView(TestHistogram.Name, new ExplicitBucketHistogramConfiguration() { Boundaries = boundaries })
.Build();

Stress(prometheusPort: 9184);
Expand All @@ -54,10 +57,11 @@ public static void Main()
protected static void Run()
{
var random = ThreadLocalRandom.Value;
TestCounter.Add(
/*TestCounter.Add(
100,
new("DimName1", DimensionValues[random.Next(0, ArraySize)]),
new("DimName2", DimensionValues[random.Next(0, ArraySize)]),
new("DimName3", DimensionValues[random.Next(0, ArraySize)]));
new("DimName3", DimensionValues[random.Next(0, ArraySize)]));*/
TestHistogram.Record(random.Next(-100, 600));
}
}