Split timeseries by data types in Metrics Protobuf definitions (#34)
* Split timeseries by data types in Metrics Protobuf definitions

Previously we had a Point message whose value was a oneof over the data types. This is
unnecessary flexibility because points in the same timeseries cannot be of different
data types. It also cost performance.

Now we have separate timeseries message definitions for each data type and the
timeseries used is defined by the oneof entry in the Metric message.
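
The difference can be sketched in plain Go. The structs below are hand-written stand-ins, not the code generated from metrics.proto, and the names `pointValue`, `sumPoints` and `sumInt64Series` are assumptions made for illustration. With the old shape every point has to be type-checked when it is read; with the typed series the type is settled once for the whole series.

```go
package main

import "fmt"

// pointValue models the old oneof: any point may hold any value type.
// (Hypothetical stand-in; generated protobuf code looks different.)
type pointValue interface{ isPointValue() }

type int64Point struct{ V int64 }
type doublePoint struct{ V float64 }

func (int64Point) isPointValue()  {}
func (doublePoint) isPointValue() {}

// Point is the old, flexible shape.
type Point struct {
	TimestampUnixnano int64
	Value             pointValue
}

// Int64Value and Int64TimeSeries follow the new, typed shape.
type Int64Value struct {
	TimestampUnixnano int64
	Value             int64
}

type Int64TimeSeries struct {
	Points []Int64Value
}

// sumPoints must check every point, because nothing in the type system
// stops a series from mixing value types.
func sumPoints(points []Point) (int64, error) {
	var total int64
	for i, p := range points {
		v, ok := p.Value.(int64Point)
		if !ok {
			return 0, fmt.Errorf("point %d: want int64 value, got %T", i, p.Value)
		}
		total += v.V
	}
	return total, nil
}

// sumInt64Series needs no per-point checks.
func sumInt64Series(ts Int64TimeSeries) int64 {
	var total int64
	for _, p := range ts.Points {
		total += p.Value
	}
	return total
}

func main() {
	old := []Point{
		{TimestampUnixnano: 1, Value: int64Point{V: 10}},
		{TimestampUnixnano: 2, Value: int64Point{V: 32}},
	}
	oldSum, err := sumPoints(old)
	fmt.Println(oldSum, err)

	typed := Int64TimeSeries{Points: []Int64Value{
		{TimestampUnixnano: 1, Value: 10},
		{TimestampUnixnano: 2, Value: 32},
	}}
	fmt.Println(sumInt64Series(typed))
}
```

The sketch only illustrates the type-safety side of the change; the size and speed effects are what the benchmark below measures.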

This change is stacked on top of #33

A simple benchmark in Go demonstrates the following improvement in encoding and decoding
compared to the baseline state:

```
===== Encoded sizes
Encoding                       Uncompressed  Improved        Compressed  Improved
Baseline/MetricOne              24200 bytes  [1.000], gziped 1804 bytes  [1.000]
Proposed/MetricOne              19400 bytes  [1.247], gziped 1626 bytes  [1.109]

Encoding                       Uncompressed  Improved        Compressed  Improved
Baseline/MetricSeries           56022 bytes  [1.000], gziped 6655 bytes  [1.000]
Proposed/MetricSeries           43415 bytes  [1.290], gziped 6422 bytes  [1.036]

goos: darwin
goarch: amd64
pkg: github.com/tigrannajaryan/exp-otelproto/encodings
BenchmarkEncode/Baseline/MetricOne-8         	      27	 207923054 ns/op
BenchmarkEncode/Proposed/MetricOne-8         	      44	 133984867 ns/op

BenchmarkEncode/Baseline/MetricSeries-8      	       8	 649581262 ns/op
BenchmarkEncode/Proposed/MetricSeries-8      	      18	 324559562 ns/op

BenchmarkDecode/Baseline/MetricOne-8         	      15	 379468217 ns/op	186296043 B/op	 5274000 allocs/op
BenchmarkDecode/Proposed/MetricOne-8         	      21	 278470120 ns/op	155896034 B/op	 4474000 allocs/op

BenchmarkDecode/Baseline/MetricSeries-8      	       5	1041719362 ns/op	455096051 B/op	12174000 allocs/op
BenchmarkDecode/Proposed/MetricSeries-8      	       9	 603392754 ns/op	338296035 B/op	 8574000 allocs/op
```

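Overall, the proposed encoding is roughly 30-50% faster to encode and decode and roughly
20-25% smaller, both uncompressed on the wire and in memory. The gzipped size shrinks
less, by about 4-10%.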
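
As a rough check of these percentages, the figures can be recomputed from the benchmark output above. This is a minimal sketch that assumes "faster" and "smaller" both mean the relative reduction `(baseline - proposed) / baseline`; the helper name `reductionPercent` is not part of the benchmark code.

```go
package main

import "fmt"

// reductionPercent reports by how many percent proposed is lower than baseline.
func reductionPercent(baseline, proposed float64) float64 {
	return (baseline - proposed) / baseline * 100
}

func main() {
	// Values copied from the benchmark output in this commit message.
	rows := []struct {
		name               string
		baseline, proposed float64
	}{
		{"Encode MetricOne ns/op", 207923054, 133984867},
		{"Encode MetricSeries ns/op", 649581262, 324559562},
		{"Decode MetricOne ns/op", 379468217, 278470120},
		{"Decode MetricSeries ns/op", 1041719362, 603392754},
		{"Size MetricOne bytes", 24200, 19400},
		{"Size MetricSeries bytes", 56022, 43415},
		{"Decode MetricOne B/op", 186296043, 155896034},
		{"Decode MetricSeries B/op", 455096051, 338296035},
	}
	for _, r := range rows {
		fmt.Printf("%-27s %5.1f%% lower\n", r.name, reductionPercent(r.baseline, r.proposed))
	}
}
```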

The benchmarks encode and decode 500 batches of 2 metrics: one int64 Gauge with 5 time
series and one Histogram of doubles with 1 time series and a single bucket. Each time
series for both metrics contains either 1 data point (MetricOne) or 5 data points
(MetricSeries). Both metrics have 2 labels.

Benchmark source code is available at:
https://github.com/tigrannajaryan/exp-otelproto/blob/master/encodings/encoding_test.go

* Eliminate *TimeSeriesList messages
tigrannajaryan authored and SergeyKanzhelev committed Oct 31, 2019
1 parent 608c358 commit 6d70feb
Showing 1 changed file with 121 additions and 32 deletions.
153 changes: 121 additions & 32 deletions opentelemetry/proto/metrics/v1/metrics.proto
@@ -28,13 +28,17 @@ message Metric {
// The descriptor of the Metric.
MetricDescriptor metric_descriptor = 1;

// One or more timeseries for a single metric, where each timeseries has
// one or more points.
repeated TimeSeries timeseries = 2;

// The resource for the metric. If unset, it may be set to a default value
// provided for a sequence of messages in an RPC stream.
opentelemetry.proto.resource.v1.Resource resource = 3;
opentelemetry.proto.resource.v1.Resource resource = 2;

// data is a list of one or more TimeSeries for a single metric, where each timeseries has
// one or more points. Only one of the following fields is used for the data, depending on
// the type of the metric defined by MetricDescriptor.type field.
repeated Int64TimeSeries int64_timeseries = 3;
repeated DoubleTimeSeries double_timeseries = 4;
repeated HistogramTimeSeries histogram_timeseries = 5;
repeated SummaryTimeSeries summary_timeseries = 6;
}

// Defines a metric type and its schema.
@@ -100,17 +104,52 @@ message MetricDescriptor {
repeated string label_keys = 5;
}

// A collection of data points that describes the time-varying values
// of a metric.
message TimeSeries {
// Int64TimeSeries is a list of data points that describes the time-varying values
// of an int64 metric.
message Int64TimeSeries {
// The set of label values that uniquely identify this timeseries. Applies to
// all points. The order of label values must match that of label keys in the
// metric descriptor.
repeated LabelValue label_values = 1;

// The data points of this timeseries.
repeated Int64Value points = 2;
}

// DoubleTimeSeries is a list of data points that describes the time-varying values
// of a double metric.
message DoubleTimeSeries {
// The set of label values that uniquely identify this timeseries. Applies to
// all points. The order of label values must match that of label keys in the
// metric descriptor.
repeated LabelValue label_values = 1;

// The data points of this timeseries.
repeated DoubleValue points = 2;
}

// HistogramTimeSeries is a list of data points that describes the time-varying values
// of a Histogram.
message HistogramTimeSeries {
// The set of label values that uniquely identify this timeseries. Applies to
// all points. The order of label values must match that of label keys in the
// metric descriptor.
repeated LabelValue label_values = 1;

// The data points of this timeseries.
repeated HistogramValue points = 2;
}

// SummaryTimeSeries is a list of data points that describes the time-varying values
// of a Summary metric.
message SummaryTimeSeries {
// The set of label values that uniquely identify this timeseries. Applies to
// all points. The order of label values must match that of label keys in the
// metric descriptor.
repeated LabelValue label_values = 1;

// The data points of this timeseries. Point.value type MUST match the
// MetricDescriptor.type.
repeated Point points = 2;
// The data points of this timeseries.
repeated SummaryValue points = 2;
}

message LabelValue {
@@ -121,8 +160,8 @@ message LabelValue {
bool has_value = 2;
}

// A timestamped measurement.
message Point {
// Int64Value is a timestamped measurement of int64 value.
message Int64Value {
// start_time_unixnano is the time when the cumulative value was reset to zero.
// This is used for Counter type only. For Gauge the value is not specified and
// defaults to 0.
@@ -142,33 +181,65 @@ message Point {
// Value is UNIX Epoch time in nanoseconds since 00:00:00 UTC on 1 January 1970.
sfixed64 timestamp_unixnano = 2;

// The actual point value.
oneof value {
// A 64-bit integer.
int64 int64_value = 3;
// value itself.
int64 value = 3;
}

// A 64-bit double-precision floating-point number.
double double_value = 4;
// DoubleValue is a timestamped measurement of double value.
message DoubleValue {
// start_time_unixnano is the time when the cumulative value was reset to zero.
// This is used for Counter type only. For Gauge the value is not specified and
// defaults to 0.
//
// The cumulative value is over the time interval (start_time_unixnano, timestamp_unixnano].
// Value is UNIX Epoch time in nanoseconds since 00:00:00 UTC on 1 January 1970.
//
// Value of 0 indicates that the start_time is the same as that of the previous
// data point in this timeseries. When creating timeseries of this type it is recommended
// to omit this value if the start_time does not change, since it results in more
// compact encoding on the wire.
// If the value of 0 occurs for the first data point in the timeseries it means that
// the timestamp is unspecified. In that case the timestamp may be decided by the backend.
sfixed64 start_time_unixnano = 1;

// A histogram value.
HistogramValue histogram_value = 5;
// timestamp_unixnano is the moment when this value was recorded.
// Value is UNIX Epoch time in nanoseconds since 00:00:00 UTC on 1 January 1970.
sfixed64 timestamp_unixnano = 2;

// A summary value. This is not recommended, since it cannot be aggregated.
SummaryValue summary_value = 6;
}
// value itself.
double value = 3;
}

// Histogram contains summary statistics for a population of values. It may
// optionally contain the distribution of those values across a set of buckets.
message HistogramValue {
// start_time_unixnano is the time when the cumulative value was reset to zero.
// This is used for Counter type only. For Gauge the value is not specified and
// defaults to 0.
//
// The cumulative value is over the time interval (start_time_unixnano, timestamp_unixnano].
// Value is UNIX Epoch time in nanoseconds since 00:00:00 UTC on 1 January 1970.
//
// Value of 0 indicates that the start_time is the same as that of the previous
// data point in this timeseries. When creating timeseries of this type it is recommended
// to omit this value if the start_time does not change, since it results in more
// compact encoding on the wire.
// If the value of 0 occurs for the first data point in the timeseries it means that
// the timestamp is unspecified. In that case the timestamp may be decided by the backend.
sfixed64 start_time_unixnano = 1;

// timestamp_unixnano is the moment when this value was recorded.
// Value is UNIX Epoch time in nanoseconds since 00:00:00 UTC on 1 January 1970.
sfixed64 timestamp_unixnano = 2;

// The number of values in the population. Must be non-negative. This value
// must equal the sum of the values in bucket_counts if a histogram is
// provided.
int64 count = 1;
int64 count = 3;

// The sum of the values in the population. If count is zero then this field
// must be zero.
double sum = 2;
double sum = 4;

// A Histogram may optionally contain the distribution of the values in the
// population. The bucket boundaries are described by BucketOptions.
@@ -197,7 +268,7 @@ message HistogramValue {

// Don't change bucket boundaries within a TimeSeries if your backend doesn't
// support this.
BucketOptions bucket_options = 3;
BucketOptions bucket_options = 5;

message Bucket {
// The number of values in each bucket of the histogram, as described in
@@ -227,19 +298,38 @@ message HistogramValue {

// The sum of the values in the Bucket counts must equal the value in the
// count field of the histogram.
repeated Bucket buckets = 4;
repeated Bucket buckets = 6;
}

// The start_timestamp only applies to the count and sum in the SummaryValue.
message SummaryValue {
// start_time_unixnano is the time when the cumulative value was reset to zero.
// This is used for Counter type only. For Gauge the value is not specified and
// defaults to 0.
//
// The cumulative value is over the time interval (start_time_unixnano, timestamp_unixnano].
// Value is UNIX Epoch time in nanoseconds since 00:00:00 UTC on 1 January 1970.
//
// Value of 0 indicates that the start_time is the same as that of the previous
// data point in this timeseries. When creating timeseries of this type it is recommended
// to omit this value if the start_time does not change, since it results in more
// compact encoding on the wire.
// If the value of 0 occurs for the first data point in the timeseries it means that
// the timestamp is unspecified. In that case the timestamp may be decided by the backend.
sfixed64 start_time_unixnano = 1;

// timestamp_unixnano is the moment when this value was recorded.
// Value is UNIX Epoch time in nanoseconds since 00:00:00 UTC on 1 January 1970.
sfixed64 timestamp_unixnano = 2;

// The total number of recorded values since start_time. Optional since
// some systems don't expose this.
google.protobuf.Int64Value count = 1;
google.protobuf.Int64Value count = 3;

// The total sum of recorded values since start_time. Optional since some
// systems don't expose this. If count is zero then this field must be zero.
// This field must be unset if the sum is not available.
google.protobuf.DoubleValue sum = 2;
google.protobuf.DoubleValue sum = 4;

// The values in this message can be reset at arbitrary unknown times, with
// the requirement that all of them are reset at the same time.
@@ -269,6 +359,5 @@ message SummaryValue {
}

// Values calculated over an arbitrary time window.
Snapshot snapshot = 3;
Snapshot snapshot = 5;
}
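
The start_time_unixnano comments in the diff above say that a value of 0 means "same start time as the previous data point", and that a 0 on the first point means the start time is unspecified. A reader of such a series could resolve the effective start times roughly as follows. This is a sketch under assumptions: the struct is a hand-written stand-in for the generated Int64Value type, `resolveStartTimes` is a hypothetical helper, and an inherited start time is assumed to carry forward across several consecutive zeros.

```go
package main

import "fmt"

// Int64Value is a hand-written stand-in for the message of the same name.
type Int64Value struct {
	StartTimeUnixnano int64
	TimestampUnixnano int64
	Value             int64
}

// resolveStartTimes returns the effective start time of every point.
// A zero start time inherits the most recent non-zero one; if the series
// begins with zeros, those entries stay 0, meaning "unspecified".
func resolveStartTimes(points []Int64Value) []int64 {
	resolved := make([]int64, len(points))
	var current int64 // 0 until the first explicit start time is seen
	for i, p := range points {
		if p.StartTimeUnixnano != 0 {
			current = p.StartTimeUnixnano
		}
		resolved[i] = current
	}
	return resolved
}

func main() {
	points := []Int64Value{
		{StartTimeUnixnano: 100, TimestampUnixnano: 110, Value: 1},
		{StartTimeUnixnano: 0, TimestampUnixnano: 120, Value: 2},
		{StartTimeUnixnano: 250, TimestampUnixnano: 260, Value: 1},
		{StartTimeUnixnano: 0, TimestampUnixnano: 270, Value: 3},
	}
	fmt.Println(resolveStartTimes(points))
}
```

Omitting repeated start times is what the comment recommends for a more compact encoding; the cost is that a consumer has to process the points of a series in order.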
