Split timeseries by data types in Metrics Protobuf definitions
Previously we had a Point message whose value was one of the data types. This is
unnecessary flexibility because points in the same timeseries cannot be of different
data types. It also cost performance.

Now we have a separate timeseries message definition for each data type, and the
oneof entry in the Metric message determines which one is used.
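
To make the difference concrete, here is a minimal Python sketch of the two shapes. The classes are hypothetical stand-ins built with dataclasses, not the generated Protobuf code, and only the int64 and double variants are modeled. The `which_data` helper is an assumption made for illustration; it loosely imitates how a oneof reports which field is set.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Union


# Baseline shape: each point carries its own value type (the per-point oneof),
# so the structure itself does not stop a timeseries from mixing types.
@dataclass
class Point:
    timestamp_unixnano: int
    value: Union[int, float]  # stands in for `oneof value`


@dataclass
class TimeSeries:
    points: List[Point] = field(default_factory=list)


# Proposed shape: the value type is chosen once per metric (`oneof data`),
# and each timeseries type is declared to hold a single kind of point.
@dataclass
class Int64Value:
    timestamp_unixnano: int
    value: int


@dataclass
class DoubleValue:
    timestamp_unixnano: int
    value: float


@dataclass
class Int64TimeSeries:
    points: List[Int64Value] = field(default_factory=list)


@dataclass
class DoubleTimeSeries:
    points: List[DoubleValue] = field(default_factory=list)


@dataclass
class Metric:
    # At most one of these should be set, mirroring `oneof data`.
    int64_data: Optional[List[Int64TimeSeries]] = None
    double_data: Optional[List[DoubleTimeSeries]] = None

    def which_data(self) -> Optional[str]:
        """Return the name of the populated field, or None if none is set."""
        populated = [
            name
            for name in ("int64_data", "double_data")
            if getattr(self, name) is not None
        ]
        if len(populated) > 1:
            raise ValueError("only one data field may be set")
        return populated[0] if populated else None
```

In the baseline shape a decoder has to inspect the type of every point; in the proposed shape it checks the type once per metric and can then read all points of that metric uniformly.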

This change is stacked on top of open-telemetry#33

A simple benchmark in Go shows the following improvement in encoding and decoding
compared to the baseline:

```
===== Encoded sizes
Encoding                       Uncompressed  Improved        Compressed  Improved
Baseline/MetricOne              24200 bytes  [1.000], gziped 1804 bytes  [1.000]
Proposed/MetricOne              19400 bytes  [1.247], gziped 1626 bytes  [1.109]

Encoding                       Uncompressed  Improved        Compressed  Improved
Baseline/MetricSeries           56022 bytes  [1.000], gziped 6655 bytes  [1.000]
Proposed/MetricSeries           43415 bytes  [1.290], gziped 6422 bytes  [1.036]

goos: darwin
goarch: amd64
pkg: github.com/tigrannajaryan/exp-otelproto/encodings
BenchmarkEncode/Baseline/MetricOne-8         	      27	 207923054 ns/op
BenchmarkEncode/Proposed/MetricOne-8         	      44	 133984867 ns/op

BenchmarkEncode/Baseline/MetricSeries-8      	       8	 649581262 ns/op
BenchmarkEncode/Proposed/MetricSeries-8      	      18	 324559562 ns/op

BenchmarkDecode/Baseline/MetricOne-8         	      15	 379468217 ns/op	186296043 B/op	 5274000 allocs/op
BenchmarkDecode/Proposed/MetricOne-8         	      21	 278470120 ns/op	155896034 B/op	 4474000 allocs/op

BenchmarkDecode/Baseline/MetricSeries-8      	       5	1041719362 ns/op	455096051 B/op	12174000 allocs/op
BenchmarkDecode/Proposed/MetricSeries-8      	       9	 603392754 ns/op	338296035 B/op	 8574000 allocs/op
```

In this benchmark the proposed definitions are roughly 30-50% faster to encode and decode,
and roughly 20-25% smaller on the wire (uncompressed) and in memory.
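
These percentages can be re-derived from the raw figures. The short Python sketch below copies the numbers from the benchmark output and computes two things: the baseline-to-proposed ratio, which is consistent with the values printed in the "Improved" column, and the percentage reduction. The function names are illustrative and are not part of the benchmark code.

```python
# Figures copied from the benchmark output above: (baseline, proposed).
uncompressed_bytes = {
    "MetricOne": (24200, 19400),
    "MetricSeries": (56022, 43415),
}
encode_ns_per_op = {
    "MetricOne": (207923054, 133984867),
    "MetricSeries": (649581262, 324559562),
}
decode_ns_per_op = {
    "MetricOne": (379468217, 278470120),
    "MetricSeries": (1041719362, 603392754),
}


def improvement_ratio(baseline, proposed):
    """Baseline divided by proposed; values above 1 mean the proposal is better."""
    return baseline / proposed


def reduction_percent(baseline, proposed):
    """How much smaller (or quicker) the proposed figure is, in percent."""
    return 100.0 * (baseline - proposed) / baseline


tables = [
    ("size", uncompressed_bytes),
    ("encode", encode_ns_per_op),
    ("decode", decode_ns_per_op),
]
for label, table in tables:
    for name, (baseline, proposed) in table.items():
        ratio = improvement_ratio(baseline, proposed)
        reduction = reduction_percent(baseline, proposed)
        print(f"{label:<7} {name:<13} ratio {ratio:.3f}  reduction {reduction:.1f}%")
```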

The benchmarks encode and decode 500 batches of 2 metrics: one int64 Gauge with 5 time series
and one Histogram of doubles with 1 time series and a single bucket. Each time series in
both metrics contains either 1 data point (MetricOne) or 5 data points (MetricSeries).
Both metrics have 2 labels.
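
As a rough sense of scale, the workload described above can be counted in data points. This is a small arithmetic sketch based only on the description in this message; the constant and function names are made up for illustration, and it assumes each benchmark operation processes all 500 batches.

```python
BATCHES = 500
GAUGE_SERIES = 5       # time series in the int64 Gauge metric
HISTOGRAM_SERIES = 1   # time series in the Histogram metric


def points_per_batch(points_per_series):
    """Data points in one batch of the two metrics."""
    return (GAUGE_SERIES + HISTOGRAM_SERIES) * points_per_series


def total_points(points_per_series):
    """Data points across all batches."""
    return BATCHES * points_per_batch(points_per_series)


for name, per_series in [("MetricOne", 1), ("MetricSeries", 5)]:
    print(name, points_per_batch(per_series), total_points(per_series))
```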

Benchmark source code is available at:
https://github.com/tigrannajaryan/exp-otelproto/blob/master/encodings/encoding_test.go
Tigran Najaryan committed Oct 30, 2019
1 parent fdc331a commit 6769b93
Showing 1 changed file with 144 additions and 32 deletions.
176 changes: 144 additions & 32 deletions opentelemetry/proto/metrics/v1/metrics.proto
@@ -28,13 +28,19 @@ message Metric {
// The descriptor of the Metric.
MetricDescriptor metric_descriptor = 1;

// One or more timeseries for a single metric, where each timeseries has
// one or more points.
repeated TimeSeries timeseries = 2;

// The resource for the metric. If unset, it may be set to a default value
// provided for a sequence of messages in an RPC stream.
opentelemetry.proto.resource.v1.Resource resource = 3;
opentelemetry.proto.resource.v1.Resource resource = 2;

// data is a list of one or more TimeSeries for a single metric, where each timeseries has
// one or more points. Only one of the following fields is used for the data, depending on
// the type of the metric defined by MetricDescriptor.type field.
oneof data {
Int64TimeSeriesList int64_data = 3;
DoubleTimeSeriesList double_data = 4;
HistogramTimeSeriesList histogram_data = 7;
SummaryTimeSeriesList summary_data = 8;
}
}

// Defines a metric type and its schema.
@@ -100,17 +106,73 @@ message MetricDescriptor {
repeated string label_keys = 5;
}

// A collection of data points that describes the time-varying values
// of a metric.
message TimeSeries {
// Int64TimeSeriesList is a list of timeseries of int64 values.
message Int64TimeSeriesList {
repeated Int64TimeSeries list = 1;
}

// DoubleTimeSeriesList is a list of timeseries of double values.
message DoubleTimeSeriesList {
repeated DoubleTimeSeries list = 1;
}


// HistogramTimeSeriesList is a list of timeseries of Histogram.
message HistogramTimeSeriesList {
repeated HistogramTimeSeries list = 1;
}

// SummaryTimeSeriesList is a list of timeseries of Summary.
message SummaryTimeSeriesList {
repeated SummaryTimeSeries list = 1;
}

// Int64TimeSeries is a list of data points that describes the time-varying values
// of an int64 metric.
message Int64TimeSeries {
// The set of label values that uniquely identify this timeseries. Applies to
// all points. The order of label values must match that of label keys in the
// metric descriptor.
repeated LabelValue label_values = 1;

// The data points of this timeseries. Point.value type MUST match the
// MetricDescriptor.type.
repeated Point points = 2;
// The data points of this timeseries.
repeated Int64Value points = 2;
}

// DoubleTimeSeries is a list of data points that describes the time-varying values
// of a double metric.
message DoubleTimeSeries {
// The set of label values that uniquely identify this timeseries. Applies to
// all points. The order of label values must match that of label keys in the
// metric descriptor.
repeated LabelValue label_values = 1;

// The data points of this timeseries.
repeated DoubleValue points = 2;
}

// HistogramTimeSeries is a list of data points that describes the time-varying values
// of a Histogram.
message HistogramTimeSeries {
// The set of label values that uniquely identify this timeseries. Applies to
// all points. The order of label values must match that of label keys in the
// metric descriptor.
repeated LabelValue label_values = 1;

// The data points of this timeseries.
repeated HistogramValue points = 2;
}

// SummaryTimeSeries is a list of data points that describes the time-varying values
// of a Summary metric.
message SummaryTimeSeries {
// The set of label values that uniquely identify this timeseries. Applies to
// all points. The order of label values must match that of label keys in the
// metric descriptor.
repeated LabelValue label_values = 1;

// The data points of this timeseries.
repeated SummaryValue points = 2;
}

message LabelValue {
@@ -121,8 +183,8 @@ message LabelValue {
bool has_value = 2;
}

// A timestamped measurement.
message Point {
// Int64Value is a timestamped measurement of int64 value.
message Int64Value {
// start_time_unixnano is the time when the cumulative value was reset to zero.
// This is used for Counter type only. For Gauge the value is not specified and
// defaults to 0.
@@ -142,33 +204,65 @@ message Point {
// Value is UNIX Epoch time in nanoseconds since 00:00:00 UTC on 1 January 1970.
sfixed64 timestamp_unixnano = 2;

// The actual point value.
oneof value {
// A 64-bit integer.
int64 int64_value = 3;
// value itself.
int64 value = 3;
}

// A 64-bit double-precision floating-point number.
double double_value = 4;
// DoubleValue is a timestamped measurement of double value.
message DoubleValue {
// start_time_unixnano is the time when the cumulative value was reset to zero.
// This is used for Counter type only. For Gauge the value is not specified and
// defaults to 0.
//
// The cumulative value is over the time interval [start_time_unixnano, timestamp_unixnano].
// Value is UNIX Epoch time in nanoseconds since 00:00:00 UTC on 1 January 1970.
//
// Value of 0 indicates that the start_time is the same as that of the previous
// data point in this timeseries. When creating timeseries of this type it is recommended
// to omit this value if the start_time does not change, since it results in more
// compact encoding on the wire.
// If the value of 0 occurs for the first data point in the timeseries it means that
// the timestamp is unspecified. In that case the timestamp may be decided by the backend.
sfixed64 start_time_unixnano = 1;

// A histogram value.
HistogramValue histogram_value = 5;
// timestamp_unixnano is the moment when this value was recorded.
// Value is UNIX Epoch time in nanoseconds since 00:00:00 UTC on 1 January 1970.
sfixed64 timestamp_unixnano = 2;

// A summary value. This is not recommended, since it cannot be aggregated.
SummaryValue summary_value = 6;
}
// value itself.
double value = 3;
}

// Histogram contains summary statistics for a population of values. It may
// optionally contain the distribution of those values across a set of buckets.
message HistogramValue {
// start_time_unixnano is the time when the cumulative value was reset to zero.
// This is used for Counter type only. For Gauge the value is not specified and
// defaults to 0.
//
// The cumulative value is over the time interval [start_time_unixnano, timestamp_unixnano].
// Value is UNIX Epoch time in nanoseconds since 00:00:00 UTC on 1 January 1970.
//
// Value of 0 indicates that the start_time is the same as that of the previous
// data point in this timeseries. When creating timeseries of this type it is recommended
// to omit this value if the start_time does not change, since it results in more
// compact encoding on the wire.
// If the value of 0 occurs for the first data point in the timeseries it means that
// the timestamp is unspecified. In that case the timestamp may be decided by the backend.
sfixed64 start_time_unixnano = 1;

// timestamp_unixnano is the moment when this value was recorded.
// Value is UNIX Epoch time in nanoseconds since 00:00:00 UTC on 1 January 1970.
sfixed64 timestamp_unixnano = 2;

// The number of values in the population. Must be non-negative. This value
// must equal the sum of the values in bucket_counts if a histogram is
// provided.
int64 count = 1;
int64 count = 3;

// The sum of the values in the population. If count is zero then this field
// must be zero.
double sum = 2;
double sum = 4;

// A Histogram may optionally contain the distribution of the values in the
// population. The bucket boundaries are described by BucketOptions.
@@ -197,7 +291,7 @@ message HistogramValue {

// Don't change bucket boundaries within a TimeSeries if your backend doesn't
// support this.
BucketOptions bucket_options = 3;
BucketOptions bucket_options = 5;

message Bucket {
// The number of values in each bucket of the histogram, as described in
@@ -227,19 +321,38 @@ message HistogramValue {

// The sum of the values in the Bucket counts must equal the value in the
// count field of the histogram.
repeated Bucket buckets = 4;
repeated Bucket buckets = 6;
}

// The start_timestamp only applies to the count and sum in the SummaryValue.
message SummaryValue {
// start_time_unixnano is the time when the cumulative value was reset to zero.
// This is used for Counter type only. For Gauge the value is not specified and
// defaults to 0.
//
// The cumulative value is over the time interval [start_time_unixnano, timestamp_unixnano].
// Value is UNIX Epoch time in nanoseconds since 00:00:00 UTC on 1 January 1970.
//
// Value of 0 indicates that the start_time is the same as that of the previous
// data point in this timeseries. When creating timeseries of this type it is recommended
// to omit this value if the start_time does not change, since it results in more
// compact encoding on the wire.
// If the value of 0 occurs for the first data point in the timeseries it means that
// the timestamp is unspecified. In that case the timestamp may be decided by the backend.
sfixed64 start_time_unixnano = 1;

// timestamp_unixnano is the moment when this value was recorded.
// Value is UNIX Epoch time in nanoseconds since 00:00:00 UTC on 1 January 1970.
sfixed64 timestamp_unixnano = 2;

// The total number of recorded values since start_time. Optional since
// some systems don't expose this.
google.protobuf.Int64Value count = 1;
google.protobuf.Int64Value count = 3;

// The total sum of recorded values since start_time. Optional since some
// systems don't expose this. If count is zero then this field must be zero.
// This field must be unset if the sum is not available.
google.protobuf.DoubleValue sum = 2;
google.protobuf.DoubleValue sum = 4;

// The values in this message can be reset at arbitrary unknown times, with
// the requirement that all of them are reset at the same time.
@@ -269,6 +382,5 @@ message SummaryValue {
}

// Values calculated over an arbitrary time window.
Snapshot snapshot = 3;
Snapshot snapshot = 5;
}
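
The start_time_unixnano comments in the value messages above say that 0 means "same start time as the previous data point", and that a 0 on the first point means the start time is unspecified. A reader of such a timeseries might expand that encoding as in the following Python sketch. The function name and the use of `None` for "unspecified" are assumptions made for illustration; they are not part of the proto definitions.

```python
from typing import List, Optional


def resolve_start_times(raw_start_times: List[int]) -> List[Optional[int]]:
    """Expand the 0-means-unchanged encoding into explicit start times.

    A non-zero entry sets a new start time. A zero entry repeats the most
    recent start time. None marks points whose start time is unspecified
    because no non-zero value has been seen yet.
    """
    resolved: List[Optional[int]] = []
    previous: Optional[int] = None
    for raw in raw_start_times:
        if raw != 0:
            previous = raw
        resolved.append(previous)
    return resolved


# Five points: the counter was reset before the first and fourth points.
print(resolve_start_times([100, 0, 0, 250, 0]))
```

Where the result contains `None`, the comments leave the choice of start time to the backend.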
