Proposal for metric proto refactor #137
Conversation
Update metric descriptor to contain information on the data qualities. This includes:

- The meaning of the time interval measurements were made over is now described by the `interval` value.
- The monotonic nature of the data is captured if known.
- If the data values are restricted to a subset of numbers.

The `SummaryDataPoint` and `HistogramDataPoint` were merged into a unified `Distribution` message that contains statistics about the observed population of values. Adds support for multiple histogram bucket definitions and adds a linear bucket definition message. Includes an explicit minimum and maximum for a `Distribution` (open-telemetry#122).

Removes the exemplar code (open-telemetry#81). This can be added back in as a more generalized case given the new `Data` structure of the Metrics, but it has been left for a separate PR.

Unifies the common parts of the previous `*DataPoints` into a unified `Data` message.

This is not complete: it likely needs better names and it certainly needs comments. I wanted to get this posted sooner rather than later to not let perfection hinder progress.

Related to open-telemetry#125
Fixes open-telemetry#81
I've marked this as a draft as this is still pretty rough and I expect to review and iterate on this. However, I would still appreciate anyone who has participated in #125 taking a look (@tigrannajaryan, @c24t, @jkwatson, @jmacd) and providing any thoughts you have.
int64 value = 4;
}
// Subset describes the subset of numbers metric values belong to if any.
enum Subset {
What about `NumericQuality` to go along with the proposed `TemporalQuality`?
@MrAlias any opinion on flattening this enum and making each instrument refinement like this a flag?
// monotonic_increasing = 0x1
// monotonic_decreasing = 0x2
// subset_nonnegative = 0x4
// subset_nonpositive = 0x8
// ...
optional uint32 refinements;
@c24t : When I looked into implementing the `refinement` approach I wasn't sure how to restrict mutually exclusive refinements using the flag system. Using enums like the current approach builds these restrictions into the syntax, but I'm not sure how to achieve the same otherwise (i.e. you could set both `monotonic_increasing` and `monotonic_decreasing`).
Do you have a way in mind to prevent this with this kind of implementation? Want to make sure I didn't miss something (because I like the uniformity of this approach).
AFAIK there's no way to prevent invalid bitsets in the proto itself, I assume it'd be up to the collector to return some sensible error.
One benefit of using flags instead of enums is that we can add refinements without updating the proto package. (Whether this is a benefit or a detriment maybe depends on your use case: it would be very bad news if the client and the collector interpreted this field differently!) But this is only true if the proto doesn't know anything about refinement semantics.
I don't have a strong opinion on this vs. enums, just pointing out that there's a tradeoff here between safety and development speed.
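For illustration, here is a minimal Go sketch of the kind of receiver/collector-side check described above. The flag names and values mirror the comment earlier in this thread and are purely illustrative, not part of the proto:

```go
package main

import (
	"errors"
	"fmt"
)

// Hypothetical refinement flags mirroring the values suggested above; the
// names are illustrative and not part of any proto definition.
const (
	MonotonicIncreasing uint32 = 0x1
	MonotonicDecreasing uint32 = 0x2
	SubsetNonNegative   uint32 = 0x4
	SubsetNonPositive   uint32 = 0x8
)

// validateRefinements rejects bitsets that combine flags treated here as
// mutually exclusive, since the proto itself cannot express that restriction.
func validateRefinements(r uint32) error {
	if r&MonotonicIncreasing != 0 && r&MonotonicDecreasing != 0 {
		return errors.New("monotonic_increasing and monotonic_decreasing are mutually exclusive")
	}
	if r&SubsetNonNegative != 0 && r&SubsetNonPositive != 0 {
		return errors.New("subset_nonnegative and subset_nonpositive are mutually exclusive")
	}
	return nil
}

func main() {
	fmt.Println(validateRefinements(MonotonicIncreasing | SubsetNonNegative)) // <nil>
	fmt.Println(validateRefinements(MonotonicIncreasing | MonotonicDecreasing))
}
```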
Thanks for starting this @MrAlias, I think this is going to force us to answer some hard questions about metrics aggregation.
My comments are more questions than suggestions, but I think this is the right direction.
To make sure I understand this, my assumptions about this PR are:
- We have a user-facing API that includes several metric instrument types. An instrument type determines which aggregations are valid via its "refinements". E.g. an instrument without the "monotonic" refinement can't be aggregated to produce a rate.
- Exporters consume aggregated data. The exporter data model includes types like `MinMaxSumCount`, and exporters are expected to be able to convert from each internal type to a backend-specific (or "exposition") format.
- We want to send aggregated data to the collector via the collector exporter, and then export data from the collector via various backend-specific exporters.
- Therefore this proto should match (or be at least as expressive as) our internal exporter data model.
@jmacd mentioned an "exact" aggregation in https://github.com/open-telemetry/opentelemetry-specification/pull/347/files#diff-5b01bbf3430dde7fc5789b5919d03001R254-R259 that exports values in an array instead of aggregating them. I don't know whether we're still planning to support this, but if so would you expect to send each value as a separate `Data` (all with the same labels) or create a new `Value` type that can store multiple values?
}
// monotonic is true if the metric values are entirely non-increasing, or
// entirely non-decreasing.
bool monotonic = 5;
This raises a question I had about open-telemetry/oteps#88: are instrument refinements meant to be passed to exporters, or just describe the values that different metric instruments can accept?
> these refinements serve to define the set of instruments. Both users and exporters will deal in concrete kinds of instrument, these refinements are just for explaining their properties.
Assuming the goal is to use the same data model here as in regular in-process exporters, this suggests we should be sending the metric instrument type instead of the set of refinements. But it's not clear to me that this is the right approach -- it may make sense to use the metric instruments from open-telemetry/oteps#88 in the user-facing API and use a different data model for exporters.
Unrelated to the comment above: what's the use case for non-increasing metrics?
It's easier to describe if we encode the concrete types as an enum, but less forward-compatible if we add more and more instruments. It's harder to describe if we include refinements individually. (I'm still thinking about this...)
This is a great question. Don't want to leave it unaddressed, but I'm still contemplating the answer to this one.
fixed64 time_unix_nano = 3;
// Value is the measured value.
oneof Value {
This means `metric.data` might have a mix of value types, right? Vs. https://github.com/open-telemetry/opentelemetry-proto/pull/137/files#diff-ad849336b0a70e8a18346421908f69edL103-L106.
I'm debating whether the descriptor should indicate the expected Data field, or the Data struct should do so itself. I can see it both ways. It's simpler if the descriptor identifies a constant field to be expected. I've also described how I believe that we should be able to report single-value distributions using a single value, which means that an observer reporting a distribution should report a scalar value when there's no aggregation, but should report a distribution value when there is aggregation.
Yeah, it currently does. I think (for a variety of reasons) I'm going to go back to a format like before.
I think it is still possible in the old format for the sender to send a mix of data, but with the distinct types it is likely easier to ignore this error.
@jmacd : That's a good point. I'm not sure either. I can see the inclusion of the type in the `Data` itself allowing for simpler extension of a metric instrument (likely going to be really useful when there are multiple views on the same instrument), but also how it might be easier to reason about a distinct metric descriptor that fully defines the data.
I pushed a change to put this info in the descriptor. If you come to a different conclusion @jmacd, let me know.
Thanks for the proposal. I don't know enough about metrics to tell if this matches OpenTelemetry semantics, so I'll only comment on the performance aspect of the schema.
I left a few comments about `oneof`, about field alignment, and about extra embedded messages. Based on these I am guessing no benchmarking was done (if it was done please post the results, comparing to current OTLP and to OpenCensus).
The proto that we have so far was very carefully benchmarked to get the most performance from both a CPU and RAM usage perspective and a wire size perspective. It is important that the same is done with the new proposal (assuming it is semantically correct) and that the schema is optimized to get the most performance.
}
// monotonic is true if the metric values are entirely non-increasing, or
// entirely non-decreasing.
bool monotonic = 5;
The way field alignment works in Go (and probably in other languages), it is important to order the fields in a message in a way that avoids unnecessary gaps. Having a `bool` packed between 2 enums, which have 32-bit alignment, will result in 3 wasted bytes.
I may be wrong on my mental math of field alignment, so please benchmark - it shows memory usage too.
@tigrannajaryan I think protobuf go should do this optimization internally. If this is the case it is probably worth filing a bug against the protobuf library: it should pack the members in the best way to minimize memory usage.
I would not try right now to optimize on that.
@bogdandrutu the in-memory representation of a ProtoBuf message is a Go `struct`. If I remember correctly, the order of the fields in the struct matches their order in the ProtoBuf declaration. I do not think the ProtoBuf generator reorders fields in order to reduce wasted memory due to alignment requirements.
I do not think it should be considered a ProtoBuf generator bug. I would expect that the struct fields preserve the order, otherwise the readability of the code will be impacted.
This is a lower-priority issue, but at least I would check the memory usage to see how much we are wasting. Perhaps it is negligible.
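(As a side note on the padding question, here is a standalone Go sketch, unrelated to the generated protobuf structs, showing how field ordering changes struct size via `unsafe.Sizeof`. Whether the descriptor's actual field mix wastes bytes still needs the suggested benchmark.)

```go
package main

import (
	"fmt"
	"unsafe"
)

// Fields declared in an order that forces padding: the bool before the int64
// leaves 7 unused bytes, and the trailing bool leaves 7 more.
type unordered struct {
	a bool
	b int64
	c bool
}

// Same fields, widest first: both bools fit in the final word.
type ordered struct {
	b int64
	a bool
	c bool
}

func main() {
	fmt.Println(unsafe.Sizeof(unordered{})) // 24 on 64-bit platforms
	fmt.Println(unsafe.Sizeof(ordered{}))   // 16 on 64-bit platforms
}
```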
message Data {
// The set of labels that uniquely identify this timeseries.
//
// These labels take precedence over any labels with the same key defined
May we treat a duplicate label key found in the Data vs. the Descriptor as an error?
I think you're correct and we should treat this as an error.
Do you think we should specify a behavior for when this error occurs? Like the receiver will disregard the data? Or maybe the behavior is undefined and left up to the receiver to handle?
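As a sketch of the error-on-duplicate behavior discussed here, with plain string maps standing in for the repeated label key/value fields (the function name is illustrative, not from the proto):

```go
package main

import "fmt"

// mergeLabels combines descriptor-level and data-point labels. Rather than
// letting one set take precedence, a key present in both is reported as an
// error, matching the behavior discussed above.
func mergeLabels(descriptor, data map[string]string) (map[string]string, error) {
	merged := make(map[string]string, len(descriptor)+len(data))
	for k, v := range descriptor {
		merged[k] = v
	}
	for k, v := range data {
		if _, dup := merged[k]; dup {
			return nil, fmt.Errorf("duplicate label key %q in Data and Descriptor", k)
		}
		merged[k] = v
	}
	return merged, nil
}

func main() {
	_, err := mergeLabels(
		map[string]string{"service": "checkout"},
		map[string]string{"service": "cart", "host": "a1"},
	)
	fmt.Println(err) // duplicate label key "service" in Data and Descriptor
}
```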
// This time MUST be reset so as to represent a new measurement lifetime
// if this data point is part of a timeseries for a CUMULATIVE metric and
// the metric value is reset to zero.
fixed64 start_time_unix_nano = 2;
I've read and re-read this explanation, and I think you're saying what I think we want, but we should clarify this, especially for cumulative values. I found a helpful explanation in Stackdriver's documentation:
https://cloud.google.com/monitoring/api/ref_v3/rest/v3/TimeSeries#point
> The time interval to which the data point applies. For GAUGE metrics, the start time is optional, but if it is supplied, it must equal the end time. For DELTA metrics, the start and end time should specify a non-zero interval, with subsequent points specifying contiguous and non-overlapping intervals. For CUMULATIVE metrics, the start and end time should specify a non-zero interval, with subsequent points specifying the same start time and increasing end times, until an event resets the cumulative value to zero and sets a new start time for the following points.
For DELTA instruments: can we recommend that start and end times equal the start and end time of the interval? If so, in a simple configuration where all metrics are collected on the same interval, all Instantaneous and Delta measurements can infer their end (Instantaneous) or start/end (Delta) time from the report and be skipped in the Data field. At least I think we should have an option to leave out timestamps when they can be inferred from the report.
We have to pay particular attention to how cumulatives are reported. The start-time of a cumulative should remain constant as long as the cumulative is not reset, as stated in the Stackdriver docs. It means that the SDK will be responsible for remembering cumulative values that are actively being reported. When some number of intervals pass without reporting that cumulative, the SDK may forget it, as follows.
1. Remember the end timestamp of the last interval that reported the metric.
2. Reset all values for the instrument, retaining only the memory of the end timestamp.
3. Any new cumulative value for the instrument (i.e., distinct label set) will be reported as having a start time equal to the last timestamp where the instrument was reset.
Item (2) ensures that we can aggregate cumulative values for a single metric within the process, since all values will have the same reset time. This does not ensure that multiple instruments can be aggregated in the process: an error condition would arise as we try to aggregate multiple cumulative values with different start times. The same condition may arise in a distributed setting, where if multiple processes report the same metric with different cumulative start times, aggregation is not possible. I suggest we leave it at that: there is a known condition where cumulatives cannot be aggregated except when they have the same start time.
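A rough Go sketch of the bookkeeping described above; the types and names are hypothetical and not from any SDK, just to make the reset procedure concrete:

```go
package main

import (
	"fmt"
	"time"
)

// cumulativeState tracks, per instrument, the start time that every
// cumulative point for that instrument must share.
type cumulativeState struct {
	startTime time.Time        // start (or last reset) time for the instrument
	values    map[string]int64 // cumulative value per label set (encoded as a string key)
}

// reset forgets the accumulated values but keeps the end timestamp of the
// last reported interval, which becomes the start time of all future points.
func (s *cumulativeState) reset(lastReportEnd time.Time) {
	s.startTime = lastReportEnd
	s.values = make(map[string]int64)
}

// record adds a measurement; a label set first seen after a reset still
// reports startTime as its start, so all points share one reset time.
func (s *cumulativeState) record(labels string, delta int64) {
	s.values[labels] += delta
}

func main() {
	s := &cumulativeState{startTime: time.Now(), values: map[string]int64{}}
	s.record("path=/a", 1)

	// The instrument went unreported for several intervals: forget the values,
	// remember only the timestamp of the last reported interval.
	s.reset(time.Now())

	s.record("path=/b", 1) // new label set, its start time is the reset time
	fmt.Println(s.startTime.Unix(), s.values)
}
```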
Thanks for the suggestions, they are much appreciated.
I've updated the time interval handling. PTAL and let me know if you have any other suggestions.
// If this data point is a part of a timeseries for an DELTA, or
// CUMULATIVE metric, this value represents the instant before the
// measuring time interval began (i.e. for time t to be in the measuring
// interval than start_time_unix_nano < t <= time_unix_nano).
This `start < t <= time` bothers me. Why not `start <= t < time`? I state this because it's possible for an event to happen concurrently with the start of a time window and be recorded in that window, but it's not possible for an event to happen concurrently with the end of a window and be included in the window.
The way "time" is implemented in current systems cannot guarantee a happens-before relationship, so I would not bother to justify what should happen if an event happens at that same moment - unless there is another way to synchronize these events, time cannot guarantee their ordering.
> I state this because it's possible for an event to happen concurrently with the start of a time window and be recorded in that window, but it's not possible for an event to happen concurrently with the end of a window and be included in the window.
If an event happens at the same time as we close the window (if you read `time.Now` exactly when the event happens and exactly when we close the window), there is no guarantee whether the event will be included in the current window or the next window unless there are other mechanisms to ensure the happens-before relationship, so neither option is ideal.
The idea behind using this (start, end] definition of the interval is that when using an Observer the value is "added" at the end time, so it feels more natural this way - again, none of the versions can be 100% correct, so we also look at what others do, like OpenMetrics/Stackdriver, those being the systems which have a start time.
> we also look at what others do like OpenMetrics/Stackdriver
@bogdandrutu : where is this defined for OpenMetrics?
FWIW, Stackdriver is an inclusive interval: https://cloud.google.com/monitoring/api/ref_v3/rest/v3/TimeInterval
E.g. [start, end]
Looking at a token example of a check-pointing process:
=========================
|| Checkpointer ||
=========================
-------------------------
| Start: t_0 |
-------------------------
| -----------------
| <-------------------------| measurement_0 |
| -----------------
-------------------------
| Checkpoint 1 |
| --------------------- | -----------------
| Begin Checkpoint: t_1 | <--------| measurement_1 |
| | -----------------
| Processing... |
| | -----------------
| | <--------| measurement_2 |
| | -----------------
-------------------------
| -----------------
| <-------------------------| measurement_3 |
| -----------------
-------------------------
| Checkpoint 2 |
| --------------------- |
| Begin Checkpoint: t_2 |
| |
| Processing... |
| |
-------------------------
|
...
If `start <= t < time`:
- Checkpoint 1 (`t_0 <= t < t_1`) includes: measurement_0
- Checkpoint 2 (`t_1 <= t < t_2`) includes: measurement_1, measurement_2, measurement_3

If `start < t <= time`:
- Checkpoint 1 (`t_0 < t <= t_1`) includes: measurement_0, measurement_1
- Checkpoint 2 (`t_1 < t <= t_2`) includes: measurement_2, measurement_3

(side note: with this interval definition, start is not included in any interval. This is not ideal, but likely minor enough to not be a real problem in the real world)

@jmacd : From my understanding of your comment, you are saying that the `t_0 < t <= t_1` definition of the interval means that while Checkpoint 1 should contain both measurement_0 and measurement_1, it is not possible to do in practice. Mainly because the checkpoint process is no longer receiving measurements when `t_1` is defined by the process, right?
I talked to one of the StackDriver authors and they explained to me that they actually don't want to have an inclusive interval because that may cause a duplicate "report" (they also said that in practice nobody implements it that way - if an event happens at exactly the same time as the end time of the interval it will not be included in both deltas; the decision to be in the current delta or the next delta happens using other synchronization mechanisms that ensure a happens-before relationship and ensure events are counted only once).
> I talked to one of the StackDriver authors and they explained to me that they actually don't want to have an inclusive interval because that may cause a duplicate "report" (they also said that in practice nobody implements it that way - if an event happens at exactly the same time as the end time of the interval it will not be included in both deltas; the decision to be in the current delta or the next delta happens using other synchronization mechanisms that ensure a happens-before relationship and ensure events are counted only once).
Thanks for the link and info. We should not consider a revision of an open PR and undocumented behavior given in a conversation to be prior art. Furthermore, I think the point @jmacd made is a technical one and I think we should engage the point on that level.
@MrAlias I stand by my previous comment: using time to determine a happens-before relationship is almost impossible (or very hard to implement unless we are going to implement something like true-time). So I am not sure I understand your example, and how it would be possible for an event to happen before we start the checkpointer.
I think the discussion about inclusive vs exclusive start time is important if we allow recording an event while creating the instrument, which I think we don't allow; otherwise all events will fit in one of the deltas that we produce.
@bogdandrutu: Hmm, it seems like we might be talking past each other. I'm wondering if this is stemming from the differences in concurrency design in the implementations we work in.
Maybe this is something we can talk about in the upcoming Metric SIG meeting? I'll add it to the agenda if you can make the meeting.
Sorry, but this became impossible to review; the amount of change is too big and there are too many changes in one PR. I would try to review, but I reject merging this PR and encourage a split into smaller PRs.
}
// monotonic is true if the metric values are entirely non-increasing, or
// entirely non-decreasing.
bool monotonic = 5;
This is not enough for systems like Prometheus/Stackdriver/(probably others) which expect only monotonic increasing. How can an exporter for Prometheus/Stackdriver know if it is monotonic increasing or decreasing?
this and the subset value define the concept you are talking about.
Sorry, just re-read. 😑
I see the omission you are talking about. I'll be sure to distinguish monotonic increasing/decreasing.
message ValueAtPercentile {
// Distribution is a data point in a timeseries containing statistics for
// an observed population of values.
message Distribution {
I am not sure I understand why these are merged. Also, this can be done in a separate PR.
// added to the lifetime total of the metric.
//
// Each reported value for a CUMULATIVE metric is the sum of all
// measurements up to and including that one.
I don't think this sentence is accurate, as it reads almost like a math sequence V_n = Sum_{i=1..n-1} V_i
That formula is mostly correct based on my understanding of what we are trying to convey. The only thing I would add is that the latest measurement is included in the sum, i.e. `Value_n = Sum_{i=1...n} Measurement_i`.
Do you have a different way of understanding what a CUMULATIVE metric is, or maybe a better way of stating this?
Does CUMULATIVE only apply to sums?
Your formula is still confusing (note that mine didn't have observations at all, it was purely a deterministic math sequence), because `Value` refers to reported values, which can have multiple measurements in-between. E.g. if we're reporting request count every second, but in the API we're calling `counter.Inc(1)` for every request, then I expect `Measurement_i == 1`, while `Value_j` are cumulative counts.
> Does CUMULATIVE only apply to sums?
Good point. No, I don't think it does in theory. Likely in practice it does, but I'll change the sentence to be more general.
I don't want to let the discussion of `Sum`s distract from the underlying question you have about the sentence. To that end, maybe we can use your example to make sure we share a common understanding of what a CUMULATIVE value is. To expand on it:
- The system starts.
- A request comes in, we call `counter.Inc(1)`.
- Another request comes in, we call `counter.Inc(1)`.
- Another request comes in, we call `counter.Inc(1)`.
- The 1 second collection cycle ends. A metric is created for the export pipeline with a value of `3`.
- A request comes in, we call `counter.Inc(1)`.
- Another request comes in, we call `counter.Inc(1)`.
- The 1 second collection cycle ends. A metric is created for the export pipeline with a value of `5` (`2` from this collection cycle, `3` from the previous, totaling `5`, assuming a sum aggregation).
- A request comes in, we call `counter.Inc(1)`.
- The 1 second collection cycle ends. A metric is created for the export pipeline with a value of `6`.

The value for the metric in this example is said to be CUMULATIVE. Does this sound correct to you?
If so, is there a way you think we can rephrase the sentence to better describe this behavior?
If not, can you provide a revision of the above example that matches what you are thinking?
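If it helps, the example above can be written as a tiny runnable Go sketch, with a plain `int64` standing in for the real counter instrument:

```go
package main

import "fmt"

func main() {
	// Measurements grouped by 1-second collection cycle, mirroring the
	// example above: three Inc(1) calls, then two, then one.
	cycles := [][]int64{
		{1, 1, 1},
		{1, 1},
		{1},
	}

	var cumulative int64
	for i, measurements := range cycles {
		for _, m := range measurements {
			cumulative += m // stands in for counter.Inc(1)
		}
		// A CUMULATIVE export reports the running total of every measurement
		// over the instrument's lifetime, not just this cycle's delta.
		fmt.Printf("cycle %d exports value %d\n", i+1, cumulative)
	}
	// Prints 3, 5, and 6, matching the example above.
}
```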
If you want to generalize from sum, then it's an aggregation of all measurements from some fixed point in time up to the last reporting snapshot. And I think it's fine to provide a small example as well.
But I am still curious what aggregations other than a sum this could be used for.
> If you want to generalize from sum, then it's an aggregation of all measurements from some fixed point in time up to the last reporting snapshot. And I think it's fine to provide a small example as well.

👍

> But I am still curious what aggregations other than a sum this could be used for.

Currently there are not any alternate aggregations (nor do I anticipate there ever being). I have seen monitoring systems in the past measure cumulative averages and rates, both of which are pretty useless IMO.
The thing I don't want to do is imply that there is a predefined aggregation that should be assumed if the `Temporality` is cumulative. That would incorrectly couple the concepts.
Updated, PTAL.
Co-Authored-By: Yuri Shkuro <[email protected]>
// The percentile of a distribution. Must be in the interval
// [0.0, 100.0].
double percentile = 1;
// The value at the given percentile of a distribution.
double value = 2;
}
repeated Percentile percentiles = 7;
Should this really be a first-class citizen? Since percentiles cannot be aggregated, supporting it directly sounds like encouraging bad practices.
I agree. It is influenced by the previous proto definition that had a `Summary` type to support this, and by both Prometheus' `Summary` and part of Stackdriver's `Distribution`.
I wholeheartedly agree these lead users to make nonsense aggregations, but I was keeping it around to avoid the opposite question of why they were removed.
If there is a consensus for not including these, I'm 100% onboard with dropping them.
To add to this, in my experience I have seen limited use of Prometheus `Summary`s for this exact reason. It is much more flexible to send the distribution data as a histogram and calculate a close approximation of an arbitrary percentile when needed (not when measured).
I think if you wanted the optimization of sending percentiles, you could send them as Int64 or Double data on your own.
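As a sketch of the "calculate when needed" approach mentioned above, a percentile can be estimated from explicit bucket boundaries and counts by linear interpolation. The bucket layout below is made up for illustration, and a real backend would treat the unbounded last bucket more carefully:

```go
package main

import "fmt"

// percentileFromHistogram estimates the value at percentile p (0-100) from
// explicit bucket upper bounds and per-bucket counts (counts has one more
// entry than bounds; the final bucket is unbounded). Values are assumed to
// be spread uniformly within each bucket, and the first bucket starts at 0.
func percentileFromHistogram(bounds []float64, counts []uint64, p float64) float64 {
	var total uint64
	for _, c := range counts {
		total += c
	}
	target := p / 100 * float64(total)

	var seen uint64
	lower := 0.0
	for i, c := range counts {
		// The unbounded last bucket has no upper bound; cap it at the
		// largest finite boundary so the interpolation stays defined.
		upper := bounds[len(bounds)-1]
		if i < len(bounds) {
			upper = bounds[i]
		}
		if c > 0 && float64(seen+c) >= target {
			frac := (target - float64(seen)) / float64(c)
			return lower + frac*(upper-lower)
		}
		seen += c
		lower = upper
	}
	return lower
}

func main() {
	bounds := []float64{10, 25, 50, 100} // bucket upper bounds; last bucket is (100, +Inf)
	counts := []uint64{5, 10, 30, 50, 5} // one more count than bounds
	fmt.Println(percentileFromHistogram(bounds, counts, 90)) // ~95
}
```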
Co-Authored-By: Yuri Shkuro <[email protected]>
… into metric-proto
Remove oneof and add Type field to the MetricDescriptor to specify what values to expect in the metric.data.
repeated SummaryDataPoint summary_data_points = 5;
repeated Data data = 2;
message Data {
Having one message Data with multiple value types is less efficient than having multiple data point types, because the generated class/struct in languages like Go, Java, and C++ will be larger (it will include all value types). So I suggest we stick with the previous definition of data points and deal with the duplicated definition of labels and start/end timestamps.
Please update the diagram in the comment preceding the Metric to reflect the new reality. I think the diagram is no longer correct.
Worked with @bogdandrutu on a plan forward by breaking up the changes included here into smaller PRs. I'm going to close this to avoid confusion. If there are further general questions/suggestions about the approach feel free to continue commenting here. Thank you everyone who provided feedback, it was greatly appreciated.