Skip to content
This repository has been archived by the owner on Oct 3, 2023. It is now read-only.

Stop including artificial 0 bound when existing lowest bound is negative #262

Merged
merged 1 commit into from
Jun 9, 2020
Merged
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 1 addition & 1 deletion stats.go
Original file line number Diff line number Diff line change
Expand Up @@ -536,7 +536,7 @@ func newTypedValue(vd *view.View, r *view.Row) *monitoringpb.TypedValue {
}

func shouldInsertZeroBound(bounds ...float64) bool {
if len(bounds) > 0 && bounds[0] != 0.0 {
if len(bounds) > 0 && bounds[0] > 0.0 {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I digged a bit into the history behind this condition, and it seems like in OpenCensus the negative bounds were explicitly forbidden, see census-instrumentation/opencensus-go#963

You supposedly have encountered this issue in the OpenTelemetry Collector pipeline? Are negative bounds allowed there?

This specific condition was discussed in the original PR, see #89 (comment)

@rghetia @mayurkale22 do you happen to know more context on this, and possible difference between OpenCensus vs OpenTelemetry in that regard?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I came across this when trying to export Prometheus metrics (using OpenTelemetry Prometheus receiver) to Stackdriver (using Stackdriver exporter). The error I got came from Stackdriver Monitoring API, essentially saying that bounds[0] > bounds[1]. So that distribution came all the way through OpenTelemetry just fine:

rpc error: code = InvalidArgument desc = Field timeSeries[35].points[0].distributionValue had an invalid value: Distribution |explicit_buckets.bounds| entry 1 has a value of -4.94065645841247e-324 which is less than the value of entry 0 which is 0.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting. @mayurkale22 claimed in census-instrumentation/opencensus-go#963 that both Prometheus and Stackdriver don't support negative bounds... Has that changed?

Either way, this seems like a harmless change.

If it turns out that Stackdriver indeed doesn't support negative values, we should explicitly validate the buckets in the exporter rather than relying on the OpenCensus / OpenTelemetry code being correct, i.e. fail earlier before invoking this function.

Copy link
Contributor Author

@x13n x13n Jun 5, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So from what I have seen I'm pretty sure Prometheus client libs have no problem exposing negative bounds. I'm not actually sure about Stackdriver, it is possible that this change will only turn one error into another. I'll check that.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The docs only seem to require that bucket bounds are increasing, nothing about forbidding negative buckets: https://cloud.google.com/monitoring/api/ref_v3/rest/v3/TypedValue#explicit

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually opentelemetry-proto also doesn't support negative bounds.
/cc @bogdandrutu

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting. So, next culprit: Prometheus receiver. I saw negative bounds on Prometheus endpoint and that was ingested into OpenTelemetry Collector via Prometheus receiver just fine. Should the negative bound be dropped there instead?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the -ve bound is a legitimate use case then we have to address that in all relevant libraries. If it is not then it is okay to drop it OR with an optional flag allow -ve bounds.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the right thing to do is to allow negative bounds, they are not really useful for things like request latencies, but may have other use cases (e.g. Prometheus docs give temperatures as an example).

I just created open-telemetry/opentelemetry-proto#156 to extend the histogram support to negative bucket bounds.

Can this change be merged though? If OT won't end up extending histogram support to cover negative values, this change should be a no-op. If it does, this change will make that compatible with the updated model.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am fine with merging this change. However it will still be incorrect in the following case.
"If the distribution has lowest bound >0 AND if it includes -ve values then this change distorts the distribution by adding additional '0' bound."
May be add this to the comment and link to the above issue.

return true
}
return false
Expand Down