This is a PR for open-telemetry/opentelemetry-specification#982. Main changes
Terminology note: For exponential histograms, "base" refers to both the logarithm base and the exponent base, consistent with standard mathematical terminology. "Reference" refers to the multiplier on the exponential scale, consistent with common usage in log-scale units such as the decibel.
Compared to the custom protocol for DDSketch (https://github.com/DataDog/sketches-java/blob/master/src/main/proto/DDSketch.proto), a DDSketch created using the logarithm method can be represented as reference=1, base=gamma, index_offset=contiguousBinIndexOffset. A DDSketch created using a log approximation method, such as the quadratic or cubic method, has to use the explicit bound encoding.
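To make the mapping concrete, here is a minimal Python sketch of how a consumer might recover bucket bounds from the three parameters. It assumes the bound of the i-th encoded bucket is reference * base ** (index_offset + i); that formula, and the function names `bucket_bound` and `params_from_ddsketch`, are illustrative assumptions rather than text from the proposed proto.

```python
def bucket_bound(reference: float, base: float, index_offset: int, i: int) -> float:
    """Bound of the i-th encoded bucket.

    Assumption (not taken from the proto): bounds lie on the exponential
    scale reference * base ** (index_offset + i).
    """
    return reference * base ** (index_offset + i)


def params_from_ddsketch(gamma: float, contiguous_bin_index_offset: int) -> dict:
    """Map a logarithm-method DDSketch onto (reference, base, index_offset).

    Follows the correspondence stated above: reference=1, base=gamma,
    index_offset=contiguousBinIndexOffset.
    """
    return {
        "reference": 1.0,
        "base": gamma,
        "index_offset": contiguous_bin_index_offset,
    }


if __name__ == "__main__":
    # Hypothetical values chosen only for illustration.
    params = params_from_ddsketch(gamma=1.02, contiguous_bin_index_offset=-5)
    for i in range(8):
        print(i, bucket_bound(i=i, **params))
```

With reference=1 the scale passes through 1 exactly where index_offset + i is 0; a different reference simply multiplies every bound by the same factor.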
Encoding the quadratic or cubic methods would be simple: we could just add an "approximation method" field. But this would require all backend consumers of this protocol to properly decode bucket bounds encoded this way. While linear subbuckets are easy to understand and implement, quadratic and cubic methods are not. There is no document giving a mathematical description of the exact formula used. In fact, there are many quadratic and cubic approximation methods for log, and none could be considered "canonical". Simply saying "quadratic" or "cubic" does not tell the backend how to process the data. Thus I hesitate to include such methods in the standard.
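For contrast, the linear subbucket case really is easy for a backend to decode. The sketch below splits one exponential bucket into equal-width subbuckets; the function name `linear_subbucket_bounds` and its parameters are hypothetical and only illustrate the idea, not the exact encoding in this PR.

```python
from typing import List


def linear_subbucket_bounds(lower: float, upper: float, n: int) -> List[float]:
    """Split the bucket [lower, upper) into n equal-width subbuckets.

    Returns the n + 1 bounds, starting at `lower` and ending at `upper`.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    width = (upper - lower) / n
    # The last bound is appended explicitly so it equals `upper` exactly.
    return [lower + k * width for k in range(n)] + [upper]


if __name__ == "__main__":
    # One base-2 bucket [4, 8) divided into 4 linear subbuckets.
    print(linear_subbucket_bounds(4.0, 8.0, 4))
```

Any consumer can reproduce these bounds from the bucket edges and the subbucket count alone, which is not true of an unspecified "quadratic" or "cubic" approximation.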
The proposed protocol gives users two options for efficient exponential histogram encoding:
I consider this a good balance between choice and complexity in a standard.