Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

problems with generated exponential histogram #3767

Open
loomis-relativity opened this issue Mar 8, 2024 · 5 comments
Open

problems with generated exponential histogram #3767

loomis-relativity opened this issue Mar 8, 2024 · 5 comments
Labels
bug Something isn't working

Comments

@loomis-relativity
Copy link

loomis-relativity commented Mar 8, 2024

Describe your environment

Running with:

  • MacBook Pro Intel
  • MacOS 14.3.1 (intel)
  • Python 3.11.7 (homebrew)

Requirements:

opentelemetry-api>=1.23.0
opentelemetry-sdk>=1.23.0
opentelemetry-proto>=1.23.0
opentelemetry-exporter-otlp>=1.23.0

Steps to reproduce

Run main.py (below) and capture the dump of the exponential histogram.
Source code:

from opentelemetry import metrics
from opentelemetry.exporter.otlp.proto.http.metric_exporter import OTLPMetricExporter
from opentelemetry.sdk.metrics import MeterProvider
from opentelemetry.sdk.metrics.export import PeriodicExportingMetricReader

def main():

    exporter = OTLPMetricExporter()
    meter_provider = MeterProvider([PeriodicExportingMetricReader(exporter)])
    meter = metrics.get_meter(
        name="test-histogram",
        meter_provider=meter_provider,
    )
    histo = meter.create_histogram("test-histogram")
    for i in range(1000):
        for _ in range(i):
            histo.record(i, {"type": "test-histogram"})

    meter_provider.shutdown()

if __name__ == "__main__":
    main()

This is run with the following script:

export OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4318
export OTEL_EXPORTER_OTLP_METRICS_TEMPORALITY_PREFERENCE=DELTA
export OTEL_EXPORTER_OTLP_METRICS_DEFAULT_HISTOGRAM_AGGREGATION=base2_exponential_bucket_histogram

python3 ./main.py

What is the expected behavior?

I expected to see a dump like that generated for the JavaScript SDK in
dump-js.txt.

What is the actual behavior?

The generated exponential histogram in dump-python.txt seems to have the following problems:

  1. There is a negative bucket, even though no negative values are added to the histogram. (This actually causes a downstream vendor to reject the histogram.)
  2. All 160 buckets are provided even though the last 79 are all empty. Other SDKs (e.g. Javascript, dump-js.txt) trim unnecessary buckets, reducing the size of the payload.

Additional context

None.

@loomis-relativity loomis-relativity added the bug Something isn't working label Mar 8, 2024
@loomis-relativity
Copy link
Author

One naive option would be to modify this block of code in the exporter:

                        if data_point.positive.bucket_counts:
                            buckets = nonzero_slice(data_point.positive.bucket_counts)
                            if buckets:
                                positive = pb2.ExponentialHistogramDataPoint.Buckets(
                                    offset=data_point.positive.offset,
                                    bucket_counts=buckets,
                                )
                            else:
                                positive = None
                        else:
                            positive = None

                        if data_point.negative.bucket_counts:
                            buckets = nonzero_slice(data_point.negative.bucket_counts)
                            if buckets:
                                negative = pb2.ExponentialHistogramDataPoint.Buckets(
                                    offset=data_point.negative.offset,
                                    bucket_counts=buckets,
                                )
                            else:
                                negative = None
                        else:
                            negative = None

also adding the function:

def nonzero_slice(lst):
    for i, value in enumerate(reversed(lst)):
        if value != 0:
            return lst[0:len(lst)-i-1]
    return []

This produces a dump file (dump-corrected-python.txt) that has no negative buckets and the positive buckets include only the lowest, non-zero buckets.

There is some overhead in searching for the last non-zero bucket, but it does produce a more compact representation of the payload (which also works with the downstream vendor). If this would be acceptable to the maintainers, I'm happy to code up a PR with this change.

@alexchowle
Copy link

alexchowle commented Nov 25, 2024

I'm also seeing the issue where my downstream vendor is rejecting the negative buckets - confirmed in v1.27.0. It seems to be the mere presence of a negative bucket with no observations in it. I can confirm that there are no negative recordings in my sample:

                                        "start_time_unix_nano": 1732543695206206909,
                                        "time_unix_nano": 1732543695965409860,
                                        "count": 1,
                                        "sum": 1.0023915767669678,
                                        "scale": 20,
                                        "zero_count": 0,
                                        "positive": {
                                            "offset": 3613,
                                            "bucket_counts": [
                                                1
                                            ]
                                        },
                                        "negative": {
                                            "offset": 0,
                                            "bucket_counts": [
                                                0
                                            ]
                                        },
                                        "flags": 0,
                                        "min": 1.0023915767669678,
                                        "max": 1.0023915767669678,
                                        "exemplars": []

@alexchowle
Copy link

FWIW - the same setup but using the Go SDK doesn't cause my vendor to drop the negatives.

@xrmx
Copy link
Contributor

xrmx commented Nov 25, 2024

@alexchowle Any chance you can figure out what the differences are between the python and go sdk exported data?

@alexchowle
Copy link

alexchowle commented Nov 25, 2024

@alexchowle Any chance you can figure out what the differences are between the python and go sdk exported data?

Tricky to do precisely with no JSON-over-HTTP support, but I've captured some data using an OTel Collector and its debugexporter as a proxy. Both create an exponential histogram and add a 1.0 observation to it:

Python

2024-11-25T17:59:01.746Z        info    ResourceMetrics #0
Resource SchemaURL: 
Resource attributes:
     -> service.name: Str(pyotelworkbench)
     -> service.version: Str(0.1.0)
ScopeMetrics #0
ScopeMetrics SchemaURL: 
InstrumentationScope __main__ 
Metric #0
Descriptor:
     -> Name: my_histogram
     -> Description: just a histogram
     -> Unit: s
     -> DataType: ExponentialHistogram
     -> AggregationTemporality: Delta
ExponentialHistogramDataPoints #0
StartTimestamp: 2024-11-25 17:58:51.58437709 +0000 UTC
Timestamp: 2024-11-25 17:59:01.584546571 +0000 UTC
Count: 1
Sum: 1.000000
Min: 1.000000
Max: 1.000000
Bucket [-1.000001, -1.000000), Count: 0
Bucket (0.999999, 1.000000], Count: 1
        {"kind": "exporter", "data_type": "metrics", "name": "debug"

Go

2024-11-25T18:04:34.871Z        info    ResourceMetrics #0
Resource SchemaURL: https://opentelemetry.io/schemas/1.26.0
Resource attributes:
     -> service.name: Str(gotelworkbench)
     -> service.version: Str(0.1.0)
     -> telemetry.sdk.language: Str(go)
     -> telemetry.sdk.name: Str(opentelemetry)
     -> telemetry.sdk.version: Str(1.32.0)
ScopeMetrics #0
ScopeMetrics SchemaURL: 
Descriptor:
     -> Name: my_histogram
     -> Description: just a histogram
     -> Unit: 
     -> DataType: ExponentialHistogram
     -> AggregationTemporality: Delta
ExponentialHistogramDataPoints #0
StartTimestamp: 2024-11-25 18:04:24.774677796 +0000 UTC
Timestamp: 2024-11-25 18:04:34.783361725 +0000 UTC
Count: 1
Sum: 1.000000
Min: 1.000000
Max: 1.000000
Bucket (0.999999, 1.000000], Count: 1
        {"kind": "exporter", "data_type": "metrics", "name": "debug"}

The Python implementation creates a negative bucket, whereas the Go does not.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants