
[Question] Prometheus Gauge in OpenTelemetry? #1139

Closed
otherview opened this issue Sep 8, 2020 · 4 comments · Fixed by #1165
Labels: area:metrics, enhancement, help wanted, pkg:SDK

Comments

otherview commented Sep 8, 2020

Hi!

Similar to #708, I'm trying to model a Gauge using otel-go. I'm finding it hard to wrap my head around it; can someone shed some light on this?

The results seem to be cumulative either way I do it 🤔

Using prometheus client:

func recordMetrics() {
	go func() {
		for {
			rtt.Set(rand.Float64() * 100)
			time.Sleep(2 * time.Second)
		}
	}()
}

var (
	rtt = promauto.NewGauge(prometheus.GaugeOpts{
		Name: "RTT_things",
	})
)

func main() {
	recordMetrics()
	http.Handle("/metrics", promhttp.Handler())
	http.ListenAndServe(":2222", nil)
}

Using otel-go:

func initMeter() {

	exporter, err := prometheus.InstallNewPipeline(prometheus.Config{
		DefaultHistogramBoundaries: []float64{1, 100, 250, 500},
	})
	if err != nil {
		log.Panicf("failed to initialize prometheus exporter %v", err)
	}
	http.HandleFunc("/metrics", exporter.ServeHTTP)
	go func() {
		_ = http.ListenAndServe(":2222", nil)
	}()

	fmt.Println("Prometheus server running on :2222")
}

func main() {
	initMeter()

	meter := global.Meter("basicTest")
	ctx := context.Background()

	// RTT
	rttValue := metric.Must(meter).NewInt64UpDownCounter("rtt_meter")

	for {
		randVal := rand.Int63n(100)
		fmt.Println("rtt-> ", randVal)

		// either this
		// rttValue.Add(ctx, randVal)

		// or this
		meter.RecordBatch(ctx,
			[]label.KeyValue{},
			rttValue.Measurement(randVal),
		)

		time.Sleep(10 * time.Second)
	}
}

Rendering:

Prometheus server running on :2222
rtt->  10
rtt->  21
rtt->  37
----------
# HELP rtt_meter
# TYPE rtt_meter counter
rtt_meter 10
----------
# HELP rtt_meter
# TYPE rtt_meter counter
rtt_meter 31
----------
# HELP rtt_meter
# TYPE rtt_meter counter
rtt_meter 68

Thanks! 😃

jmacd (Contributor) commented Sep 10, 2020

Ah, now that I see your example, the ValueObserver instrument (asynchronous) is well suited to your needs. Instead of defining a loop that periodically Sets a gauge, you define a ValueObserver callback that is called once per collection interval and outputs a classic Gauge in both the OTLP and Prometheus exporters.

You'll have fewer goroutines and no arbitrary sleep statements that way. See open-telemetry/opentelemetry-specification#834
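
For reference, here is a minimal sketch of that pattern, using the same metric API as the snippets in this thread; measureRTT() is a hypothetical helper that returns the current round-trip time:

meter := global.Meter("basicTest")

// The callback is invoked once per collection interval, so no loop,
// extra goroutine, or Sleep is needed.
_ = metric.Must(meter).NewInt64ValueObserver(
	"rtt_meter",
	func(_ context.Context, result metric.Int64ObserverResult) {
		result.Observe(measureRTT()) // report the latest value each time collection runs
	},
)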

otherview (Author) commented Sep 10, 2020

Thanks for the reply @jmacd! ☺️

I'm still not able to achieve it, though...
This is returning a cumulative histogram, whereas I just want it to spit out the value that is "set".

I feel like my approach to this is wrong 🤔. Can you take a look?

Given:

func main() {
	initMeter()

	meter := global.Meter("basicTest")
	ctx := context.Background()

	observerLock := new(sync.RWMutex)
	observerValueToReport := int64(0)

	// Latency
	cb := func(_ context.Context, result metric.Int64ObserverResult) {
		observerLock.RLock()
		value := observerValueToReport
		observerLock.RUnlock()
		result.Observe(value)
	}

	_ = metric.Must(meter).NewInt64ValueObserver("latency_meter", cb)

	for {
		randVal := rand.Int63n(100)
		fmt.Println("rtt-> ", randVal)

		observerLock.Lock() // write lock, not RLock: this updates the value the callback reads
		observerValueToReport = randVal
		observerLock.Unlock()

		time.Sleep(10 * time.Second)
	}
}

This renders:

Prometheus server running on :2222
rtt->  10
rtt->  51
rtt->  21

--------
# HELP latency_meter
# TYPE latency_meter histogram
latency_meter_bucket{le="1"} 0
latency_meter_bucket{le="100"} 1
latency_meter_bucket{le="250"} 1
latency_meter_bucket{le="500"} 1
latency_meter_bucket{le="+Inf"} 1
latency_meter_sum 10
latency_meter_count 1

--------
# HELP latency_meter
# TYPE latency_meter histogram
latency_meter_bucket{le="1"} 0
latency_meter_bucket{le="100"} 3
latency_meter_bucket{le="250"} 3
latency_meter_bucket{le="500"} 3
latency_meter_bucket{le="+Inf"} 3
latency_meter_sum 71       <- 10 + 10 + 51 (the first observed value, 10, was collected twice; the second, 51, once)
latency_meter_count 3

nilebox (Member) commented Sep 10, 2020

This is probably because the Prometheus exporter is using the simple selector:

simple.NewWithHistogramDistribution(config.DefaultHistogramBoundaries),

which uses a histogram distribution for ValueObserver:

case metric.ValueObserverKind, metric.ValueRecorderKind:
aggs := histogram.New(len(aggPtrs), descriptor, s.boundaries)

The workaround for now is to implement a custom selector that uses "LastValue" aggregation for ValueObserver; this is what we did in Cloud Monitoring.
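
For illustration, a rough sketch of such a selector; the import paths and method names below are assumptions based on the SDK of that era, and it simply delegates to the histogram selector quoted above while overriding ValueObserver with a last-value aggregator:

import (
	"go.opentelemetry.io/otel/api/metric"
	export "go.opentelemetry.io/otel/sdk/export/metric"
	"go.opentelemetry.io/otel/sdk/metric/aggregator/lastvalue"
	"go.opentelemetry.io/otel/sdk/metric/selector/simple"
)

// gaugeObserverSelector wraps the default histogram-based selector and
// switches ValueObserver instruments to LastValue aggregation.
type gaugeObserverSelector struct {
	export.AggregatorSelector
}

func NewGaugeObserverSelector(boundaries []float64) export.AggregatorSelector {
	return gaugeObserverSelector{simple.NewWithHistogramDistribution(boundaries)}
}

func (s gaugeObserverSelector) AggregatorFor(descriptor *metric.Descriptor, aggPtrs ...*export.Aggregator) {
	if descriptor.MetricKind() == metric.ValueObserverKind {
		// LastValue keeps only the most recent observation, which can be
		// exported as a gauge instead of a histogram.
		aggs := lastvalue.New(len(aggPtrs))
		for i := range aggPtrs {
			*aggPtrs[i] = &aggs[i]
		}
		return
	}
	// Everything else keeps the default behaviour.
	s.AggregatorSelector.AggregatorFor(descriptor, aggPtrs...)
}

Note that the Prometheus exporter itself still needs a way to accept a custom selector, which is what the discussion below is about.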

MrAlias (Contributor) commented Sep 10, 2020

Talking about this in the SIG meeting today, the proposal put forth was to include in the main project a selector similar to the one @nilebox showed from the Google Cloud Monitoring exporter, and additionally to let the prometheus and cortex exporters be configured with a user-defined selector. That way, even though the default will still be the histogram selector, users will be able to solve this problem in a more standard way.

It was also brought up that this is likely something the Views API is well suited to address. The long-term solution would be to handle this configuration there.
