
Calling otelgrpc.UnaryClientInterceptor repeatedly creates multiple Int64Histogram resulting in a memory leak #4226

Closed
strideynet opened this issue Aug 21, 2023 · 9 comments · Fixed by open-telemetry/opentelemetry-go#4820
Labels: area: instrumentation (Related to an instrumentation package), bug (Something isn't working), instrumentation: otelgrpc

@strideynet

Description

Apologies, I'm not overly familiar with the metrics side of OpenTelemetry, and I'm slightly unsure whether the bug lies within this repository or in the otel API/SDK for Go itself.

We have a use case where we create and destroy gRPC clients throughout the long lifetime of our application. When we create new gRPC clients, we provide otelgrpc.UnaryClientInterceptor() as an interceptor. We've noticed slow memory leaks in our application, and profiling reveals that a large amount of the heap is allocated by go.opentelemetry.io/otel/internal/global.(*meter).Int64Histogram.

Looking into the implementation, it appears that whilst repeatedly calling MeterProvider.Meter with the same name returns the same Meter, repeatedly calling Meter.Int64Histogram with the same name creates a new Int64Histogram each time. All of these instruments are tracked internally in the instruments slice of global.meter, and this is where the leaked memory sits.

Our interim solution has been to use a sync.OnceValue to create the interceptors only once during the lifetime of the application and to share them across the clients. It would be helpful either to document that only a single set of interceptors should be created for the lifetime of the application, or to have the underlying issue resolved.
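
A minimal sketch of that workaround, assuming Go 1.21+ (for sync.OnceValue) and the otelgrpc v0.42.0 API; the package and function names here are illustrative:

package grpcclient

import (
	"sync"

	"go.opentelemetry.io/contrib/instrumentation/google.golang.org/grpc/otelgrpc"
	"google.golang.org/grpc"
)

// unaryClientInterceptor is created once per process so the underlying
// Int64Histogram instruments are only allocated a single time.
var unaryClientInterceptor = sync.OnceValue(func() grpc.UnaryClientInterceptor {
	return otelgrpc.UnaryClientInterceptor()
})

// newClient shares the single interceptor across every client it creates.
func newClient(target string) (*grpc.ClientConn, error) {
	return grpc.Dial(target, grpc.WithUnaryInterceptor(unaryClientInterceptor()))
}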

[image: pprof heap profile]

Environment

  • OS: MacOS/Linux
  • Architecture: x86, arm64
  • Go Version: 1.21
  • otelgrpc version: v0.42.0

Steps To Reproduce

  1. Create and close a large number of gRPC clients with the UnaryClientInterceptor().
  2. Use pprof to confirm that heap is still in use for the Int64Histogram instruments created by UnaryClientInterceptor(), as sketched below.
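
For illustration, a minimal reproduction sketch (assuming otelgrpc v0.42.0, where UnaryClientInterceptor is available; the target address and iteration count are placeholders):

package main

import (
	"go.opentelemetry.io/contrib/instrumentation/google.golang.org/grpc/otelgrpc"
	"google.golang.org/grpc"
	"google.golang.org/grpc/credentials/insecure"
)

func main() {
	for i := 0; i < 100000; i++ {
		// Each call to UnaryClientInterceptor() registers a fresh set of
		// instruments that the global meter keeps referencing.
		conn, err := grpc.Dial("localhost:50051",
			grpc.WithTransportCredentials(insecure.NewCredentials()),
			grpc.WithUnaryInterceptor(otelgrpc.UnaryClientInterceptor()))
		if err != nil {
			panic(err)
		}
		conn.Close()
	}
	// Capture a heap profile here (e.g. via runtime/pprof.WriteHeapProfile)
	// and inspect the retained Int64Histogram allocations.
}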

Expected behavior

The Int64Histogram instruments should be reused, or the documentation should warn against calling UnaryClientInterceptor() repeatedly.

@johnrutherford

I encountered the same issue using the otelhttp middleware with gorilla/mux.

When using the otelhttp middleware on the router (example below), middleware.createMeasures() ends up getting called on every request.

router := mux.NewRouter()
router.Use(otelhttp.NewMiddleware(""))

This results in the same memory leak.

[image: pprof screenshot]

I've worked around the issue for now by using a no-op MeterProvider.
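
A sketch of that workaround, assuming otelhttp's WithMeterProvider option together with the metric API's noop package (the operation name and server wiring are illustrative):

package main

import (
	"log"
	"net/http"

	"github.com/gorilla/mux"
	"go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp"
	"go.opentelemetry.io/otel/metric/noop"
)

func main() {
	router := mux.NewRouter()
	// The no-op MeterProvider means the middleware never creates real
	// metric instruments, so nothing accumulates per request.
	router.Use(otelhttp.NewMiddleware("server",
		otelhttp.WithMeterProvider(noop.NewMeterProvider())))
	log.Fatal(http.ListenAndServe(":8080", router))
}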

@codyoss

codyoss commented Dec 27, 2023

If only one middleware/Handler should be created for the lifetime of the application, it feels like the sync.Once should be done within this library.

@dashpole
Contributor

dashpole commented Jan 5, 2024

I'll take a look at this.

@dashpole dashpole self-assigned this Jan 5, 2024
@quartzmo

See external fix in googleapis/google-api-go-client#2329

@dashpole
Contributor

I don't think we can do sync.Once, since the interceptors/stats handlers/etc. can be created with different MeterProviders, and we would want one instrument for each MeterProvider. I think this needs to be solved within the SDK itself.
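
As a hypothetical illustration of that constraint (not the actual fix, which landed in the SDK): any caching inside this library would need to be keyed per MeterProvider rather than guarded by a single package-level sync.Once, e.g.:

package cacheexample

import (
	"sync"

	"go.opentelemetry.io/otel/metric"
)

var (
	mu         sync.Mutex
	histograms = map[metric.MeterProvider]metric.Int64Histogram{}
)

// histogramFor returns one Int64Histogram per MeterProvider, creating it on
// first use. Assumes the concrete provider type is comparable (e.g. a
// pointer), which holds for the SDK's MeterProvider.
func histogramFor(mp metric.MeterProvider) (metric.Int64Histogram, error) {
	mu.Lock()
	defer mu.Unlock()
	if h, ok := histograms[mp]; ok {
		return h, nil
	}
	h, err := mp.Meter("example").Int64Histogram("rpc.client.duration")
	if err != nil {
		return nil, err
	}
	histograms[mp] = h
	return h, nil
}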

My current attempt to fix it: open-telemetry/opentelemetry-go#4820. Feedback is welcome.

@dashpole
Contributor

dashpole commented Jan 24, 2024

This will be fixed in the next otel-go release, since open-telemetry/opentelemetry-go#4820 merged.

@strideynet
Author

When moving from 0.47.0 to 0.49.0, we now see an extremely similar leak with otelhttp - this seems to have been caused by #4707:

[image: pprof heap profile]

We still have our patch in place for otelgrpc, so I'm unable to comment on whether open-telemetry/opentelemetry-go#4820 resolved this.

	go.opentelemetry.io/contrib/instrumentation/google.golang.org/grpc/otelgrpc v0.49.0
	go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp v0.49.0
	go.opentelemetry.io/otel v1.24.0
	go.opentelemetry.io/otel/exporters/otlp/otlptrace v1.24.0
	go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracegrpc v1.24.0
	go.opentelemetry.io/otel/sdk v1.24.0
	go.opentelemetry.io/otel/trace v1.24.0
	go.opentelemetry.io/proto/otlp v1.1.0

We're going to move ahead with the same workaround we used for otelgrpc.
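
For completeness, a sketch of that workaround applied to otelhttp (the same sync.OnceValue approach as for otelgrpc; names are illustrative):

package httpserver

import (
	"net/http"
	"sync"

	"go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp"
)

// otelMiddleware is constructed once so its metric instruments are only
// created a single time, no matter how many routers or handlers use it.
var otelMiddleware = sync.OnceValue(func() func(http.Handler) http.Handler {
	return otelhttp.NewMiddleware("server")
})

// wrap applies the shared middleware to a handler.
func wrap(h http.Handler) http.Handler {
	return otelMiddleware()(h)
}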

@strideynet
Author

I can't seem to reopen this issue - should I open a new one?

@dmathieu
Member

dmathieu commented Mar 1, 2024

Yes, please, since it's not the same component.
