Perf bottleneck when Loggers upgrade weak ref to LoggerProvider #1209
Comments
I think for |
@shaun-cox this seems like it might be of interest to you. |
Yes!! I did consult (offline) @shaun-cox and @lalitb a lot to narrow down the bottleneck to this point and also for the potential fix! |
What is the implication of that? i.e What if Also, is that something which users can resolve by ensuring they call shutdown themselves at the app-end? |
In the current design, the shutdown_tracer_provider method removes the reference to the currently active TracerProvider from the global singleton, which in turn causes the destructor/drop to be called for the removed TracerProvider instance, which then shuts down all the processors. With the change to
In otel-cpp, TracerProvider maintains the list of all created Tracers, so it also controls their lifetime. But I think this is purely a design choice. In Rust, we can let the TracerProvider instance remain active in a shutdown state. |
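To make the "remain active in a shutdown state" idea concrete, here is a minimal sketch. All type, field, and method names below (ProviderInner, is_shutdown, start_span) are hypothetical stand-ins for illustration, not the actual opentelemetry-rust types. It assumes the provider carries an atomic flag that an explicit shutdown flips, so Tracers holding a strong reference keep the provider alive but become no-ops.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;

/// Hypothetical shared provider state (not the real SDK type).
struct ProviderInner {
    is_shutdown: AtomicBool,
}

#[derive(Clone)]
struct TracerProvider {
    inner: Arc<ProviderInner>,
}

/// A Tracer holds a strong reference, so it keeps the provider alive.
struct Tracer {
    provider: TracerProvider,
}

impl TracerProvider {
    fn new() -> Self {
        TracerProvider {
            inner: Arc::new(ProviderInner {
                is_shutdown: AtomicBool::new(false),
            }),
        }
    }

    fn tracer(&self) -> Tracer {
        Tracer {
            provider: self.clone(),
        }
    }

    /// Explicit shutdown that does not rely on Drop.
    /// Returns true only for the first call; later calls are no-ops.
    fn shutdown(&self) -> bool {
        // `swap` returns the previous value of the flag.
        !self.inner.is_shutdown.swap(true, Ordering::SeqCst)
    }
}

impl Tracer {
    /// Returns true if a span would be recorded, false once the
    /// provider has been shut down.
    fn start_span(&self) -> bool {
        !self.provider.inner.is_shutdown.load(Ordering::Relaxed)
    }
}
```

With this shape, a Tracer created before shutdown can still be called afterwards, even after the original provider handle is dropped; it simply stops producing spans.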
I can take a look at this. |
After talking to @cijothomas and doing some of my own digging, it looks like #229 was the PR which introduced using I don't see why we cannot use the |
Thanks! Could you open a PR with the proposed changes for logs first, and then we can extend it to Traces. The gains would be more relevant for Spans as this bottleneck is faced twice - start, end of span. |
I am not sure we should be upgrading the |
The interface to an item shouldn't take an inner value since it's considered inner, this also allows for further optimizations in the future as it hides the complexity from the user. Relates open-telemetry#1209
The interface to an item shouldn't take an inner value since it's considered inner, this also allows for further optimizations in the future as it hides the complexity from the user. Rationale: This removes exposing the inner which doesn't need to be provided outside of the class. The advantage of this approach is that it's a cleaner implementation. This also removes a weak reference upgrade from the hot path since we need to have a strong reference in order to access the information. Relates open-telemetry#1209
Closing this issue for Logs. Will open a separate one for doing similar change for Traces. |
Noticed while stress testing Logs, but this should apply to Tracing as well, since both follow the same pattern: the Logger (Tracer) holds a Weak ref to the LoggerProvider (TracerProvider), and in the hot path (logger.EmitLog, span.Record) the weak reference to the provider is upgraded to an Arc to obtain things like the processor list, resource, etc. from the provider. This Weak -> Arc upgrade seems to be the bottleneck.
Here's how I tested:
false, which should make the throughput skyrocket, as we don't have to do anything like creating a LogRecord or invoking processors. This did increase the throughput to ~35M/sec, but I was expecting 200+M/sec, as I get similar throughput when using tracing+no-op-tracing-subscriber, without any OpenTelemetry component.
The weak ref to the provider was introduced for Tracing (and then followed for Logging) in this PR. Using Arc does mean that the Provider won't be dropped, and shutdown won't be signaled, until the last Logger is dropped, but that seems okay to me. In the case of Logging, Loggers are held only by the appenders (tokio-tracing subscribers etc.), which, when dropped, will drop their Logger as well, allowing the provider to be dropped...
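The lifetime consequence of switching to Arc can be sketched as follows. The names here (ProviderInner, shutdown_calls) are hypothetical and only model the drop ordering; the Drop impl stands in for "shutdown is signaled", counting invocations through a shared counter so the effect is observable.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;

/// Hypothetical provider state. Dropping it stands in for signaling
/// shutdown to the processors.
struct ProviderInner {
    shutdown_calls: Arc<AtomicUsize>,
}

impl Drop for ProviderInner {
    fn drop(&mut self) {
        // Runs only when the last strong reference goes away.
        self.shutdown_calls.fetch_add(1, Ordering::SeqCst);
    }
}

struct LoggerProvider {
    inner: Arc<ProviderInner>,
}

/// With Arc (instead of Weak), every Logger keeps the provider state alive.
struct Logger {
    provider: Arc<ProviderInner>,
}

impl LoggerProvider {
    fn new(shutdown_calls: Arc<AtomicUsize>) -> Self {
        LoggerProvider {
            inner: Arc::new(ProviderInner { shutdown_calls }),
        }
    }

    fn logger(&self) -> Logger {
        Logger {
            provider: Arc::clone(&self.inner),
        }
    }
}

impl Logger {
    /// Number of strong references currently keeping the provider alive.
    fn provider_ref_count(&self) -> usize {
        Arc::strong_count(&self.provider)
    }
}
```

In this model, dropping the LoggerProvider handle while a Logger is still alive does not run the Drop impl; it runs only once the last Logger is dropped as well, which matches the appender scenario described above.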
I do not know if this has any further implications, especially for Tracer/Span case.
The (perf) issue applies to Span as well, as this upgrade occurs (I think) twice: when the span begins and when the span ends...
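For readers less familiar with the pattern, the two hot-path shapes being compared look roughly like this. This is a simplified sketch with hypothetical names (WeakLogger, StrongLogger, processor_count), not the SDK code: the Weak variant must call upgrade() on every emit, which atomically increments the shared strong count and then decrements it again when the temporary Arc is dropped, while the Arc variant only dereferences.

```rust
use std::sync::{Arc, Weak};

/// Hypothetical stand-in for what the provider exposes on the hot path
/// (processor list, resource, etc.).
struct ProviderInner {
    processor_count: usize,
}

/// Existing pattern: the Logger holds a Weak ref and upgrades per call.
struct WeakLogger {
    provider: Weak<ProviderInner>,
}

/// Proposed pattern: the Logger holds a strong ref.
struct StrongLogger {
    provider: Arc<ProviderInner>,
}

impl WeakLogger {
    fn emit(&self) -> Option<usize> {
        // Atomic read-modify-write on the shared strong count on every
        // call, plus a decrement when `provider` goes out of scope.
        let provider = self.provider.upgrade()?;
        Some(provider.processor_count)
    }
}

impl StrongLogger {
    fn emit(&self) -> usize {
        // Plain dereference; the reference count is not touched.
        self.provider.processor_count
    }
}
```

Under many threads, the per-call increments and decrements in the Weak variant all hit the same counter, which is one plausible reason the upgrade shows up as a bottleneck; a span that upgrades at both start and end would pay that cost twice.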
Opening this issue to get feedback on this from experts and to check if this draft code is a reasonable direction to further explore.