
Any way to reduce performance overhead in DefaultSynchronousMetricStorage? #6601

Closed
wgy035 opened this issue Jul 24, 2024 · 6 comments
Labels: Feature Request (Suggest an idea for this project)

Comments


wgy035 commented Jul 24, 2024

Is your feature request related to a problem? Please describe.
While running a performance test, I discovered through flame graph analysis that DefaultSynchronousMetricStorage.record() introduces extra performance overhead.
[flame graph screenshot]

Is there any way to reduce performance overhead in DefaultSynchronousMetricStorage?
Describe the solution you'd like
1. AdviceAttributesProcessor appears to do nothing more than remove some attributes from the input, yet it is costly. Could the removal step be skipped?
2. CollectionAggregatorHandles uses a ConcurrentHashMap, whose performance is suboptimal in high-concurrency scenarios. Consider a lock-free alternative instead, such as a ring buffer (e.g. the LMAX Disruptor).
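To make the second point concrete, here is a minimal sketch of the per-record hot path being described: each distinct attribute set maps to an aggregator handle looked up in a ConcurrentHashMap on every measurement. The class and method names are hypothetical and do not reproduce the actual SDK internals; attributes are modeled as a plain String key for simplicity.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.DoubleAdder;

// Simplified model of the record() hot path: every measurement pays for a
// ConcurrentHashMap lookup (and possibly an insert) to find its handle.
// Under heavy write contention, this lookup is the cost the issue asks about.
public class HandleLookupSketch {
    private final Map<String, DoubleAdder> handles = new ConcurrentHashMap<>();

    public void record(String attributes, double value) {
        handles.computeIfAbsent(attributes, k -> new DoubleAdder()).add(value);
    }

    public double sum(String attributes) {
        DoubleAdder adder = handles.get(attributes);
        return adder == null ? 0.0 : adder.sum();
    }
}
```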


@wgy035 wgy035 added the Feature Request Suggest an idea for this project label Jul 24, 2024
jack-berg (Member) commented Jul 24, 2024

The AdviceAttributesProcessor is involved when a measurement is recorded to an instrument that uses the (experimental) setAttributesAdvice(List<AttributeKey<?>> attributes) API (e.g. ExtendedDoubleHistogramBuilder#setAttributesAdvice).

If setAttributesAdvice is not set, there is no AdviceAttributesProcessor and that overhead disappears. I wrote a detailed blog post about the otel java metrics implementation; suffice it to say that a lot of performance engineering has already been done and there isn't much room for improvement.

So if setAttributesAdvice is causing overhead, why is it being called? It looks like you're using the otel java agent. The setAttributesAdvice API serves a key use case for the agent: it allows users to opt in to additional attributes beyond the default set that is recorded. For example, the http.server.request.duration metric defines a set of safe, low-cardinality attributes to include on every measurement, but some users want to opt in to additional higher-cardinality attributes which are normally reserved for http spans. The otel java agent instrumentation records measurements with all the attributes (both high and low cardinality) and uses the setAttributesAdvice API to instruct the otel metrics SDK that only a subset of them should be retained by default. Users can override this by configuring views to select additional attributes.

But the consequence is that the otel metrics SDK has to do additional work for every measurement. Measurements come in with all attributes, and the SDK needs to trim that attribute set down to the subset retained for aggregation. This costs CPU and memory. There may be ways to optimize AdviceAttributesProcessor, but I suspect they'll be minor improvements.
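The trimming described above can be illustrated with a short sketch (this is not the SDK's actual AdviceAttributesProcessor code; attributes are modeled as a plain String map and the key names are examples): measurements arrive with all attributes, and only the advised subset is retained for aggregation, which implies an allocation and a copy on every measurement.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// Illustrative sketch of advice-driven attribute trimming: keep only the
// keys named by the advice, dropping higher-cardinality extras.
public class AdviceTrimSketch {
    public static Map<String, String> trim(Map<String, String> recorded,
                                           Set<String> advisedKeys) {
        Map<String, String> retained = new HashMap<>();
        for (Map.Entry<String, String> e : recorded.entrySet()) {
            if (advisedKeys.contains(e.getKey())) {
                retained.put(e.getKey(), e.getValue());
            }
        }
        return retained; // this allocation + copy runs for every measurement
    }
}
```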

@jack-berg jack-berg added the needs author feedback Waiting for additional feedback from the author label Jul 24, 2024

trask (Member) commented Jul 24, 2024

@wgy035 how does this compare to the overhead of everything under io/opentelemetry/javaagent/?

github-actions (bot) commented Jul 31, 2024

This has been automatically marked as stale because it has been marked as needing author feedback and has not had any activity for 7 days. It will be closed if no further activity occurs within 7 days of this comment.

@github-actions github-actions bot added the Stale label Jul 31, 2024

wgy035 commented Aug 6, 2024

@jack-berg Thanks for your detailed response! I now understand the purpose of AdviceAttributesProcessor, but in the java agent case I think the choice of additional attributes should be defined in the Instrumenter.

I've observed that the otel java agent already traverses all attributes in OperationListener.onEnd before calling the SDK (e.g. HttpServerMetrics#onEnd), so introducing AdviceAttributesProcessor may be a redundant extra pass; we could instead remove the attributes users don't need in OperationListener.onEnd, just as AdviceAttributesProcessor does today.

@trask In my performance test, it accounts for 9% of the total usage under io/opentelemetry/javaagent/.

@github-actions github-actions bot removed the needs author feedback Waiting for additional feedback from the author label Aug 6, 2024
@jack-berg jack-berg removed the Stale label Aug 7, 2024
jack-berg (Member) commented:

Opened #6629 as a minor optimization to exit early when advice doesn't end up removing any attributes. This still doesn't help the common case where advice does remove attributes.
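The early-exit idea can be sketched as follows. This is a hedged illustration under the same simplified String-map model, not the actual change in #6629: if the advice would not remove any attribute, return the original object untouched and skip the allocation and copy.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// Sketch of the early-exit optimization: only trim when trimming would
// actually remove something; otherwise hand back the input unchanged.
public class AdviceEarlyExitSketch {
    public static Map<String, String> apply(Map<String, String> recorded,
                                            Set<String> advisedKeys) {
        if (advisedKeys.containsAll(recorded.keySet())) {
            return recorded; // nothing to remove: no allocation, no copy
        }
        Map<String, String> retained = new HashMap<>();
        for (Map.Entry<String, String> e : recorded.entrySet()) {
            if (advisedKeys.contains(e.getKey())) {
                retained.put(e.getKey(), e.getValue());
            }
        }
        return retained;
    }
}
```

As the comment notes, the fast path only helps when every recorded key is already advised, which is why it doesn't address the common case where attributes are removed.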

jack-berg (Member) commented:

Closing this after merging #6633. I know there is another idea in the description, using a lock-free alternative to ConcurrentHashMap, but that's a big enough topic that it deserves its own issue. Please open a new issue if anyone wishes to discuss it.
