
Metric SDK - remove de-duplicating attributes for perf gain #1398

Conversation

cijothomas
Member

@cijothomas cijothomas commented Nov 22, 2023

Fixes #1098. This adds to the gains achieved from #1379!

Similar to Tracing and Logs, this PR removes attribute de-duplication from the SDK. The changelog advises users to share feedback on #1300, which can be used to evaluate whether we need to bring this back as an opt-in feature in the OTel SDK.

Changes

Remove de-duplication of metric attributes from the SDK.
Add test to validate the behavior and assert that it is a conscious choice.
The Prometheus exporter already handles this (duplicates can arise during sanitization), but order is not guaranteed. Tests are modified to account for this.
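To make the behavioral change concrete, here is a minimal sketch (plain Rust with std collections only; `Attrs`, `sorted`, and `deduped` are illustrative stand-ins, not actual SDK types) of how a duplicate key produces a distinct series key once de-duplication is removed:

```rust
use std::collections::{BTreeMap, HashMap};

// Illustrative stand-in for an attribute set used as an aggregation-map key.
type Attrs = Vec<(String, String)>;

// After this PR: attributes are only sorted, duplicates are kept.
fn sorted(attrs: &[(&str, &str)]) -> Attrs {
    let mut v: Attrs = attrs
        .iter()
        .map(|(k, val)| (k.to_string(), val.to_string()))
        .collect();
    v.sort();
    v
}

// On main: duplicates are removed, later values overwriting earlier ones.
fn deduped(attrs: &[(&str, &str)]) -> Attrs {
    let mut m = BTreeMap::new();
    for (k, v) in attrs {
        m.insert(k.to_string(), v.to_string());
    }
    m.into_iter().collect()
}

fn main() {
    let with_dup = [("color", "red"), ("shape", "circle"), ("color", "red")];
    let no_dup = [("color", "red"), ("shape", "circle")];

    // Without de-dup (this PR): the duplicate key yields a distinct series key.
    let mut series: HashMap<Attrs, u64> = HashMap::new();
    *series.entry(sorted(&with_dup)).or_insert(0) += 1;
    *series.entry(sorted(&no_dup)).or_insert(0) += 1;
    assert_eq!(series.len(), 2);

    // With de-dup (main): both measurements collapse to one series key.
    let mut series: HashMap<Attrs, u64> = HashMap::new();
    *series.entry(deduped(&with_dup)).or_insert(0) += 1;
    *series.entry(deduped(&no_dup)).or_insert(0) += 1;
    assert_eq!(series.len(), 1);
}
```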

Performance gains

20% gain in the Counter.Add() hot path when 4 attributes are used. More attributes will show larger gains.
Numbers from my machine:

main branch

Counter_Add_Sorted      time:   [650.97 ns 659.10 ns 668.39 ns]
Found 6 outliers among 100 measurements (6.00%)
  2 (2.00%) high mild
  4 (4.00%) high severe

Counter_Add_Unsorted    time:   [645.21 ns 650.88 ns 656.56 ns]
Found 2 outliers among 100 measurements (2.00%)
  2 (2.00%) high severe

This PR

Counter_Add_Sorted      time:   [527.00 ns 531.73 ns 537.88 ns]
                        change: [-18.633% -15.611% -11.545%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 12 outliers among 100 measurements (12.00%)
  2 (2.00%) high mild
  10 (10.00%) high severe

Counter_Add_Unsorted    time:   [532.62 ns 537.23 ns 542.44 ns]
                        change: [-19.127% -17.803% -16.416%] (p = 0.00 < 0.05)
                        Performance has improved.

~50% reduction in the time needed to build an AttributeSet from a slice!

main branch

AttributeSet_without_duplicates
                        time:   [217.74 ns 220.07 ns 222.85 ns]
Found 6 outliers among 100 measurements (6.00%)
  3 (3.00%) high mild
  3 (3.00%) high severe

AttributeSet_with_duplicates
                        time:   [219.37 ns 229.60 ns 241.22 ns]
Found 15 outliers among 100 measurements (15.00%)
  2 (2.00%) high mild
  13 (13.00%) high severe

This PR:

AttributeSet_without_duplicates
                        time:   [99.515 ns 100.37 ns 101.33 ns]
                        change: [-54.598% -53.572% -52.608%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 10 outliers among 100 measurements (10.00%)
  3 (3.00%) high mild
  7 (7.00%) high severe

AttributeSet_with_duplicates
                        time:   [105.90 ns 107.11 ns 108.55 ns]
                        change: [-54.672% -52.847% -51.089%] (p = 0.00 < 0.05)
                        Performance has improved.

Stress Tests:
main branch

Number of threads: 4
Throughput: 7,660,800 iterations/sec
Throughput: 7,904,400 iterations/sec
Throughput: 7,685,400 iterations/sec
Throughput: 7,048,800 iterations/sec
Throughput: 7,251,000 iterations/sec

this PR

Number of threads: 4
Throughput: 8,148,800 iterations/sec
Throughput: 8,265,800 iterations/sec
Throughput: 8,422,600 iterations/sec
Throughput: 8,420,600 iterations/sec
Throughput: 8,473,200 iterations/sec

Stress tests use only 3 attributes, so the improvement appears smaller compared
to the benchmarks, which use 4 attributes!

Merge requirement checklist

  • CONTRIBUTING guidelines followed
  • Unit tests added/updated (if applicable)
  • Appropriate CHANGELOG.md files updated for non-trivial, user-facing changes
  • Changes in public API reviewed (if applicable)

@cijothomas cijothomas requested a review from a team November 22, 2023 22:44
@cijothomas cijothomas changed the title Cijothomas/metric dedup removal Metric SDK - remove de-duplicating attributes for perf gain Nov 22, 2023

codecov bot commented Nov 22, 2023

Codecov Report

Attention: 2 lines in your changes are missing coverage. Please review.

Comparison is base (fcd12eb) 57.3% compared to head (17ff310) 57.6%.

Files Patch % Lines
opentelemetry-sdk/src/metrics/mod.rs 96.9% 2 Missing ⚠️
Additional details and impacted files
@@           Coverage Diff           @@
##            main   #1398     +/-   ##
=======================================
+ Coverage   57.3%   57.6%   +0.2%     
=======================================
  Files        146     146             
  Lines      18179   18239     +60     
=======================================
+ Hits       10422   10508     +86     
+ Misses      7757    7731     -26     


@hdost
Contributor

hdost commented Nov 23, 2023

So, going back to our conversation from before: the gain here is not the 7x we saw. This is closer to a 6-8% improvement based on the stress test.

I guess my interpretation was that we'd look at bounded instruments and/or allow users to pass in prebuilt AttributeSets.
If we're going to just break it, it seems like we should put it behind a feature flag, which was also something we discussed; perhaps the feature could be `metrics-enable-deduplication`.

Contributor

@hdost hdost left a comment


See previous comment

@cijothomas
Member Author

So, going back to our conversation from before: the gain here is not the 7x we saw. This is closer to a 6-8% improvement based on the stress test.

There is no 7x gain with this PR. I am trying to understand why you expected 7x here. Apologies if I missed something obvious.
The perf gains from this change are shared in the PR description. If we have more attributes, there will be more gains, of course, but I just picked one example (3 attributes for stress, 4 attributes for criterion) to show the gains!

we'd look at bounded instruments and/or allowing users to pass in prebuilt AttributeSets

Yes, absolutely! That is a separate improvement, helping users who have key/values known ahead of time. This PR is not attempting that improvement.

If we're going to just break it, it seems like we should put it behind a feature flag

If users have duplicate keys in their attributes, that is already an indication of something wrong with their instrumentation. I don't think it's fair to ask every user to pay the perf penalty of de-dup! I am totally open to bringing de-dup back as an opt-in feature, which is why I have opened this issue to gather feedback.

@hdost
Contributor

hdost commented Nov 25, 2023

TL;DR I think we should add the feature flag before merging this.

So, going back to our conversation from before: the gain here is not the 7x we saw. This is closer to a 6-8% improvement based on the stress test.

There is no 7x gain with this PR. I am trying to understand why you expected 7x here. Apologies if I missed something obvious.

My point was maybe not well made.
In #1379 we saw a significant improvement, on the order of 6-7x (from what I understand), and it doesn't break any previous functionality.
This PR breaks backwards compatibility and is not as significant an improvement.

we'd look at bounded instruments and/or allowing users to pass in prebuilt AttributeSets

Yes, absolutely! That is a separate improvement, helping users who have key/values known ahead of time. This PR is not attempting that improvement.

I think we'd be better off focusing on providing those to improve performance. And if we're going to break functionality, I really think we need to be more thoughtful about it. Yes, we want performance, but if we are constantly breaking things people have little incentive to move forward, and with our latest release I don't think we've done a good job of providing a path forward.

This is a slightly wider point, but it's relevant. I don't want to be a barrier to progress, but I also don't want people to avoid this library because they experience frustration through upgrades.

If we're going to just break it, it seems like we should put it behind a feature flag

If users have duplicate keys in their attributes, that is already an indication of something wrong with their instrumentation. I don't think it's fair to ask every user to pay the perf penalty of de-dup! I am totally open to bringing de-dup back as an opt-in feature, which is why I have opened this issue to gather feedback.

On the point above I think that we should have a feature flag on this to allow people to work with the old behavior. Feedback is great but we don't give people the option to try both ways as they would have to use an older version which might miss other internal improvements.

@cijothomas
Member Author

@hdost I'm not entirely clear on who might be affected by this change. The API itself hasn't changed; it's more a shift from deduplicating in the SDK to doing so in the Prometheus exporter. Is it common for anyone to intentionally send duplicate attributes, expecting the SDK to handle the deduplication?

Considering there are no extensibility points like SpanProcessor/LogRecordProcessor, it seems unlikely that additional attributes could be added even inadvertently, leading to duplication. Thus, the individual emitting the measurement has complete control and can take steps to avoid passing duplicate keys. If there are other scenarios I'm not considering, I'm open to understanding more about them. From what I've seen in OTel .NET, which has never performed deduplication in either the SDK or the Prometheus exporter, duplication hasn't been a point of concern.

I do understand the importance of being mindful with any changes that could potentially break existing workflows. However, in this instance, I'm not convinced that this change is of that nature. This is similar to the changes we've made previously with Logs and Traces. We didn't implement feature flags for traces or logs, and I'm not sure it's necessary for metrics either. I strongly believe that the default behavior should prioritize efficiency, and I fully support the idea of a feature flag or opt-in approach for the less efficient route involving deduplication, rather than defaulting to it.

Would you reconsider your objection to this PR?

@hdost
Contributor

hdost commented Nov 27, 2023

TL;DR I think we should add the feature flag before merging this.

I strongly believe that the default behavior should prioritize efficiency, and I fully support the idea of a feature flag or opt-in approach for the less efficient route involving deduplication, rather than defaulting to it.

I guess we're on the same page that there should be a feature flag.

As I suggested, it could be something like `metrics-enable-deduplication`.

@KallDrexx
Contributor

@hdost Just to clarify (because it was easy to lose in all the noise in the other PR), #1379 had a 6x improvement in the stress test because of the precaching of the hash values. This precaching gave the stress test extremely high gains on some machines (but not all) because it moved the hash computation out of the section of code under a mutex lock, thus allowing the costly hash to be calculated in parallel under load.

That optimization has been removed (to be reintroduced in a new PR), but it's also a "best case" optimization that may not be realistic in applications that do more than just log metrics.
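The precaching idea described above can be sketched as follows (illustrative only, using std's `DefaultHasher` and a plain `HashMap`; the real SDK code differs, and a real implementation would also need to handle hash collisions by comparing the full attribute sets):

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};
use std::sync::Mutex;

// Compute the (potentially costly) attribute-set hash *before* taking the
// lock, so concurrent threads can hash in parallel; only the map update
// itself is serialized.
fn precomputed_hash(attrs: &[(&str, &str)]) -> u64 {
    let mut h = DefaultHasher::new();
    attrs.hash(&mut h);
    h.finish()
}

fn record(map: &Mutex<HashMap<u64, u64>>, attrs: &[(&str, &str)]) {
    let key = precomputed_hash(attrs); // hashed outside the critical section
    let mut guard = map.lock().unwrap();
    *guard.entry(key).or_insert(0) += 1; // short critical section
}

fn main() {
    let map = Mutex::new(HashMap::new());
    record(&map, &[("color", "red"), ("shape", "circle")]);
    record(&map, &[("color", "red"), ("shape", "circle")]);
    assert_eq!(map.lock().unwrap().len(), 1);
}
```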

@cijothomas
Member Author

TL;DR I think we should add the feature flag before merging this.

I strongly believe that the default behavior should prioritize efficiency, and I fully support the idea of a feature flag or opt-in approach for the less efficient route involving deduplication, rather than defaulting to it.

I guess we're on the same page that there should be a feature flag.

As I suggested it could be something like: metrics-enable-deduplication

Not fully! I agree to add a feature flag if we learn users are affected, but not right away (i.e., in this PR). Generally, I prefer not to add feature flags/options without some evidence that they are even needed. The same approach was already followed for logs and traces.

@jtescher
Member

Looking at some context for the attribute collections part of the spec from the PR where it was introduced, it does seem like the intention is for deduplication to happen if possible; we could maybe have a feature flag to opt out for perf gains rather than opting in?

@cijothomas
Member Author

Looking at some context for the attribute collections part of the spec from the PR where it was introduced, it does seem like the intention is for deduplication to happen if possible; we could maybe have a feature flag to opt out for perf gains rather than opting in?

That is effectively penalizing almost all(?) users by doing de-dup logic in the hot path! I really hope we agree to make the default the fastest!

To unblock progress, are folks okay with observing feedback from #1300, then deciding if we want a feature flag, and determining whether that flag is enabled-by-default vs opt-in at the time we add it?

@KallDrexx
Contributor

KallDrexx commented Nov 27, 2023

Maybe it's prudent to have a concrete idea of what the ramifications of not de-duplicating are. From my limited knowledge, there are two:

  1. You may end up with duplicated time series (e.g. {color: red, shape: circle, color: red} vs {color: red, shape: circle} would be different time series).
  2. Some exporters may fail (e.g. Prometheus non-OTLP).

What are the ultimate consequences of these?

There's always a balance between performance and creating the pit of success for users, and it's hard for me to give much of an opinion without understanding how accidentally violating the duplicate-key rule actually plays out in this case.

Fwiw, I don't have a problem with enabling this as an opt-in, as an "I am aware of how I build my attributes and explicitly know that providing duplicate attributes may cause bad data" choice. I'm a big proponent of the pit of success, and a feature flag to enable better performance without code changes doesn't seem too onerous.

@cijothomas
Member Author

what the ramifications of not-deduplicating is

It'd depend on the backend! Prometheus handles de-dup itself by emitting ";"-separated values, as shown in #1379 (comment)

Metrics solutions inside my company (Microsoft) override at ingestion.

There could be other backends which do not handle this well. Hoping to hear about them from #1300
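A rough sketch of that ";"-joining behavior (assumed semantics, not the actual exporter code; `join_duplicates` is a hypothetical helper):

```rust
use std::collections::BTreeMap;

// Duplicate keys are merged by joining their values with ";",
// keeping the values' original order of appearance.
fn join_duplicates(attrs: &[(&str, &str)]) -> Vec<(String, String)> {
    let mut merged: BTreeMap<&str, Vec<&str>> = BTreeMap::new();
    for &(k, v) in attrs {
        merged.entry(k).or_default().push(v);
    }
    merged
        .into_iter()
        .map(|(k, vals)| (k.to_string(), vals.join(";")))
        .collect()
}

fn main() {
    let out = join_duplicates(&[("color", "red"), ("shape", "circle"), ("color", "blue")]);
    assert_eq!(
        out,
        vec![
            ("color".to_string(), "red;blue".to_string()),
            ("shape".to_string(), "circle".to_string()),
        ]
    );
}
```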

@KallDrexx
Contributor

I guess my primary concern is with {color: red, shape: circle, color: red} vs {color: red, shape: circle}.

If we don't dedup, then these are stored in the ValueMap as two different attribute sets, which means a counter increment against one won't be applied to the other. This means you could end up with two different values across the two attribute sets.

I assume the backends will automatically aggregate them into a single time series in that case?

@cijothomas
Member Author

I assume the backends will automatically aggregate them into a single time series in that case?

Again, this is backend specific.

@cijothomas
Member Author

Closing for now, as we are exploring other options; we will come back to this after #1421.

@cijothomas cijothomas closed this Dec 7, 2023
@cijothomas cijothomas deleted the cijothomas/metric-dedup-removal branch December 7, 2023 18:54