
Metric SDK - remove de-duplicating attributes for perf gain #1398

Conversation

cijothomas
Member

@cijothomas cijothomas commented Nov 22, 2023

Fixes #1098. This adds to the gains achieved from #1379!

Similar to Tracing and Logs, this PR removes attribute de-duplication from the SDK. The changelog advises users to share feedback on #1300, which can be used to evaluate whether we need to bring this back as an opt-in feature in the OTel SDK.

Changes

Remove de-duplication of metric attributes from the SDK.
Add test to validate the behavior and assert that it is a conscious choice.
The Prometheus exporter already handles this (duplicates can arise during sanitization), but order is not guaranteed. Tests are modified to account for this.
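To make the behavioral change concrete, here is a minimal sketch (plain Rust with std collections only; `Attrs`, `sorted`, and `deduped` are illustrative stand-ins, not actual SDK types) of how a duplicate key produces a distinct series key once de-duplication is removed:

```rust
use std::collections::{BTreeMap, HashMap};

// Illustrative stand-in for an attribute set used as an aggregation-map key.
type Attrs = Vec<(String, String)>;

// After this PR: attributes are only sorted, duplicates are kept.
fn sorted(attrs: &[(&str, &str)]) -> Attrs {
    let mut v: Attrs = attrs
        .iter()
        .map(|(k, val)| (k.to_string(), val.to_string()))
        .collect();
    v.sort();
    v
}

// On main: duplicates are removed, later values overwriting earlier ones.
fn deduped(attrs: &[(&str, &str)]) -> Attrs {
    let mut m = BTreeMap::new();
    for (k, v) in attrs {
        m.insert(k.to_string(), v.to_string());
    }
    m.into_iter().collect()
}

fn main() {
    let with_dup = [("color", "red"), ("shape", "circle"), ("color", "red")];
    let no_dup = [("color", "red"), ("shape", "circle")];

    // Without de-dup (this PR): the duplicate key yields a distinct series key.
    let mut series: HashMap<Attrs, u64> = HashMap::new();
    *series.entry(sorted(&with_dup)).or_insert(0) += 1;
    *series.entry(sorted(&no_dup)).or_insert(0) += 1;
    assert_eq!(series.len(), 2);

    // With de-dup (main): both measurements collapse to one series key.
    let mut series: HashMap<Attrs, u64> = HashMap::new();
    *series.entry(deduped(&with_dup)).or_insert(0) += 1;
    *series.entry(deduped(&no_dup)).or_insert(0) += 1;
    assert_eq!(series.len(), 1);
}
```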

Performance gains

20% gain in the Counter.Add() hot path when 4 attributes are used. More attributes will show larger gains.
Numbers from my machine:

main branch

Counter_Add_Sorted      time:   [650.97 ns 659.10 ns 668.39 ns]
Found 6 outliers among 100 measurements (6.00%)
  2 (2.00%) high mild
  4 (4.00%) high severe

Counter_Add_Unsorted    time:   [645.21 ns 650.88 ns 656.56 ns]
Found 2 outliers among 100 measurements (2.00%)
  2 (2.00%) high severe

This PR

Counter_Add_Sorted      time:   [527.00 ns 531.73 ns 537.88 ns]
                        change: [-18.633% -15.611% -11.545%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 12 outliers among 100 measurements (12.00%)
  2 (2.00%) high mild
  10 (10.00%) high severe

Counter_Add_Unsorted    time:   [532.62 ns 537.23 ns 542.44 ns]
                        change: [-19.127% -17.803% -16.416%] (p = 0.00 < 0.05)
                        Performance has improved.

~50% reduction in the time needed to build an AttributeSet from a slice!

main branch

AttributeSet_without_duplicates
                        time:   [217.74 ns 220.07 ns 222.85 ns]
Found 6 outliers among 100 measurements (6.00%)
  3 (3.00%) high mild
  3 (3.00%) high severe

AttributeSet_with_duplicates
                        time:   [219.37 ns 229.60 ns 241.22 ns]
Found 15 outliers among 100 measurements (15.00%)
  2 (2.00%) high mild
  13 (13.00%) high severe

This PR:

AttributeSet_without_duplicates
                        time:   [99.515 ns 100.37 ns 101.33 ns]
                        change: [-54.598% -53.572% -52.608%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 10 outliers among 100 measurements (10.00%)
  3 (3.00%) high mild
  7 (7.00%) high severe

AttributeSet_with_duplicates
                        time:   [105.90 ns 107.11 ns 108.55 ns]
                        change: [-54.672% -52.847% -51.089%] (p = 0.00 < 0.05)
                        Performance has improved.

Stress Tests:
main branch

Number of threads: 4
Throughput: 7,660,800 iterations/sec
Throughput: 7,904,400 iterations/sec
Throughput: 7,685,400 iterations/sec
Throughput: 7,048,800 iterations/sec
Throughput: 7,251,000 iterations/sec

this PR

Number of threads: 4
Throughput: 8,148,800 iterations/sec
Throughput: 8,265,800 iterations/sec
Throughput: 8,422,600 iterations/sec
Throughput: 8,420,600 iterations/sec
Throughput: 8,473,200 iterations/sec

Stress tests use only 3 attributes, so the improvement appears smaller compared
to the benchmarks, which use 4 attributes!

Merge requirement checklist

  • CONTRIBUTING guidelines followed
  • Unit tests added/updated (if applicable)
  • Appropriate CHANGELOG.md files updated for non-trivial, user-facing changes
  • Changes in public API reviewed (if applicable)

@cijothomas cijothomas requested a review from a team November 22, 2023 22:44
@cijothomas cijothomas changed the title Cijothomas/metric dedup removal Metric SDK - remove de-duplicating attributes for perf gain Nov 22, 2023

codecov bot commented Nov 22, 2023

Codecov Report

Attention: 2 lines in your changes are missing coverage. Please review.

Comparison is base (fcd12eb) 57.3% compared to head (17ff310) 57.6%.

Files Patch % Lines
opentelemetry-sdk/src/metrics/mod.rs 96.9% 2 Missing ⚠️
Additional details and impacted files
@@           Coverage Diff           @@
##            main   #1398     +/-   ##
=======================================
+ Coverage   57.3%   57.6%   +0.2%     
=======================================
  Files        146     146             
  Lines      18179   18239     +60     
=======================================
+ Hits       10422   10508     +86     
+ Misses      7757    7731     -26     


@hdost
Contributor

hdost commented Nov 23, 2023

So, going back to our conversation from before: the gain here is not the 7x we saw. This is closer to a 6-8% improvement based on the stress test.

I guess my interpretation was that we'd look at bounded instruments and/or allow users to pass in prebuilt AttributeSets.
If we're going to just break it, it seems like we should put it behind a feature flag, which was also something we discussed; perhaps the feature could be `metrics-enable-deduplication`.

Contributor

@hdost hdost left a comment


See previous comment

@cijothomas
Member Author

So, going back to our conversation from before: the gain here is not the 7x we saw. This is closer to a 6-8% improvement based on the stress test.

There is no 7x gain with this PR. I am trying to understand why you expected 7x here. Apologies if I missed something obvious.
The perf gains from this change are shared in the PR description. If we have more attributes, there will be more gains, of course, but I just picked one example (3 attributes for stress, 4 attributes for criterion) to show the gains!

we'd look at bounded instruments and/or allowing users to pass in prebuilt AttributeSets

Yes, absolutely! That is a separate improvement, helping users who have key/values known ahead of time. This PR is not attempting that improvement.

If we're going to just break it, it seems like we should put it behind a feature flag

If users have duplicate keys in their attributes, that is already an indication of something wrong with their instrumentation. I don't think it's fair to ask every user to pay the perf penalty of de-dup! I am totally open to bringing de-dup back as an opt-in feature, which is why I have opened this issue to gather feedback.

@hdost
Contributor

hdost commented Nov 25, 2023

TL;DR I think we should add the feature flag before merging this.

So, going back to our conversation from before: the gain here is not the 7x we saw. This is closer to a 6-8% improvement based on the stress test.

There is no 7x gain with this PR. I am trying to understand why you expected 7x here. Apologies if I missed something obvious.

My point was maybe not well made.
In #1379 we saw a significant improvement, on the order of 6-7x (from what I understand), and it doesn't break any previous functionality.
This PR breaks backwards compatibility and is not as significant an improvement.

we'd look at bounded instruments and/or allowing users to pass in prebuilt AttributeSets

Yes, absolutely! That is a separate improvement, helping users who have key/values known ahead of time. This PR is not attempting that improvement.

I think we'd be better off focusing on providing those to improve performance. And if we're going to break functionality, I really think we need to be more thoughtful about it. Yes, we want performance, but if we are constantly breaking things people have little incentive to move forward, and with our latest release I don't think we've done a good job of providing a path forward.

This is a slightly wider point, but it's relevant. I don't want to be a barrier to progress, but I also don't want people to avoid this library because they experience frustration through upgrades.

If we're going to just break it, it seems like we should put it behind a feature flag

If users have duplicate keys in their attributes, that is already an indication of something wrong with their instrumentation. I don't think it's fair to ask every user to pay the perf penalty of de-dup! I am totally open to bringing de-dup back as an opt-in feature, which is why I have opened this issue to gather feedback.

On the point above I think that we should have a feature flag on this to allow people to work with the old behavior. Feedback is great but we don't give people the option to try both ways as they would have to use an older version which might miss other internal improvements.

@cijothomas
Member Author

@hdost I'm not entirely clear on who might be affected by this change. The API itself hasn't changed; it's more a shift from deduplicating in the SDK to doing so in the Prometheus exporter. Is it common for anyone to intentionally send duplicate attributes, expecting the SDK to handle the deduplication?

Considering there are no extensibility points like SpanProcessor/LogRecordProcessor, it seems unlikely that additional attributes could be added even inadvertently, leading to duplication. Thus, the individual emitting the measurement has complete control and can take steps to avoid passing duplicate keys. If there are other scenarios I'm not considering, I'm open to understanding more about them. From what I've seen in OTel .NET, which has never performed deduplication in either the SDK or the Prometheus exporter, duplication hasn't been a point of concern.

I do understand the importance of being mindful with any changes that could potentially break existing workflows. However, in this instance, I'm not convinced that this change is of that nature. This is similar to the changes we've made previously with Logs and Traces. We didn't implement feature flags for traces or logs, and I'm not sure it's necessary for metrics either. I strongly believe that the default behavior should prioritize efficiency, and I fully support the idea of a feature flag or opt-in approach for the less efficient route involving deduplication, rather than defaulting to it.

Would you reconsider your objection to this PR?

@hdost
Contributor

hdost commented Nov 27, 2023

TL;DR I think we should add the feature flag before merging this.

I strongly believe that the default behavior should prioritize efficiency, and I fully support the idea of a feature flag or opt-in approach for the less efficient route involving deduplication, rather than defaulting to it.

I guess we're on the same page that there should be a feature flag.

As I suggested, it could be something like `metrics-enable-deduplication`.

@KallDrexx
Contributor

@hdost Just to clarify (because it was easy to lose in all the noise in the other PR), #1379 had a 6x improvement in the stress test because of the precaching of the hash values. This precaching gave the stress test extremely high gains on some machines (but not all) because it moved the hash computation out of the section of code under a mutex lock, thus allowing the costly hash to be calculated in parallel under load.

That optimization has been removed (to be reintroduced in a new PR), but it's also a "best case" optimization that may not be realistic in applications that do more than just log metrics.
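The precaching idea described above can be sketched as follows (illustrative only, using std's `DefaultHasher` and a plain `HashMap`; the real SDK code differs, and a real implementation would also need to handle hash collisions by comparing the full attribute sets):

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};
use std::sync::Mutex;

// Compute the (potentially costly) attribute-set hash *before* taking the
// lock, so concurrent threads can hash in parallel; only the map update
// itself is serialized.
fn precomputed_hash(attrs: &[(&str, &str)]) -> u64 {
    let mut h = DefaultHasher::new();
    attrs.hash(&mut h);
    h.finish()
}

fn record(map: &Mutex<HashMap<u64, u64>>, attrs: &[(&str, &str)]) {
    let key = precomputed_hash(attrs); // hashed outside the critical section
    let mut guard = map.lock().unwrap();
    *guard.entry(key).or_insert(0) += 1; // short critical section
}

fn main() {
    let map = Mutex::new(HashMap::new());
    record(&map, &[("color", "red"), ("shape", "circle")]);
    record(&map, &[("color", "red"), ("shape", "circle")]);
    assert_eq!(map.lock().unwrap().len(), 1);
}
```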

@cijothomas
Member Author

TL;DR I think we should add the feature flag before merging this.

I strongly believe that the default behavior should prioritize efficiency, and I fully support the idea of a feature flag or opt-in approach for the less efficient route involving deduplication, rather than defaulting to it.

I guess we're on the same page that there should be a feature flag.

As I suggested it could be something like: metrics-enable-deduplication

Not fully! I agree to add a feature flag if we learn users are affected, but not right away (i.e., in this PR). Generally, I prefer not to add feature flags/options without some evidence that they are even needed. The same approach was already followed for logs and traces.

@jtescher
Member

Looking at some context for the attribute collections part of the spec from the PR where it was introduced, it does seem like the intention is for deduplication to happen if possible; we could maybe have a feature flag to opt out for perf gains rather than opting in?

@cijothomas
Member Author

Looking at some context for the attribute collections part of the spec from the PR where it was introduced, it does seem like the intention is for deduplication to happen if possible; we could maybe have a feature flag to opt out for perf gains rather than opting in?

That is effectively penalizing almost all(?) users by doing de-dup logic in the hot path! I really hope we agree to make the default the fastest!

To unblock progress, are folks okay with observing feedback from #1300, then deciding if we want a feature flag, and determining whether that flag is enabled-by-default vs opt-in at the time we add it?

@KallDrexx
Contributor

KallDrexx commented Nov 27, 2023

Maybe it's prudent to have a concrete idea of what the ramifications of not de-duplicating are. From my limited knowledge, there are two:

  1. You may end up with duplicated time series (e.g. {color: red, shape: circle, color: red} vs {color: red, shape: circle} would be different time series).
  2. Some exporters may fail (e.g. Prometheus non-OTLP).

What are the ultimate consequences of these?

There's always a balance between performance and creating the pit of success for users, and it's hard for me to give much of an opinion without understanding how accidentally violating the duplicate-key rule actually plays out in this case.

Fwiw, I don't have a problem with enabling this as an opt-in, as an "I am aware of how I build my attributes and explicitly know that providing duplicate attributes may cause bad data" choice. I'm a big proponent of the pit of success, and a feature flag to enable better performance without code changes doesn't seem too onerous.

@cijothomas
Member Author

what the ramifications of not-deduplicating is

It'd depend on the backend! Prometheus handles de-dup itself by emitting ";"-separated values, as shown in #1379 (comment)

Metrics solutions inside my company (Microsoft) override at ingestion.

There could be other backends which do not handle this well. Hoping to hear about them from #1300
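A rough sketch of that ";"-joining behavior (assumed semantics, not the actual exporter code; `join_duplicates` is a hypothetical helper):

```rust
use std::collections::BTreeMap;

// Duplicate keys are merged by joining their values with ";",
// keeping the values' original order of appearance.
fn join_duplicates(attrs: &[(&str, &str)]) -> Vec<(String, String)> {
    let mut merged: BTreeMap<&str, Vec<&str>> = BTreeMap::new();
    for &(k, v) in attrs {
        merged.entry(k).or_default().push(v);
    }
    merged
        .into_iter()
        .map(|(k, vals)| (k.to_string(), vals.join(";")))
        .collect()
}

fn main() {
    let out = join_duplicates(&[("color", "red"), ("shape", "circle"), ("color", "blue")]);
    assert_eq!(
        out,
        vec![
            ("color".to_string(), "red;blue".to_string()),
            ("shape".to_string(), "circle".to_string()),
        ]
    );
}
```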

@KallDrexx
Contributor

I guess my primary concern is with {color: red, shape: circle, color: red} vs {color: red, shape: circle}.

If we don't dedup, then these are stored in the ValueMap as two different attribute sets, which means a counter increment against one won't be applied to the other. This means you could end up with two different values across the two attribute sets.

I assume the backends will automatically aggregate them into a single time series in that case?

@cijothomas
Member Author

I assume the backends will automatically aggregate them into a single time series in that case?

Again, this is backend specific.

@cijothomas
Member Author

Closing for now, as we are exploring other options; we will come back to this after #1421.

@cijothomas cijothomas closed this Dec 7, 2023
@cijothomas cijothomas deleted the cijothomas/metric-dedup-removal branch December 7, 2023 18:54