Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove the recommendation to not synchronize access to Config.disabled #4310

Merged
merged 4 commits into from
Dec 5, 2024

Conversation

pellared
Copy link
Member

@pellared pellared commented Nov 26, 2024

Why

It is not necessary for implementations to ensure that changes to disabled are immediately visible to callers of Enabled.

I am OK with this statement. The implementation can use things like Volatile.Read or volatile in .NET so that at some point a modified disabled value will be returned.

I.e. atomic, volatile, synchronized, or equivalent memory semantics to avoid stale reads are discouraged to prioritize performance over immediate consistency.

I completely disagree with this statement as it encourages that we should never synchronize the memory which leads to data races. If we do not have any mean to synchronize then the user may NEVER get the updated value.

E.g. the program may use the memory from CPU cache and never invalidate it; the actual behavior depends on many things e.g. CPU architecture. It is not a theoretical issue, I was experiencing and fixing such bugs. You can also check: https://www.albahari.com/threading/part4.aspx#_Nonblocking_Synchronization (see: Do We Really Need Locks and Barriers?)

"Not immediately" vs "never" is a big difference.

Lack of any synchronization can for sure be an issue for Java and Go:

Besides, the Go race detector would bail if we would not synchronize access e.g. using atomic.

What

Remove an unsafe recommendation to not synchronize access to Config.disabled.

@pellared pellared marked this pull request as ready for review November 26, 2024 22:42
@pellared pellared requested review from a team as code owners November 26, 2024 22:42
@pellared pellared changed the title Remove the recommendation to not synchronize access to Config.enabled Remove the recommendation to not synchronize access to Config.disabled Nov 26, 2024
@pellared
Copy link
Member Author

@jack-berg, PTAL

@jack-berg
Copy link
Member

I'm not strongly opposed to this. I do think implementations should be free to choose implementations which work for them. The important thing I was trying to accomplish with this language was preventing the introduction of the ability to disable scopes from impacting the performance of recording measurements. Volatile and other synchronized memory semantics aren't free, and updating whether or not a particular scope (and its instruments) will be a relatively niche situation. I wanted to give a hint to implementers that this niche edge case shouldn't be lower priority than the common case.

The actual perf impact of volatile / synchronized is language and system dependent.

If we do not have any mean to synchronize then the user may NEVER get the updated value.

My understanding is that this is system dependent, but that in most cases the update would eventually be seen. If its rare for users to be using a system where the updates are never seen, and its also rare to be dynamically updating SDK settings, then it will be exceptionally rare for both cases to be true. I didn't want to the common case to be impacted by edge case.

@pellared
Copy link
Member Author

pellared commented Nov 28, 2024

updating whether or not a particular scope (and its instruments) will be a relatively niche situation

If this is a niche situation then the alternative is to get rid of the functionality to update the config (instead of providing a functionality that does not give any guarantees on their actual effect).

it will be exceptionally rare for both cases to be true

If we do want to support updating config then we have to guarantee at least eventual consistency. Otherwise it is a bug in the design as people may lose telemetry they want to capture because of this.

@jackshirazi
Copy link

jackshirazi commented Dec 2, 2024

For Java, I don't think it matters either way. In terms of pros and cons (pro being don't add memory barriers), the practical situation is that

  • PRO: Applications that tightly control visibility such that there is a significant prospect they won't cross a memory barrier in reasonable time are also the ones that cannot accept the overhead of OpenTelemetry at all (they typically have internally created tightly controlled observability)
  • PRO: volatile fields do have overheads and can cause CPU-cache thrashing
  • PRO: Almost every application that uses OpenTelemetry will frequently cross memory barriers, so will be eventually consistent
  • CON: That "eventual" may be longer than desired
  • CON: Tests in Java show the overhead of volatile is quite low, there was one test (sadly I can't find the reference) which modified the JVM to make every field volatile, that found for a variety of apps/benchmarks they ran at 70% of the unmodified JVM speed

I'd approach this as leave it eventual until someone complains with an actual performance issue, then fix that (eg with an opt-in memory barrier)

@jack-berg
Copy link
Member

If this is a niche situation then the alternative is to get rid of the functionality to update the config

I think closing the door on dynamic config would be a mistake. We need to support certain niche features for the overall ecosystem to be considered complete. Supporting pluggable id generators for tracing, and bound instruments for metrics are similar examples of features that won't be used by the average user, but are still important because of the negative perception of not having them. Need to find away to support these features while prioritizing the common case.

If we do want to support updating config then we have to guarantee at least eventual consistency.

I agree.

I'd approach this as leave it eventual until someone complains with an actual performance issue, then fix that (eg with an opt-in memory barrier)

👍

@pellared
Copy link
Member Author

pellared commented Dec 2, 2024

Currently the recommendation has no eventual consistency. There is no synchronization.

Eventual Consistency is a guarantee that when an update is made in a system, that update will eventually be reflected.

Thus my PR. I only purpose remove on sentence. I am OK with eventual consistency but I complain about lack of any synchronization that is necessary to achieve eventual consistency.

Side note. The overhead of volatile is also hardware specific.

@jack-berg
Copy link
Member

jack-berg commented Dec 2, 2024

Currently the recommendation has no eventual consistency. There is no synchronization.

Fair enough. Synchronization is in fact the word that is used to communicate coordination between threads, and so by explicitly discouraging all synchronization, its opening the door and you could argue prohibiting updates from ever being seen. That's obviously not the intent, and so I agree we should improve.

Copy link
Member

@reyang reyang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The change itself makes sense to me. TBH I haven't spent much time in the configuration area, so I don't have the full context.

@carlosalberto
Copy link
Contributor

Overall this is a good "trade-off".

@reyang reyang merged commit 19d6783 into open-telemetry:main Dec 5, 2024
6 checks passed
@pellared pellared deleted the fix-sync-config branch December 6, 2024 20:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants