-
Notifications
You must be signed in to change notification settings - Fork 1.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ThreadLocal.asContextElement may not be cleaned up when used with Dispatchers.Main.immediate #4121
Comments
Non-Android reproducer with import kotlinx.coroutines.*
import kotlinx.coroutines.flow.MutableStateFlow
import kotlin.time.Duration.Companion.seconds
fun main() {
runBlocking {
val threadLocal = ThreadLocal<Int>()
val stateFlow = MutableStateFlow<Int>(0)
launch(Dispatchers.Unconfined) {
stateFlow.collect {
println("coroutine A: ${threadLocal.get()}")
}
}
launch(threadLocal.asContextElement(5)) {
delay(1.seconds)
println("before emit")
stateFlow.emit(1)
println("after emit")
}
}
} Similar example using undispatched starting strategy: import kotlinx.coroutines.*
import kotlin.time.Duration.Companion.seconds
fun main() {
runBlocking {
val threadLocal = ThreadLocal<Int>()
withContext(threadLocal.asContextElement(5)) {
GlobalScope.launch(Dispatchers.Default, start = CoroutineStart.UNDISPATCHED) {
println("coroutine A: ${threadLocal.get()}")
delay(1.seconds)
println("coroutine A: ${threadLocal.get()}")
}.join()
}
}
} Why do you consider this a problem? |
yes, but isn't the context supposed to be restored before switching the scopes? Like in the example the thread local that set in one coroutine scope should be restored before collector handles emission in another scope? |
Restored—yes. In the example, the coroutine you're entering doesn't have the context element, so this line in the docs (https://kotlinlang.org/api/kotlinx.coroutines/kotlinx-coroutines-core/kotlinx.coroutines/as-context-element.html) applies:
|
@dkhalanskyjb in our case it is injecting a logger interceptor into a coroutine. We have this logger installed globally:
So any call to Then it may be used like this:
We think it is a problem as Class A may not know about a CoroutineLoggingInterceptor at all, and it is not linked to class B directly in any way, but its logs may leak into coroutine B interceptor. And moreover the behavior is tied to dispatcher (unlike Dispatchers.Unconfined, Dispatchers.Main.immediate is a very common in android and a part of androidx core libraries as a default dispatcher) which should control execution but not a set of context elements. And the fact it does work if the coroutine A has this context element (so there is a way to re-set/clear it regardless it is executed in the same call frame), makes me think it shouldn't be an exception if it is missing. In our case there was an additional issue because |
If I understood the code correctly, it's calling thread-dependent functions from the CoroutineLoggingInterceptor, so it should know about the consequences.
If you call
Dispatchers don't control the set of context elements, coroutine scopes do.
What do you propose to be the behavior in this case? Imagine you're the coroutine
|
It is hidden behind abstractions, the call
I do not agree, coroutine B did nothing stupid in this case. It injected the log collector as its own context element. It shouldn't care about context element leak to other coroutines. It didn't modify thread/thread local directly, it didn't influence coroutine A environment directly or bypassing coroutines mechanism. Whose fault is it in this case? Probably CoroutineLoggingInterceptor implementation, but to me this component did nothing wrong either.. It does "if there is a context element, do this. If not do nothing". So to me the problem is in the mechanism of restoring context and undefined behavior (which is seems to be documented yes, but it doesn't mean to me there is no room for improvement).
Initially I thought about just restoring the value (second option), but your example makes sense The only solution I see right now is to extend Unfortunately rn I don't have a well thought proposal how to fix/improve that |
I'm not saying that
I still don't understand this point. I agree that it may be difficult to know about the consequences, I just don't see any way around this. If something deep inside your abstractions relies on some code being invoked from a particular thread / a thread in a particular state / some global mutable variable / some server config / something else in order to function properly, the whole abstraction all the way up gets poisoned and relies on that, too. Let's look at the example with no /** Update the system state to [value]. */
fun callback(value: Int) {
Timber.tag("TAG").i("Collected $value")
}
// in some other file, almost the same coroutine B:
viewLifecycleOwner.lifecycleScope.launch(coroutineLoggingInterceptor.plant(actualTree)) {
callback(1)
} We were writing Whose fault is that?
The major difference in practice is that, with |
My first impression was it is coroutines fault just because it looks like a violation of coroutine context encapsulation/isolation (a context element injected into the first scope indirectly affected another scope). It would be definitely weird if calling Sometimes it is mentioned in the docs like here (so it is our fault that we didn't read it precisely and match it with our case) sometimes it doesn't - for example MDCContext should be affected by this issue as well because it is implemented as ThreadContextElement. Its documentation covers other valid corner cases, but says nothing about this undefined behavior, so it is possible MDC values from one scope would be logged in another scope causing data leak. Probably the real problem is that it is impossible to get coroutineContext (or null if it is not a coroutine) from non-suspending function. In our case it would be simply
I'm not sure how feasible or not would be to add that function. Probably not since coroutineContext/currentCoroutineContext() implementation is intrinsic.
If a threadlocal is initialized with initial lamda, a setInitialValue/remove may be called as a cleanup (otherwise it will be just nullified). In this case the subsequent invocation of |
Would #2932 solve the issue? You'd be able to inject your
Good point, we should fix that. |
I am not sure if it is covered by #2930, but to me steps look different
Having the following code
It logs
So the coroutine A, that doesn't have a threadLocal as a ContextElement, has an access to it just because of Dispatchers.Main.immediate optimization (the code was invoked directly without dispatching, and somehow context element wasn't cleaned up).
If the first coroutine is launched with
Dispatchers.Main
(not immediate) the issue disappears.The reproducer is for android project (can be placed in
onViewCreated
in an empty fragment) just because JVM has no direct dispatcher.The text was updated successfully, but these errors were encountered: