
AtomicFu QoS Problems on iOS #462

Open
stefanhaustein opened this issue Jul 31, 2024 · 1 comment


stefanhaustein commented Jul 31, 2024

Summary

On Darwin (iOS), a lock must be backed by the kernel for QoS to be handled properly, and a custom lock built from atomics – such as the one in Synchronized.kt – doesn't fit that model (atomics carry no ownership information the kernel could use for QoS). The problem is that priorities are strictly enforced, so a priority inversion can easily lead to severe problems (see the Background section below).

While the implementation does quickly fall back to a POSIX mutex under contention, it's still problematic. Consider the following scenario (a minimal sketch of such a CAS-based lock follows after the list):

  • T1 (low priority) takes the lock with a single CAS. At this point no POSIX mutex exists at all.
  • T2 (high priority) tries to acquire the lock. It's already taken, so T2 promotes the internal locking mechanism to a POSIX mutex and waits on its condition variable. (Note: T1 has never observed the POSIX mutex yet! It will only observe it, and notify waiters, during unlock.)
  • T3 (medium priority) does its job, taking CPU time away from T1. Because the kernel has no record that T1 owns the lock T2 is waiting on, T1's QoS is never boosted and T2 makes no progress.
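
For illustration, here is a minimal sketch (not the actual atomicfu code) of a CAS-based thin lock. The ownership information lives purely in user-space memory, so the kernel has no idea which thread to boost when a higher-QoS waiter arrives:

```kotlin
import kotlinx.atomicfu.atomic

// Minimal sketch of a CAS-based "thin" lock, for illustration only.
// The lock state is tracked purely in user space; the kernel never
// learns which thread owns the lock, so it cannot donate QoS to it.
class ThinLockSketch {
    private val locked = atomic(false)

    fun tryLock(): Boolean = locked.compareAndSet(false, true)

    fun lock() {
        // User-space retry loop: exactly the pattern that can starve a
        // low-QoS owner on Darwin, because the waiter's priority is
        // never transferred to the owner.
        while (!tryLock()) { /* spin / yield without kernel ownership info */ }
    }

    fun unlock() {
        locked.value = false
    }
}
```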

Proposed Fix

Dispatch queues and iOS XPC have sophisticated QoS support, but they are not suitable replacements for the locking in Synchronized.kt.

The two documented locks that offer QoS support by recording ownership for the kernel are os_unfair_lock and pthread_mutex. Native iOS code will generally select between these two depending on the particular requirements of the use case (recursion, fairness, etc.).

For a general-purpose lock, pthread_mutex is the reasonable choice; it is already used for Android in Lock.native.kt.
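
For reference, a pthread_mutex-backed lock on Kotlin/Native Apple targets can be sketched roughly like this (a simplified illustration, not the actual Lock.native.kt code):

```kotlin
import kotlinx.cinterop.ExperimentalForeignApi
import kotlinx.cinterop.alloc
import kotlinx.cinterop.free
import kotlinx.cinterop.nativeHeap
import kotlinx.cinterop.ptr
import platform.posix.*

// Simplified sketch of a pthread_mutex-backed lock. Because the kernel
// records the owner of a pthread_mutex, a blocked high-QoS thread can
// donate its QoS to the current owner.
@OptIn(ExperimentalForeignApi::class)
class PosixMutexLock {
    private val mutex = nativeHeap.alloc<pthread_mutex_t>().also {
        pthread_mutex_init(it.ptr, null)
    }

    fun lock() { pthread_mutex_lock(mutex.ptr) }

    fun tryLock(): Boolean = pthread_mutex_trylock(mutex.ptr) == 0

    fun unlock() { pthread_mutex_unlock(mutex.ptr) }

    fun destroy() {
        pthread_mutex_destroy(mutex.ptr)
        nativeHeap.free(mutex)
    }
}
```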

Unfortunately, it doesn't seem feasible to address this in the existing "general" native actual of SynchronizedObject.kt, but it should be feasible to add a macOS / iOS compatible version that respects the actual declaration.

A related concern is that similar locks may exist in the coroutines infrastructure implementation.

Alternatives Considered

Option 1: Use the existing implementation, but skip THIN

This would essentially allocate the native lock only on demand, addressing the "use the lock pool" request on the pull request.
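
A rough sketch of what this could look like, assuming a kernel-visible mutex wrapper like the hypothetical PosixMutexLock sketched above (not an atomicfu API): the first contender installs a real mutex via CAS, and every lock/unlock then goes straight to the kernel-visible lock, so there is no user-space-only ownership state.

```kotlin
import kotlinx.atomicfu.atomic

// Rough sketch of Option 1: skip the user-space THIN state entirely and
// lazily install a kernel-visible mutex on first use. PosixMutexLock is
// the hypothetical pthread_mutex wrapper sketched earlier, not an
// atomicfu API.
class LazyFatLockSketch {
    private val delegate = atomic<PosixMutexLock?>(null)

    private fun mutex(): PosixMutexLock {
        delegate.value?.let { return it }
        val fresh = PosixMutexLock()
        // Only one thread wins the CAS; losers release their instance.
        // Real code would draw from / return to a pool instead.
        return if (delegate.compareAndSet(null, fresh)) {
            fresh
        } else {
            fresh.destroy()
            delegate.value!!
        }
    }

    fun lock() = mutex().lock()
    fun unlock() = mutex().unlock()
}
```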

Risks / Problems

  • We'd need to prove that the while loops in the implementation don't effectively constitute an unbounded spin lock.

Option 2: Add QoS Donation to the existing implementation

At the point where the lock is upgraded from THIN to FAT, we know the thread holding the THIN lock and could donate QoS to it.
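
A sketch of what such donation could look like at the upgrade point, assuming the THIN state records the owner's pthread_t, and assuming the Darwin functions pthread_override_qos_class_start_np / pthread_override_qos_class_end_np from <pthread/qos.h> are reachable from Kotlin (they may need a custom cinterop definition, hypothetically named pthread_qos below, if the default platform libraries don't expose them):

```kotlin
import kotlinx.cinterop.ExperimentalForeignApi
import platform.posix.pthread_t
// Hypothetical cinterop binding for <pthread/qos.h>: these are real Darwin
// APIs, but they may not be exposed by the default Kotlin/Native platform libs.
import pthread_qos.pthread_override_qos_class_end_np
import pthread_qos.pthread_override_qos_class_start_np
import pthread_qos.qos_class_self

// Boost the THIN-lock owner to the waiting thread's own QoS class while the
// waiter blocks on the FAT lock, then remove the override again.
@OptIn(ExperimentalForeignApi::class)
fun <T> withDonatedQos(thinLockOwner: pthread_t, waitForLock: () -> T): T {
    // relative_priority = 0: donate the waiter's QoS class unchanged.
    val token = pthread_override_qos_class_start_np(thinLockOwner, qos_class_self(), 0)
    try {
        return waitForLock()
    } finally {
        pthread_override_qos_class_end_np(token)
    }
}
```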

Risks / Problems

  • We'd need to prove that the while loops in the implementation don't effectively constitute an unbounded spin lock.
  • This seems much more complicated than the existing pull request – we can reconsider it if we run into actual real-world problems.

Background

iOS resolves priority inversions between threads by temporarily promoting the QoS of the lower priority thread to the same level as the higher priority thread being blocked. In order for this promotion to occur, you must use one of the Apple-supplied lock mechanisms that support QoS donation.

Not all locks available in Apple APIs support QoS. For example, spinlocks, semaphores, and condition locks (including NSCondition) do not. Apple has aggressively deprecated their spinlock APIs.

Unfortunately, while the QoS compatibility of locks has become "common knowledge" among iOS developers, there is no authoritative Apple documentation on it.

Some public discussions of locks and QoS:

The takeaway of these discussions is that it is possible for a thread or serial queue (these are not exactly the same thing) to block for too long, or even indefinitely, on a lock that does not perform QoS promotion. The most problematic case is one where a low-priority thread holds the lock but is never scheduled, because there are always higher-QoS threads ready to run – some of them busy-waiting on that very lock. Spinlocking is apparently particularly prone to triggering this case.

While deadlocks are probably rare in practice, we have evidence from Apple and other sources that some of our apps exhibit priority inversions when using inappropriate locks.

For KMP to interoperate smoothly with native iOS code executing at multiple QoS levels, it will need to use locking mechanisms compatible with QoS promotion.

stefanhaustein (Author) commented:

Update:

I have updated the implementation to be cheap when there is no contention (by switching to QoS donation) and also changed it to use a pool for native objects: https://gist.github.com/stefanhaustein/883b07e0e6c0ef17eabede8ba85e220a
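
To illustrate what a pool of native lock objects means here (this is not the code from the gist): a small fixed set of pre-allocated pthread mutexes shared by all lock instances, selected by hashing the lock object, so that individual lock objects don't each carry a kernel object.

```kotlin
import kotlinx.cinterop.ExperimentalForeignApi
import kotlinx.cinterop.alloc
import kotlinx.cinterop.nativeHeap
import kotlinx.cinterop.ptr
import platform.posix.pthread_mutex_init
import platform.posix.pthread_mutex_lock
import platform.posix.pthread_mutex_t
import platform.posix.pthread_mutex_unlock

// Illustration of a fixed pool of kernel-visible mutexes (not the actual
// code from the gist): lock objects share a small set of pre-allocated
// pthread mutexes instead of each allocating their own.
@OptIn(ExperimentalForeignApi::class)
object MutexPoolSketch {
    private const val POOL_SIZE = 64
    private val pool = Array(POOL_SIZE) {
        nativeHeap.alloc<pthread_mutex_t>().also { pthread_mutex_init(it.ptr, null) }
    }

    // Real code would use an identity hash; hashCode() keeps the sketch simple.
    private fun slotFor(owner: Any) = pool[(owner.hashCode() and 0x7FFFFFFF) % POOL_SIZE]

    fun <T> withLock(owner: Any, block: () -> T): T {
        val mutex = slotFor(owner)
        pthread_mutex_lock(mutex.ptr)
        try {
            return block()
        } finally {
            pthread_mutex_unlock(mutex.ptr)
        }
    }
}
```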

This work was done in the context of compose-multiplatform-core, in order to be able to run its benchmarks. There have also been some build setup changes in the meantime that broke the old PR.

So the next (hopefully quick) steps will be

  • Figure out how to integrate this into the build again
  • Port the small adjustments from compose-multiplatform-core back to atomicfu
  • Open a new PR
