Fix race conditions in CurrentValueRelay #3447
Hey @kabiroberai :) awesome work! Thanks for fixing this.
appreciate the vote of confidence @iampatbrown! re locking As for the |
Thanks!
We're still seeing crashes due to what look like data races in `DemandBuffer`. Here, I'm making fixes based on what appears to have helped in pointfreeco#3447:

In `buffer(value:)`:
* Check if the demand is unlimited and, if so, unlock before calling `subscriber.receive(value)`.
* If the demand is not unlimited, append the value to the buffer and call `flush()` after unlocking.

In `flush(adding:)`:
* Process values in a loop, acquiring and releasing the lock around shared state access.
* Unlock before calling `subscriber.receive(value)` to avoid holding the lock during subscriber calls.
* After sending a value, lock again to update `demandState.requested` with any additional demand returned by the subscriber.
* Ensure that `subscriber.receive(completion:)` is called outside the lock.

Also add a new `DemandBufferTests` test file. I added some tests that I was hoping would fail without the above changes, but they all pass both before and after, even with ThreadSanitizer enabled.
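For readers who want to see the shape of the locking pattern described above, here is a minimal sketch. It is not the library's `DemandBuffer`: the type `SketchDemandBuffer`, its `request(_:)` and `requestUnlimited()` methods, and the plain `Int` demand are all hypothetical simplifications, and a closure stands in for `subscriber.receive(_:)`. Completion handling is left out, and the sketch makes no attempt to preserve ordering when several threads flush at once.

```swift
import Foundation

/// Hypothetical, simplified stand-in for a demand buffer (not the library's code).
final class SketchDemandBuffer<Value> {
    private let lock = NSLock()
    private var values: [Value] = []   // buffered values awaiting demand
    private var requested = 0          // outstanding finite demand
    private var isUnlimited = false    // true once unlimited demand was requested
    /// Stand-in for `subscriber.receive(_:)`; returns any additional demand.
    private let receive: (Value) -> Int

    init(receive: @escaping (Value) -> Int) {
        self.receive = receive
    }

    func buffer(value: Value) {
        lock.lock()
        if isUnlimited {
            // Unlimited demand: release the lock before calling out.
            lock.unlock()
            _ = receive(value)
            return
        }
        values.append(value)
        lock.unlock()
        // Flush only after unlocking, so the subscriber is never called under the lock.
        flush(adding: 0)
    }

    func request(_ demand: Int) {
        flush(adding: demand)
    }

    func requestUnlimited() {
        lock.lock()
        isUnlimited = true
        lock.unlock()
        flush(adding: 0)
    }

    func flush(adding newDemand: Int) {
        lock.lock()
        requested += newDemand
        lock.unlock()

        while true {
            // Touch shared state only while holding the lock.
            lock.lock()
            guard isUnlimited || requested > 0, !values.isEmpty else {
                lock.unlock()
                return
            }
            let value = values.removeFirst()
            if !isUnlimited { requested -= 1 }
            lock.unlock()

            // The subscriber call happens outside the lock.
            let additional = receive(value)

            // Re-lock to record any extra demand the subscriber returned.
            lock.lock()
            requested += additional
            lock.unlock()
        }
    }
}
```

Because the lock is never held across the `receive` call, a subscriber that re-enters the buffer from inside its callback does not need the lock to be recursive in this sketch.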
As it's currently written, `CurrentValueRelay` seems to have several race conditions. These are infrequent enough to make them rather elusive in typical usage, but we believe that these are the root cause behind 0.X% of Ramp iOS sessions crashing.

This PR includes a test case that fails on `main`, as well as a fix that solves it, but is slow. It's probably worth you folks writing a fix yourself (without needing the lock to be recursive) to resolve the issue, but I hope that the test case helps. FWIW you should also be able to trip up TSAN when running the failing test case pre-fix; it catches a number of the ways in which the race can be triggered.