-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Allow Ffi calls to be marked as potentially blocking / exiting the isolate. #51261
Comments
The underlying issue is that new space (which we require for bump-allocation) doesn't scale with number of threads. The fact that this limitation would surface to the FFI API does seem a bit iffy. We could device a scheme where FFI calls will give up their TLAB on transitions to C and re-acquire on the way back and limit the number of outstanding TLABs instead of number of active isolates. Though that would make transitions more heavyweight, would make returning from C (as well as Dart C API calls) possibly blocking for arbitrary amount of time. Seems less than ideal. /cc @rmacnak-google |
I think this can be for free for the uncontented case: What we could do is when a new mutator wants to enter the isolate and the limit has been reached, we can check if any existing mutators are in an ffi-exited safepoint state, CAS its safepoint state to one meaning it has been kicked out, causing the safepoint transition on the ffi-return to hit the slow path, and take its TLAB away. The safepoint transition slow path then has a new check if it needs to wait on the mutator count to re-enter as a mutator. |
It would still be compatible with Dart's semantics of synchronous code on an isolate running to completion before any other code is run on that isolate. However, it would change the scheduling which isolate runs when we have exhausted the max number of mutators in an isolate group, that might be surprising. Do we have some kind of scheduling logic for that? @mkustermann |
That's an interesting idea. I'm a little worried that doing this blindly can lead to situations where e.g. Flutter UI isolate does a FFI call, then another thread kicking the UI isolate out. When the FFI call on UI isolate returns it will take the slow path and block (which could freeze flutter UI). This can also happen to some extent today as well - but only at event loop boundary (e.g. Flutter UI isolate is idle, N threads enter isolate group and then flutter UI isolate cannot enter anymore but has to wait). If one mutator has been kicked out and returns from ffi call then in the slow path it should be allowed to kick out another thread if it's in a ffi call. That would mean the system would work flawlessly irrespective of number of threads - as long as there are not more than N threads executing Dart code concurrently (which may be an ok restriction as all dart code being executed will either go back to event loop or do ffi call eventually which are yield poitns). Though it will require some synchronization on both sides:
|
Hi. I have a complaint about this. If we're going to expose to FFI developers to those kinds of Isolate details, why we can't have some way for native code to at least return (synchronously) Dart objects that can be created through the Dart_CObject struct? |
For applications that want to have arbitrary number of isolates call into native code that may be blocking, we expose the API functions that allows those native threads to exit an isolate before running long/blocking code. Without the ability to exit/re-enter isolate, one may experience deadlocks as we have a fixed limit on the number of concurrently executing isolates atm. In the longer term we may find a way to do this automatically with low overhead, see [0]. But since those API functions are quite stable and we already expose e.g. `Dart_{Enter,Exit}Scope`, I don't see a reason not to expose `Dart_{Enter,Exit}Isolate`. [0] Issue #51261 Issue #51254 TEST=ffi{,_2}/dl_api_exit_enter_isolate_test Change-Id: I91c772ca962fddb87919663fea07939a498fa205 Reviewed-on: https://dart-review.googlesource.com/c/sdk/+/292722 Commit-Queue: Martin Kustermann <[email protected]> Reviewed-by: Daco Harkes <[email protected]> Reviewed-by: Ryan Macnak <[email protected]>
This reverts commit a251281. Reason for revert: FFI tests fail to link on Windows, fail to load on product-mode Android Original change's description: > [vm] Expose Dart_{CurrentIsolate,ExitIsolate,EnterIsolate} > > For applications that want to have arbitrary number of isolates call > into native code that may be blocking, we expose the API functions that > allows those native threads to exit an isolate before running > long/blocking code. > > Without the ability to exit/re-enter isolate, one may experience > deadlocks as we have a fixed limit on the number of concurrently > executing isolates atm. > > In the longer term we may find a way to do this automatically > with low overhead, see [0]. But since those API functions are quite > stable and we already expose e.g. `Dart_{Enter,Exit}Scope`, I don't > see a reason not to expose `Dart_{Enter,Exit}Isolate`. > > [0] Issue #51261 > > Issue #51254 > > TEST=ffi{,_2}/dl_api_exit_enter_isolate_test > > Change-Id: I91c772ca962fddb87919663fea07939a498fa205 > Reviewed-on: https://dart-review.googlesource.com/c/sdk/+/292722 > Commit-Queue: Martin Kustermann <[email protected]> > Reviewed-by: Daco Harkes <[email protected]> > Reviewed-by: Ryan Macnak <[email protected]> Change-Id: I05ad5b9ce24754a68693160e470f8eb71a812c75 No-Presubmit: true No-Tree-Checks: true No-Try: true Reviewed-on: https://dart-review.googlesource.com/c/sdk/+/294860 Auto-Submit: Ryan Macnak <[email protected]> Commit-Queue: Rubber Stamper <[email protected]> Bot-Commit: Rubber Stamper <[email protected]>
For applications that want to have arbitrary number of isolates call into native code that may be blocking, we expose the API functions that allows those native threads to exit an isolate before running long/blocking code. Without the ability to exit/re-enter isolate, one may experience deadlocks as we have a fixed limit on the number of concurrently executing isolates atm. In the longer term we may find a way to do this automatically with low overhead, see [0]. But since those API functions are quite stable and we already expose e.g. `Dart_{Enter,Exit}Scope`, I don't see a reason not to expose `Dart_{Enter,Exit}Isolate`. Difference to original CL: Do use STL synchronization primitives (as the ones in runtime/bin are not always available in shared libraries) [0] Issue #51261 Issue #51254 TEST=ffi{,_2}/dl_api_exit_enter_isolate_test Change-Id: Id817e8d4edb3db35f029248d62388cbd0682001d Reviewed-on: https://dart-review.googlesource.com/c/sdk/+/294980 Reviewed-by: Daco Harkes <[email protected]>
Some users are running into an issue where many isolates are calling out to C code that will then block. This can cause the dart app to no longer work due to our limitation on maximum number of threads that can be active in an isolate group at a given point in time.
The limitation is there to avoid too many threads executing Dart code at same time. This can lead to situations where X threads all have TLABs which may contain unallocated memory, but the X+1 thread tries to obtain TLAB and fails, which will cause it to trigger GC (despite other thread's TLAB still having unallocated memory)
=> Allowing unbounded number of threads to enter an isolate group can lead to excessive triggering of GCs (despite free memory in other thread's TLAB)
See runtime/vm/heap/scavenger.h for the current calculation of the limit:
We may consider adding a boolean flag to specify that a FFI call may be blocking / should exit the isolate.
to automatically exit and re-enter the isolate to avoid custom C code like this:
See motivating use case: #51254
The text was updated successfully, but these errors were encountered: