Make sure opal_start_thread always spawns pthreads #9326
The IBM CI (GNU/Scale) build failed! Please review the log, linked below. Gist: https://gist.github.com/e86568ad1ec3570fb3ffbfcff19c0b75
The IBM CI (XL) build failed! Please review the log, linked below. Gist: https://gist.github.com/5a6d67f030faae3c74bef8b0f00836cf
Users of `opal_start_thread` (btl/tcp, ft, smcuda, progress thread) may spawn threads that block in functions unaware of argobots or qthreads (e.g., libevent or read(3)). If we spawn an argobot or qthread instead of a pthread, the thread executing the argobot or qthread (potentially the main thread) blocks, leading to a deadlock. Open MPI expects the semantics of a pthread, so we should handle all internal threads as such. Signed-off-by: Joseph Schuchart <[email protected]>
286095f to e3ca132
The IBM CI (PGI) build failed! Please review the log, linked below. Gist: https://gist.github.com/88394f49f995536b02d9c90d2997c616
With this change, all OMPI internal threads will be pthreads, but all the synchronization variables (mutexes, condition variables, and wait_sync) will use whatever thread package has been selected. We tried to understand how the synchronization would then work, and at least in the case of argobots things seem to end up in a futex (so it might work as expected). However, this is a major change, and before it gets pulled into OMPI we need to understand its impact on correctness and performance.
The way I see it, there are correctness and potential performance issues at play here:
If we want to keep spawning ULTs instead of pthreads, we have to distinguish between the pthread and ULT backends and prevent the ULTs from blocking, making them poll for whatever they are waiting on instead. I'm not sure whether a blocking pthread is worse than a progress ULT that polls whenever it is scheduled, even if there is nothing to progress. And the overhead of distinguishing between pthreads and ULTs in the upper layers seems prohibitive.
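To make the polling alternative concrete, here is a minimal sketch in plain POSIX C, not Open MPI code. The names `try_progress` and `polling_demo` are hypothetical, and `sched_yield()` merely stands in for whatever yield call a ULT package would provide. A non-blocking read returns immediately when nothing is available, so each scheduling slot spent on it is wasted work:

```c
#include <errno.h>
#include <fcntl.h>
#include <sched.h>
#include <unistd.h>

/* Hypothetical helper: one non-blocking progress attempt.
 * Returns 1 if a byte was read, 0 if nothing was available, -1 on error. */
static int try_progress(int fd, char *out)
{
    ssize_t n = read(fd, out, 1);
    if (n == 1) {
        return 1;
    }
    if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK)) {
        return 0; /* would have blocked: nothing to progress */
    }
    return -1;
}

/* Hypothetical demo: polls an empty pipe three times, then once after data
 * arrives. Returns the number of wasted polls, or -1 on failure. */
int polling_demo(void)
{
    int fds[2];
    char c = 0;
    int wasted = 0;

    if (pipe(fds) != 0) {
        return -1;
    }
    int flags = fcntl(fds[0], F_GETFL, 0);
    if (flags < 0 || fcntl(fds[0], F_SETFL, flags | O_NONBLOCK) != 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    for (int i = 0; i < 3; ++i) {
        if (try_progress(fds[0], &c) == 0) {
            ++wasted;      /* scheduled, but nothing to do */
        }
        sched_yield();     /* stand-in for a ULT yield */
    }

    int ok = (write(fds[1], "x", 1) == 1) && (try_progress(fds[0], &c) == 1);
    close(fds[0]);
    close(fds[1]);
    return (ok && c == 'x') ? wasted : -1;
}
```

Every caller that might run as a ULT would need a loop of this shape in place of its blocking call, which is the upper-layer overhead referred to above.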
Users of `opal_start_thread` (btl/tcp, ft, smcuda, progress thread) may spawn threads that may block in functions unaware of argobots or qthreads (e.g., libevent or read(3)). If we spawn an argobot or qthread instead of a pthread, the thread executing the argobot or qthread (potentially the main thread) blocks, leading to a deadlock situation. Open MPI expects the semantics of a pthread, so we should handle all internal threads as such.
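The pthread semantics in question can be illustrated with a small sketch in plain POSIX C. It does not use any Open MPI API, and the names `blocking_worker` and `blocking_pthread_demo` are hypothetical. The worker blocks in `read()` on an empty pipe, yet the calling thread keeps running and is the one that eventually unblocks it. If the worker were a ULT scheduled on the caller's OS thread, the `write()` would never be reached:

```c
#include <pthread.h>
#include <unistd.h>

/* Hypothetical argument block for the worker thread. */
struct worker_arg {
    int read_fd;
    char received;
};

/* Blocks in the kernel until the other end of the pipe is written to. */
static void *blocking_worker(void *p)
{
    struct worker_arg *arg = p;
    if (read(arg->read_fd, &arg->received, 1) != 1) {
        arg->received = 0;
    }
    return NULL;
}

/* Returns 0 if the worker was unblocked by the calling thread, -1 otherwise. */
int blocking_pthread_demo(void)
{
    int fds[2];
    struct worker_arg arg = { 0, 0 };
    pthread_t tid;

    if (pipe(fds) != 0) {
        return -1;
    }
    arg.read_fd = fds[0];
    if (pthread_create(&tid, NULL, blocking_worker, &arg) != 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    /* The caller is not stalled by the worker's blocking read(), because the
     * worker is a separate OS thread. */
    int rc = (write(fds[1], "x", 1) == 1) ? 0 : -1;

    pthread_join(tid, NULL);
    close(fds[0]);
    close(fds[1]);
    return (rc == 0 && arg.received == 'x') ? 0 : -1;
}
```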
This PR moves `opal_start_thread` and thread identification functions into threads/base and makes them operate on pthreads. As a side effect, even if the main thread executes an argobot or qthread, it will still be considered the main thread, i.e., `MPI_Is_thread_main` will return `flag = 1` (which is a change in behavior from the current implementation). The argobots and qthreads integration will still be used to switch between ULTs to avoid blocking.
This should be backported to 5.x once merged.
Signed-off-by: Joseph Schuchart [email protected]
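As an illustration of pthread-based thread identification, the following sketch records the main thread's `pthread_t` once and compares against it later. This is an assumption about the general technique, not the code in this PR. The names `record_main_thread`, `is_main_thread`, and `worker_is_main` are hypothetical. Because the comparison is made on the OS thread, any ULT running on the main thread would still be identified as the main thread:

```c
#include <pthread.h>

/* Hypothetical global: identity of the thread that initialized the library. */
static pthread_t main_thread_id;

/* Call once from the main thread, e.g. during initialization. */
void record_main_thread(void)
{
    main_thread_id = pthread_self();
}

/* Returns 1 if the calling OS thread is the recorded main thread, else 0. */
int is_main_thread(void)
{
    return pthread_equal(main_thread_id, pthread_self()) != 0;
}

static void *identify_worker(void *out)
{
    *(int *)out = is_main_thread();
    return NULL;
}

/* Runs is_main_thread() on a freshly spawned pthread and returns its result,
 * or -1 if the thread could not be created. */
int worker_is_main(void)
{
    pthread_t tid;
    int flag = -1;

    if (pthread_create(&tid, NULL, identify_worker, &flag) != 0) {
        return -1;
    }
    pthread_join(tid, NULL);
    return flag;
}
```

After `record_main_thread()` is called from the main thread, `is_main_thread()` returns 1 there, while `worker_is_main()` returns 0.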