Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add new multithreaded concurrency configuration #5015

Merged
merged 627 commits into from
Dec 19, 2024

Conversation

qkoziol
Copy link
Contributor

@qkoziol qkoziol commented Oct 26, 2024

Added infrastructure support for multithreaded concurrency by adding an optional way to switch to using a non-recursive R/W lock for the global API lock. This is enabled with a new 'concurrency' configuration flag for the autotools & CMake builds, which is disabled by default.

When the 'concurrency' build option is chosen, the global API lock will use the R/W lock and all API calls currently will acquire a write lock, ensuring exclusive access by one thread. Over time, the API routines that are converted to support multithreaded concurrency will switch to acquiring a read lock instead.

Reentering the library from application callbacks is managed by the 'disable locking for this thread' (DLFTT) threadsafety protocol. This is internally handled within the H5_API_LOCK / H5_API_UNLOCK macros in H5private.h (as before), which invoke the 'dlftt' routines in H5TSint.c.

To support this change, the threadsafety configuration macros for the library have been updated:
- --enable-threadsafe now defines the H5_HAVE_THREADSAFE macro
- --enable-concurrency defines the H5_HAVE_CONCURRENCY macro
The new H5_HAVE_THREADSAFE_API macro is set if either H5_HAVE_THREADSAFE or H5_HAVE_CONCURRENCY is enabled.

New Github actions are added to include the concurrency configuration in the CI for the develop branch.

To support the new non-recursive R/W locking for API routines, some other changes are necessary:

  • Added macro wrappers around all callback invocations that could call an
    application function, and therefore re-enter the library:
    H5_BEFORE_USER_CB* / H5_AFTER_USER_CB*

  • Added H5_user_cb_prepare / H5_user_cb_restore routines that save the
    state of the library when callback leaves the library. Includes error
    stack and threadsafe reentry state currently.

There's also some small cleanups to various places in the library:

  • Moved the H5E_mpi_error_str / H5E_mpi_error_str_len globals to be local for
    pushing MPI errors, so that multiple threads can't interfere with each
    other.

  • Added H5TS_rwlock_trywrlock() routine to R/W lock interface.

  • Emulate R/W locks on MacOS because its implementation of
    pthread_rwlock_wrlock() does not conform to the POSIX standard.

  • Don't acquire the global API lock in H5close, since it's acquired in H5_term_library, which is necessary because H5_term_library is invoked via other code paths that don't hold the global API lock.

  • Don't call H5Eget_auto2 API routine within H5_term_library.

  • Switched 'return NULL' in H5allocate_memory to HGOTO_DONE(NULL).

  • Switched H5Pget_file_space_strategy / H5Pset_file_space_strategy to use
    internal routines instead of API routines.

  • Switched H5Oopen_by_addr & H5Ovisit1 to use internal routines instead of
    API routines.

  • Fixed a few places in src/H5Odeprec.c where a major
    error ID was passed as a minor ID.

qkoziol and others added 30 commits May 31, 2024 14:45
Signed-off-by: Quincey Koziol <[email protected]>
Signed-off-by: Quincey Koziol <[email protected]>
Signed-off-by: Quincey Koziol <[email protected]>
Signed-off-by: Quincey Koziol <[email protected]>
Signed-off-by: Quincey Koziol <[email protected]>
Signed-off-by: Quincey Koziol <[email protected]>
Signed-off-by: Quincey Koziol <[email protected]>
Signed-off-by: Quincey Koziol <[email protected]>
Signed-off-by: Quincey Koziol <[email protected]>
Signed-off-by: Quincey Koziol <[email protected]>
Signed-off-by: Quincey Koziol <[email protected]>
Signed-off-by: Quincey Koziol <[email protected]>
Signed-off-by: Quincey Koziol <[email protected]>
Signed-off-by: Quincey Koziol <[email protected]>
And a few missing VOL connector callbacks

Signed-off-by: Quincey Koziol <[email protected]>
Signed-off-by: Quincey Koziol <[email protected]>
mattjala
mattjala previously approved these changes Dec 16, 2024
Signed-off-by: Quincey Koziol <[email protected]>
mattjala
mattjala previously approved these changes Dec 17, 2024
src/H5FD.c Outdated Show resolved Hide resolved
@fortnern
Copy link
Member

Almost done, I'll finish reviewing tomorrow

@qkoziol
Copy link
Contributor Author

qkoziol commented Dec 17, 2024

Almost done, I'll finish reviewing tomorrow

That would be awesome - I'm totally ready to get this branch merged, after ~7 months of maintaining it ;-)

@byrnHDF
Copy link
Contributor

byrnHDF commented Dec 18, 2024

Just if we want to test them per PR.
If you want CDash reports - then you need to add to cmake-script.yml.
cmake-ctest.yml and release-files.yml are for creating binaries for snapshots.

@qkoziol
Copy link
Contributor Author

qkoziol commented Dec 19, 2024

Just if we want to test them per PR. If you want CDash reports - then you need to add to cmake-script.yml. cmake-ctest.yml and release-files.yml are for creating binaries for snapshots.

OK, thanks for the info. I think I'll leave them alone for the concurrency builds, for now.

@derobins derobins merged commit 331193f into HDFGroup:develop Dec 19, 2024
@qkoziol qkoziol deleted the locking_protocol branch December 20, 2024 19:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Component - Build CMake, Autotools Component - C Library Core C library issues (usually in the src directory) Component - Testing Code in test or testpar directories, GitHub workflows Type - Bug / Bugfix Please report security issues to [email protected] instead of creating an issue on GitHub Type - Improvement Improvements that don't add a new feature or functionality Type - New Feature Add a new API call, functionality, or tool
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants