v4.0
Changes in 4.0
-
All MPI-4 APIs have been implemented. Major MPI-4 features include MPI
sessions, partitioned point-to-point communications, events in the MPI tool
information interface, large-count functions, persistent collectives,
MPI_Comm_idup_with_info, MPI_Isendrecv and MPI_Isendrecv_replace,
MPI_Info_get_string, and MPI_Comm_split_type with the new split types
MPI_COMM_TYPE_HW_GUIDED and MPI_COMM_TYPE_HW_UNGUIDED.
-
Add QMPI (experimental) support.
-
Add MPIX_Delete_error_{class,code,string}.
-
MPI_Info objects can be accessed before MPI_Init{_thread}.
-
Generate C API interface functions including man page notes and error
checking using Python scripts.
-
Generate Fortran (mpif.h, mpi_f08) bindings using Python scripts.
-
Generate collective entrance functions and generate per-algorithm tests.
-
Support explicit --without-cuda configure option.
-
Drop support for UCX version < 1.7.0.
-
Configure now conditionally requires Python 3 (when F08 is enabled).
-
Multi-NIC support in ch4:ofi.
-
Default to ch4:ofi when configure doesn't have a clear choice. Add a message
block at the end of configure to advise the user.
-
Multiple VCI support is fully implemented, including the active message
fallback paths.
-
Extend IPC to support non-contiguous datatypes.
-
Add AMD GPU support using HIP.
-
Add generic RNDV callback mechanism with active messages.
-
Refactor ch4 dynamic process functions.
-
Avoid building MPL and hwloc multiple times.
-
Fix MPIX_Query_cuda_support.
-
Many bug fixes and code clean-ups.