Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CMake: Find MPI in HDF5 config #2400

Merged
merged 1 commit into from
Jan 18, 2023

Conversation

kwryankrattiger
Copy link
Contributor

Downstream packages depending on HDF5 that don't perform find_package(MPI) before find_package(hdf5) fail to configure. A possible fix to this is to move the MPI link dependency to a private scope.

@lrknox Here is a patch that fixes the MPI linking issue observed in the spack CI. I tested this as a patch in spack and it fixes the issue for SZ and likely others.

@jhendersonHDF
Copy link
Collaborator

Hi @kwryankrattiger,

I'm wondering, with API routines such as http://portal.hdfgroup.org/display/HDF5/H5P_SET_FAPL_MPIO, won't HDF5 need to keep MPI in the public link interface when built as a parallel-enabled library so that an application wishing to use that API will see MPI as a dependency?

@kwryankrattiger
Copy link
Contributor Author

I'm wondering, with API routines such as...

I was not aware of those, I thought HDF5 had completely abstracted parallel components in its interface. Since that is the case, then all of the previous versions since 1.4 have had a minor configuration bug!

The correct solution then is likely to add a call to find_pacakge(MPI QUIET REQUIRED COMPONENTS C CXX Fortran) to the hdf5-config.cmake.in file so that MPI is found correctly even if the dependent project doesn't directly use MPI but uses an MPI enabled HDF5.

I pushed another fix for this that I tested with building sz.

@byrnHDF
Copy link
Contributor

byrnHDF commented Jan 17, 2023

The correct solution then is likely to add a call to find_pacakge(MPI QUIET REQUIRED COMPONENTS C CXX Fortran) to the hdf5-config.cmake.in file so that MPI is found correctly even if the dependent project doesn't directly use MPI but uses an MPI enabled HDF5.

My plan too!

Copy link
Contributor

@byrnHDF byrnHDF left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just what I was looking at!
Thanks for verification.

@byrnHDF
Copy link
Contributor

byrnHDF commented Jan 17, 2023

We will need this same change in develop first.

@kwryankrattiger kwryankrattiger changed the base branch from hdf5_1_14 to develop January 17, 2023 20:03
@kwryankrattiger
Copy link
Contributor Author

@byrnHDF rebased on develop and changed merge base. I already have a patch for 1.14.0 for spack, I will attach it there so @lrknox can get that merged.

@kwryankrattiger kwryankrattiger changed the title CMake: Remove MPI from the public link interface CMake: Find MPI in HDF5 config Jan 17, 2023
@hmaarrfk
Copy link

hmaarrfk commented Jan 18, 2023

(unsolicited opinion) In my mind, it should still be linked privately.

That way, when the users "link" they will be pointed to the appropriate MPI library, instead of the HDF5 implementation.

@kwryankrattiger
Copy link
Contributor Author

@hmaarrfk Since HDF5 is exposing MPI symbols in the interface, publicly linking the MPI_C target is probably the correct thing to do. Since this patch is adding a find_package(MPI) to the hdf5-config.cmake it will use the MPI available/specified for the application not the one that was necessarily used for building HDF5. CMake is only looking for a target MPI::MPI_C to be defined, publicly linking MPI::MPI_C does not hard code paths to the libhdf5 target.

It is possible to spoof an MPI::MPI_C target and achieve the same result, however that may not behave well with FindMPI.

@lrknox
Copy link
Collaborator

lrknox commented Jan 18, 2023

Fix for #2404.

@lrknox lrknox merged commit 509fe96 into HDFGroup:develop Jan 18, 2023
lrknox pushed a commit to lrknox/hdf5 that referenced this pull request Jan 18, 2023
lrknox pushed a commit to lrknox/hdf5 that referenced this pull request Jan 18, 2023
lrknox pushed a commit to lrknox/hdf5 that referenced this pull request Jan 18, 2023
lrknox pushed a commit to lrknox/hdf5 that referenced this pull request Jan 18, 2023
@hmaarrfk
Copy link

Ok, I think i lost track of the original suggestion.

Looking through the code, I can only find:

target_link_libraries(PRIVATE ... MPI::MPI_C)

which is generally what I've found works when integrating libraries together.

Thank you for your time explaining things

Best,

@jwsblokland
Copy link
Contributor

Interesting discussion. I did encounter the same problem when building the H5Z-ZFP filter using HDF5 1.14.0. This has been fixed by using find_package(MPI) in the H5Z-ZFP filter package itself. However, this pull request is a much better solution.

Now I have one small question to the people of HDF group like @byrnHDF and @lrknox which HDF5 versions contains this CMake configuration mistake? Is it only 1.14.0? Personally, I would like to specialize the solution in H5Z-ZFP filter package such we make us of this solution when it has been released.

lrknox added a commit that referenced this pull request Jan 19, 2023
@lrknox
Copy link
Collaborator

lrknox commented Jan 19, 2023

@jwsblokland, HDF5 1.14.0 was the first HDF5 release that included an MPI::MPI_C dependency when PARALLEL was enabled, declared PUBLIC in src/CMakeLists.txt. Previous releases only had PRIVATE declarations of the MPI::MPI_C dependency. Pull requests to include the code in this pull request have been submitted for all active hdf5 development branches; all future releases will include this fix.

@hmaarrfk
Copy link

I didn't see the PUBLIC linkage in src/CMakeLists.txt I'll write a note to myself to remind myself to try to demonstrate the issue a little more tightly.

But, generally, the C compiler, and dynamic linker, can get confused in terms of which binding to link to at run time. This can lead to surprising results.

Generally, you always want to link privately when using shared libraries. Users will have to provide their own library at link to integrate with yours.

I think adding the

FindPackage(MPI)

is good too. But you still want to point to the linker back to the original MPI library, not HDF5.

lrknox added a commit that referenced this pull request Jan 20, 2023
lrknox added a commit that referenced this pull request Jan 20, 2023
@kwryankrattiger
Copy link
Contributor Author

@hmaarrfk Let me know if the following description answers your questions, I think maybe the terms public/private are crossing contexts so I am hoping this will make it more clear what is going on. 🙂

Generally, you always want to link privately when using shared libraries. Users will have to provide their own library at link to integrate with yours.

The context of private/public here is only in CMake and what gets added to be linked/included for dependent targets, it doesn't actually change how things are linked.

What the public linking is saying here is a target named MPI::MPI_C is required when linking hdf5, but it doesn't store any additional meta data on what is in the target MPI::MPI_C or where it "should" come from. It also doesn't change anything about how the link flags are specified, just that if there are libraries or include paths to in the MPI::MPI_C target add those to the dependent targets of hdf5.

But you still want to point to the linker back to the original MPI library, not HDF5.

And that is what happens. By putting MPI::MPI_C in that publicly visible link targets of HDF5, it is saying that dependent libraries of HDF5 should also link/include MPI, But it doesn't say from where. Adding find_package(MPI) is then going to answer the question of where is MPI and what is in the target MPI::MPI_C.

If for some reason a package wanted to ensure that it used exactly the same MPI as the HDF5 build, it could use define it's own MPI::MPI_C and manually set MPI_C_FOUND. However...I would really recommend against that since I can only imagine that would only lead to creating more problems.


An more concise example of what was causing the error this patch is addressing:

find_package(hdf5)
find_package(MPI)

add_executable(my_target main.c)
target_link_libraries(my_target PRIVATE hdf5::hdf5 MPI::MPI_C)

This was not working without the find_package(MPI) in the hdf5-config.cmake because the user is finding MPI after finding HDF5. Note, because hdf5::hdf5 is now linking MPI::MPI_C publicly, this can be equivalently expressed without adding the MPI::MPI_C. This is especially important in the case of a serial application.

In the case of a serial application trying to use a parallel enabled HDF5:

find_package(hdf5)

add_executable(my_target PRIVATE hdf5::hdf5)

This example would likely fail for older versions of HDF5 as well. The H5public.h and H5pubconf.h is generate at configure time, so H5_HAVE_PARALLEL will be 1 requiring mpi.h to be found in some include path. Since MPI was a privately visible target, my_target would not be able to find mpi.h unless it was installed in a default search path on the system or manually linked. This is likely a "bug" in older versions, but one I think most people have overcome simply by just building serial and parallel versions of HDF5 and using whichever one matches their application.

dynamic linker, can get confused in terms of which binding to link to at run time. This can lead to surprising results.

Yeah, this is one of those things that really requires users to understand what they are doing and there isn't really a good way to avoid it other than bundling and setting runpaths (like spack) or making sure the MPIs you use are ABI compatible. But this is a separate issue from the CMake target link visibility.

@hmaarrfk
Copy link

Truthfully, my skills at inspecting binaries are just not that good.

I've tried to remove the lines that include PUBLIC ... MPI:MPI_C and the output of bash build.sh && (nm build/bin/libhdf5.so | grep MPI_Comm_free) doesn't seem to change from

                 U MPI_Comm_free
                 U MPI_Comm_free_keyval
build.sh
#/usr/bin/env bash
set -ex

rm -rf build
mkdir build
cd build
PREFIX=${CONDA_PREFIX}
cmake \
    ${CMAKE_ARGS} \
    -DONLY_SHARED_LIBS=ON \
    -DBUILD_SHARED_LIBS=ON \
    -DCMAKE_INSTALL_PREFIX=${PREFIX} \
    -DHDF5_ENABLE_PARALLEL=ON \
    ..

make -j10

Maybe I am missing something but I tried the following patch to the develop branch:

diff --git a/config/cmake/hdf5-config.cmake.in b/config/cmake/hdf5-config.cmake.in
index 1a3fb7bbf2..35cee4f606 100644
--- a/config/cmake/hdf5-config.cmake.in
+++ b/config/cmake/hdf5-config.cmake.in
@@ -63,8 +63,6 @@ if (${HDF5_PACKAGE_NAME}_ENABLE_PARALLEL)
     set (${HDF5_PACKAGE_NAME}_MPI_Fortran_INCLUDE_PATH "@MPI_Fortran_INCLUDE_DIRS@")
     set (${HDF5_PACKAGE_NAME}_MPI_Fortran_LIBRARIES    "@MPI_Fortran_LIBRARIES@")
   endif ()
-
-  find_package(MPI QUIET REQUIRED)
 endif ()
 
 if (${HDF5_PACKAGE_NAME}_BUILD_JAVA)
diff --git a/src/CMakeLists.txt b/src/CMakeLists.txt
index 355881d72f..b010de458f 100644
--- a/src/CMakeLists.txt
+++ b/src/CMakeLists.txt
@@ -1245,7 +1245,7 @@ if (NOT ONLY_SHARED_LIBS)
   TARGET_C_PROPERTIES (${HDF5_LIB_TARGET} STATIC)
   target_link_libraries (${HDF5_LIB_TARGET}
       PRIVATE ${LINK_LIBS} ${LINK_COMP_LIBS}
-      PUBLIC $<$<NOT:$<PLATFORM_ID:Windows>>:${CMAKE_DL_LIBS}> "$<$<BOOL:${HDF5_ENABLE_PARALLEL}>:MPI::MPI_C>"
+              $<$<NOT:$<PLATFORM_ID:Windows>>:${CMAKE_DL_LIBS}> "$<$<BOOL:${HDF5_ENABLE_PARALLEL}>:MPI::MPI_C>"
   )
   if (NOT WIN32)
     target_link_libraries (${HDF5_LIB_TARGET}
@@ -1289,7 +1289,7 @@ if (BUILD_SHARED_LIBS)
   target_link_libraries (${HDF5_LIBSH_TARGET}
       PRIVATE ${LINK_LIBS} ${LINK_COMP_LIBS}
               $<$<OR:$<BOOL:${HDF5_ENABLE_THREADSAFE}>,$<BOOL:${HDF5_ENABLE_SUBFILING_VFD}>>:Threads::Threads>
-      PUBLIC $<$<NOT:$<PLATFORM_ID:Windows>>:${CMAKE_DL_LIBS}> "$<$<BOOL:${HDF5_ENABLE_PARALLEL}>:MPI::MPI_C>"
+              $<$<NOT:$<PLATFORM_ID:Windows>>:${CMAKE_DL_LIBS}> "$<$<BOOL:${HDF5_ENABLE_PARALLEL}>:MPI::MPI_C>"
   )
   set_global_variable (HDF5_LIBRARIES_TO_EXPORT "${HDF5_LIBRARIES_TO_EXPORT};${HDF5_LIBSH_TARGET}")
   H5_SET_LIB_OPTIONS (${HDF5_LIBSH_TARGET} ${HDF5_LIB_NAME} SHARED "LIB")

When I build your little project example, somehow, MPI is still linked :/
little_hdf5_project.zip

$ bash build.sh && ldd build/my_target 
+ rm -rf build
+ mkdir build
+ pushd build
~/git/htest/build ~/git/htest
+ PREFIX=/home/mark/mambaforge/envs/dev
+ cmake -DCMAKE_AR=/home/mark/mambaforge/envs/dev/bin/x86_64-conda-linux-gnu-ar -DCMAKE_CXX_COMPILER_AR=/home/mark/mambaforge/envs/dev/bin/x86_64-conda-linux-gnu-gcc-ar -DCMAKE_C_COMPILER_AR=/home/mark/mambaforge/envs/dev/bin/x86_64-conda-linux-gnu-gcc-ar -DCMAKE_RANLIB=/home/mark/mambaforge/envs/dev/bin/x86_64-conda-linux-gnu-ranlib -DCMAKE_CXX_COMPILER_RANLIB=/home/mark/mambaforge/envs/dev/bin/x86_64-conda-linux-gnu-gcc-ranlib -DCMAKE_C_COMPILER_RANLIB=/home/mark/mambaforge/envs/dev/bin/x86_64-conda-linux-gnu-gcc-ranlib -DCMAKE_LINKER=/home/mark/mambaforge/envs/dev/bin/x86_64-conda-linux-gnu-ld -DCMAKE_STRIP=/home/mark/mambaforge/envs/dev/bin/x86_64-conda-linux-gnu-strip -DCMAKE_INSTALL_PREFIX=/home/mark/mambaforge/envs/dev ..
-- The C compiler identification is GNU 11.3.0
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /home/mark/mambaforge/envs/dev/bin/x86_64-conda-linux-gnu-cc - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Found HDF5: hdf5-shared (found version "1.15.0")  
-- Configuring done
-- Generating done
CMake Warning:
  Manually-specified variables were not used by the project:

    CMAKE_CXX_COMPILER_AR
    CMAKE_CXX_COMPILER_RANLIB


-- Build files have been written to: /home/mark/git/htest/build
+ make V=1 VERBOSE=1
/home/mark/mambaforge/envs/dev/bin/cmake -S/home/mark/git/htest -B/home/mark/git/htest/build --check-build-system CMakeFiles/Makefile.cmake 0
/home/mark/mambaforge/envs/dev/bin/cmake -E cmake_progress_start /home/mark/git/htest/build/CMakeFiles /home/mark/git/htest/build//CMakeFiles/progress.marks
make  -f CMakeFiles/Makefile2 all
make[1]: Entering directory '/home/mark/git/htest/build'
make  -f CMakeFiles/my_target.dir/build.make CMakeFiles/my_target.dir/depend
make[2]: Entering directory '/home/mark/git/htest/build'
cd /home/mark/git/htest/build && /home/mark/mambaforge/envs/dev/bin/cmake -E cmake_depends "Unix Makefiles" /home/mark/git/htest /home/mark/git/htest /home/mark/git/htest/build /home/mark/git/htest/build /home/mark/git/htest/build/CMakeFiles/my_target.dir/DependInfo.cmake --color=
make[2]: Leaving directory '/home/mark/git/htest/build'
make  -f CMakeFiles/my_target.dir/build.make CMakeFiles/my_target.dir/build
make[2]: Entering directory '/home/mark/git/htest/build'
[ 50%] Building C object CMakeFiles/my_target.dir/main.c.o
/home/mark/mambaforge/envs/dev/bin/x86_64-conda-linux-gnu-cc   -march=nocona -mtune=haswell -ftree-vectorize -fPIC -fstack-protector-strong -fno-plt -O2 -ffunction-sections -pipe -isystem /home/mark/mambaforge/envs/dev/include -MD -MT CMakeFiles/my_target.dir/main.c.o -MF CMakeFiles/my_target.dir/main.c.o.d -o CMakeFiles/my_target.dir/main.c.o -c /home/mark/git/htest/main.c
[100%] Linking C executable my_target
/home/mark/mambaforge/envs/dev/bin/cmake -E cmake_link_script CMakeFiles/my_target.dir/link.txt --verbose=1
/home/mark/mambaforge/envs/dev/bin/x86_64-conda-linux-gnu-cc -march=nocona -mtune=haswell -ftree-vectorize -fPIC -fstack-protector-strong -fno-plt -O2 -ffunction-sections -pipe -isystem /home/mark/mambaforge/envs/dev/include -Wl,-O2 -Wl,--sort-common -Wl,-z,relro -Wl,-z,now -Wl,--disable-new-dtags -Wl,--gc-sections -Wl,--allow-shlib-undefined -Wl,-rpath,/home/mark/mambaforge/envs/dev/lib -Wl,-rpath-link,/home/mark/mambaforge/envs/dev/lib -L/home/mark/mambaforge/envs/dev/lib CMakeFiles/my_target.dir/main.c.o -o my_target  /home/mark/mambaforge/envs/dev/lib/libhdf5.so.1000.0.0 
make[2]: Leaving directory '/home/mark/git/htest/build'
[100%] Built target my_target
make[1]: Leaving directory '/home/mark/git/htest/build'
/home/mark/mambaforge/envs/dev/bin/cmake -E cmake_progress_start /home/mark/git/htest/build/CMakeFiles 0
+ popd
~/git/htest
+ ./build/my_target
HDF5 Version 1.15.0
	linux-vdso.so.1 (0x00007ffc237dc000)
	libhdf5.so.1000 => /home/mark/mambaforge/envs/dev/lib/libhdf5.so.1000 (0x00007f9692f55000)
	libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f9692d18000)
	libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f9692c31000)
	libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f9692c2c000)
	libmpi.so.40 => /home/mark/mambaforge/envs/dev/lib/libmpi.so.40 (0x00007f9692b0a000)
	libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f9692b05000)
	/lib64/ld-linux-x86-64.so.2 (0x00007f969338d000)
	libopen-rte.so.40 => /home/mark/mambaforge/envs/dev/lib/./libopen-rte.so.40 (0x00007f9692a4b000)
	libopen-pal.so.40 => /home/mark/mambaforge/envs/dev/lib/./libopen-pal.so.40 (0x00007f969294a000)
	librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x00007f9692945000)
	libz.so.1 => /home/mark/mambaforge/envs/dev/lib/././libz.so.1 (0x00007f9692929000)
	libutil.so.1 => /lib/x86_64-linux-gnu/libutil.so.1 (0x00007f9692924000)

I'll leave this conversation in the back of my head in case any problems come up.

lrknox added a commit that referenced this pull request Jan 25, 2023
brtnfld pushed a commit to brtnfld/hdf5 that referenced this pull request May 17, 2023
brtnfld pushed a commit to brtnfld/hdf5 that referenced this pull request Oct 5, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: Merges Complete
Development

Successfully merging this pull request may close these issues.

6 participants