disable strict RPATH sanity check by default, allow re-enabling it via --strict-rpath-sanity-check configuration option #4475

Merged on Oct 21, 2024 (7 commits)

Conversation

@boegel boegel commented Mar 6, 2024

Implement detection of mixed stacks, where software is built with RPATH on top of dependencies that were installed without RPATH.

related: easybuilders/easybuild-easyconfigs#20051

Originally marked as WIP while I was still verifying that this actually works as designed (it works as intended).

@boegel boegel added this to the 5.0 milestone Mar 6, 2024
@boegel boegel requested a review from branfosj March 6, 2024 17:18
@boegel boegel force-pushed the strict_rpath_sanity_check branch from 121bbe9 to 1989fa3 Compare March 6, 2024 18:02
…tion of mixing of using RPATH with dependencies that were installed without RPATH
@boegel boegel force-pushed the strict_rpath_sanity_check branch from 1989fa3 to 1e3153a Compare March 6, 2024 18:03

branfosj commented Mar 6, 2024

$ grep "No RPATH section found" /dev/shm/branfosj/tmp-up-EL8/eb-xfrlcn63/easybuild-LAMMPS-2Aug2023_update2-20240306.183110.OmvId.log
== 2024-03-06 18:31:54,959 easyblock.py:3193 INFO No RPATH section found in not
== 2024-03-06 18:31:54,974 easyblock.py:3193 INFO No RPATH section found in not
== 2024-03-06 18:31:56,544 easyblock.py:3193 INFO No RPATH section found in not
== 2024-03-06 18:31:56,559 easyblock.py:3193 INFO No RPATH section found in not
== 2024-03-06 18:31:57,932 easyblock.py:3193 INFO No RPATH section found in not
== 2024-03-06 18:31:57,946 easyblock.py:3193 INFO No RPATH section found in not
No RPATH section found in one or more dependency libraries, so you should probably change your EasyBuild configuration to disable the strict RPATH sanity check that involves unsetting $LD_LIBRARY_PATH when checking for required libraries; see also https://docs.easybuild.io/easybuild-v5/strict-rpath-sanity-check (at easybuild/src/easybuild-framework/easybuild/framework/easyblock.py:3717 in _sanity_check_step)

The regex matches the word "not" in the "not found" part of lines like "libhdf5_hl.so.310 => not found".

Sadly this may not help us. It looks like there is an RPATH section in all the libraries checked.

boegel commented Mar 6, 2024

Works as designed:

== FAILED: Installation ended unsuccessfully: Sanity check failed: Library libhdf5_hl.so.310 not found for
/software/LAMMPS/2Aug2023_update2-foss-2023a-kokkos/bin/lmp
Library libbz2.so.1.0 not found for /software/LAMMPS/2Aug2023_update2-foss-2023a-kokkos/bin/lmp
Library libhdf5_hl.so.310 not found for /software/LAMMPS/2Aug2023_update2-foss-2023a-kokkos/lib/liblammps.so.0
Library libbz2.so.1.0 not found for /software/LAMMPS/2Aug2023_update2-foss-2023a-kokkos/lib/liblammps.so.0
Library libhdf5_hl.so.310 not found for /software/LAMMPS/2Aug2023_update2-foss-2023a-kokkos/lib64/liblammps.so.0
Library libbz2.so.1.0 not found for /software/LAMMPS/2Aug2023_update2-foss-2023a-kokkos/lib64/liblammps.so.0

No RPATH section found in one or more dependency libraries, so you should probably change your EasyBuild configuration to disable the strict RPATH sanity check that involves unsetting $LD_LIBRARY_PATH
when checking for required libraries; see also https://docs.easybuild.io/easybuild-v5/strict-rpath-sanity-check (took 22 mins 58 secs)

The docs page still needs to be created.

When using --disable-strict-rpath-sanity-check, the installation completes as expected.
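
The strict check named in the error message amounts to looking up required libraries with `$LD_LIBRARY_PATH` unset, so that only RPATH entries and the system default paths can satisfy the lookup. A minimal sketch of that idea, assuming `ldd` is available; the function names are hypothetical and this is not the framework's actual implementation:

```python
import os
import subprocess


def env_without_ld_library_path(environ=None):
    """Return a copy of the environment with $LD_LIBRARY_PATH removed."""
    env = dict(os.environ if environ is None else environ)
    env.pop("LD_LIBRARY_PATH", None)
    return env


def missing_libraries(ldd_output):
    """Return the names of libraries that ldd reports as 'not found'."""
    missing = []
    for line in ldd_output.splitlines():
        name, sep, target = line.strip().partition(" => ")
        if sep and target.strip() == "not found":
            missing.append(name)
    return missing


def strict_rpath_check(path):
    """Run ldd on path without $LD_LIBRARY_PATH; return unresolved libraries."""
    res = subprocess.run(["ldd", path], env=env_without_ld_library_path(),
                         capture_output=True, text=True, check=False)
    return missing_libraries(res.stdout)
```

With the strict check disabled, the same lookup would run with the environment left untouched, which is why libraries that are only reachable through `$LD_LIBRARY_PATH` no longer cause a failure.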

@boegel boegel marked this pull request as ready for review March 6, 2024 18:41
@boegel boegel changed the title add support for disabling strict RPATH sanity check (WIP) add support for disabling strict RPATH sanity check Mar 6, 2024

boegel commented Mar 6, 2024

$ grep "No RPATH section found" /dev/shm/branfosj/tmp-up-EL8/eb-xfrlcn63/easybuild-LAMMPS-2Aug2023_update2-20240306.183110.OmvId.log
== 2024-03-06 18:31:54,959 easyblock.py:3193 INFO No RPATH section found in not
== 2024-03-06 18:31:54,974 easyblock.py:3193 INFO No RPATH section found in not
== 2024-03-06 18:31:56,544 easyblock.py:3193 INFO No RPATH section found in not
== 2024-03-06 18:31:56,559 easyblock.py:3193 INFO No RPATH section found in not
== 2024-03-06 18:31:57,932 easyblock.py:3193 INFO No RPATH section found in not
== 2024-03-06 18:31:57,946 easyblock.py:3193 INFO No RPATH section found in not
No RPATH section found in one or more dependency libraries, so you should probably change your EasyBuild configuration to disable the strict RPATH sanity check that involves unsetting $LD_LIBRARY_PATH when checking for required libraries; see also https://docs.easybuild.io/easybuild-v5/strict-rpath-sanity-check (at easybuild/src/easybuild-framework/easybuild/framework/easyblock.py:3717 in _sanity_check_step)

The regex matches the word "not" in the "not found" part of lines like "libhdf5_hl.so.310 => not found".

Sadly this may not help us. It looks like there is an RPATH section in all the libraries checked.

Ah, snap, I overlooked that... I can adjust the regex so that doesn't happen, or simply filter out "not found" hits.
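
The false hits can be reproduced with a pattern that takes the first token after `=>` as a library path. The sketch below is illustrative only: both regexes are hypothetical, since the pattern actually used in this PR is not shown in the conversation.

```python
import re

# Hypothetical naive pattern: captures the first token after '=>',
# which is the word 'not' for libraries that could not be resolved.
NAIVE_REGEX = re.compile(r"=>\s*(\S+)")

# Stricter variant: only accept an absolute path after '=>',
# so 'not found' lines produce no match at all.
STRICT_REGEX = re.compile(r"=>\s*(/\S+)")


def resolved_library_paths(ldd_output, regex=STRICT_REGEX):
    """Return the targets captured by regex from ldd-style output lines."""
    paths = []
    for line in ldd_output.splitlines():
        match = regex.search(line)
        if match:
            paths.append(match.group(1))
    return paths
```

Requiring a leading `/` is one way to adjust the regex; filtering out lines that end in "not found" before matching would work equally well.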

That does make me wonder: if the RPATH section is there for all dependency libraries, then why are the binaries still unable to find some libraries?
Are we basically just missing direct dependencies for LAMMPS?

branfosj commented Mar 6, 2024

Are we basically just missing direct dependencies for LAMMPS?

I'm testing a rebuild with netCDF as a dep.

branfosj commented Mar 6, 2024

Are we basically just missing direct dependencies for LAMMPS?

I'm testing a rebuild with netCDF as a dep.

And that still fails.

jfgrimm commented Mar 7, 2024

Perhaps LAMMPS uses something like nc-config to determine how netCDF was built? Is there any difference there with and without RPATH?

branfosj commented Mar 11, 2024

The output is the same with and without RPATH:

$ nc-config --all

This netCDF 4.9.2 has been built with the following features:

  --cc            -> /rds/projects/2017/branfosj-rse/easybuild/EL8-ice/software/OpenMPI/4.1.5-GCC-12.3.0/bin/mpicc
  --cflags        -> -I/rds/projects/2017/branfosj-rse/easybuild/EL8-ice/software/netCDF/4.9.2-gompi-2023a/include
  --libs          -> -L/rds/projects/2017/branfosj-rse/easybuild/EL8-ice/software/netCDF/4.9.2-gompi-2023a/lib64 -lnetcdf
  --static        -> -lhdf5_hl -lhdf5 -lm -lz -lzstd -lbz2 -lsz -lcurl -lxml2


  --has-dap          -> yes
  --has-dap2         -> yes
  --has-dap4         -> yes
  --has-nc2          -> yes
  --has-nc4          -> yes
  --has-hdf5         -> yes
  --has-hdf4         -> no
  --has-logging      -> no
  --has-pnetcdf      -> no
  --has-szlib        -> no
  --has-cdf5         -> yes
  --has-parallel4    -> yes
  --has-parallel     -> yes
  --has-nczarr       -> yes
  --has-zstd         -> yes
  --has-benchmarks   -> no
  --has-multifilters -> no
  --has-stdfilters   -> deflate szip zstd bz2
  --has-quantize     -> no

  --prefix        -> /rds/projects/2017/branfosj-rse/easybuild/EL8-ice/software/netCDF/4.9.2-gompi-2023a
  --includedir    -> /rds/projects/2017/branfosj-rse/easybuild/EL8-ice/software/netCDF/4.9.2-gompi-2023a/include
  --libdir        -> /rds/projects/2017/branfosj-rse/easybuild/EL8-ice/software/netCDF/4.9.2-gompi-2023a/lib64
  --plugindir     ->
  --version       -> netCDF 4.9.2

jfgrimm commented Mar 11, 2024

so, to me it looks like this issue occurs when:

  • libfoo links libbar
  • libbar is not rpath'd, and links libbaz
  • libbaz is not directly linked by libfoo (no NEEDED entry)

for example, in the case of LAMMPS:

  • liblammps.so links libnetcdf.so.19
  • libnetcdf.so.19 is not rpath'd, and links libhdf5_hl.so.310
  • libhdf5_hl.so.310 is linked only by libnetcdf.so.19, not directly by liblammps.so
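
The three conditions above can be sketched as a toy model of library lookup. It assumes, as a deliberate simplification, that a NEEDED entry is searched only in the RPATH directories of the requesting object and of the objects that loaded it; the real dynamic loader has further rules (RUNPATH, default system directories, `$LD_LIBRARY_PATH`). All `/sw/...` paths are hypothetical.

```python
def unresolved(root, objects):
    """Return sorted names of libraries that cannot be located from root.

    objects maps a name to {'dir': where it lives, 'needed': NEEDED entries,
    'rpath': list of RPATH directories}.
    """
    missing = set()
    seen = set()

    def visit(name, inherited):
        if name in seen:
            return
        seen.add(name)
        # Simplified lookup: own RPATH plus the RPATHs along the loading chain.
        search_dirs = inherited | set(objects[name]["rpath"])
        for dep in objects[name]["needed"]:
            if objects[dep]["dir"] in search_dirs:
                visit(dep, search_dirs)
            else:
                missing.add(dep)

    visit(root, frozenset())
    return sorted(missing)


# Hypothetical layout mirroring the LAMMPS case: netCDF was installed without RPATH.
EXAMPLE = {
    "liblammps.so.0": {"dir": "/sw/LAMMPS/lib", "needed": ["libnetcdf.so.19"],
                       "rpath": ["/sw/netCDF/lib"]},
    "libnetcdf.so.19": {"dir": "/sw/netCDF/lib", "needed": ["libhdf5_hl.so.310"],
                        "rpath": []},
    "libhdf5_hl.so.310": {"dir": "/sw/HDF5/lib", "needed": [], "rpath": []},
}
```

In this model, `unresolved("liblammps.so.0", EXAMPLE)` reports `libhdf5_hl.so.310`, and the report goes away once `libnetcdf.so.19` carries an RPATH entry for the HDF5 directory.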

jfgrimm commented Mar 11, 2024

we can figure out where this happens, but the command I came up with isn't exactly quick to run.
Using the above example of libhdf5_hl.so.310 with LAMMPS loaded:

time find -L $(echo $LIBRARY_PATH | sed -E -e 's/:/ /g' -e "s#$EBROOTHDF5/lib(64|)##g") -maxdepth 1 -type f -name '*.so*' -exec sh -c 'tmp=$(readelf -d "$1"); if echo "$tmp" | grep -q libhdf5_hl && ! echo "$tmp" | grep -q RPATH; then echo "$1"; fi' sh {} \;
/project/boegelbot/Rocky8/haswell/software/netCDF/4.9.2-gompi-2023a/lib/libnetcdf.so.19
/project/boegelbot/Rocky8/haswell/software/netCDF/4.9.2-gompi-2023a/lib/libnetcdf.so

real	0m46.667s
user	0m10.543s
sys	0m35.770s
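
The per-file test in the command above can also be expressed as a small parser for `readelf -d` output. This is a sketch rather than a drop-in replacement: the function names are hypothetical, and unlike the shell version it only looks at NEEDED entries and treats RUNPATH the same as RPATH.

```python
import re

# Patterns for the dynamic section lines printed by `readelf -d`.
NEEDED_REGEX = re.compile(r"\(NEEDED\)\s+Shared library: \[(.+?)\]")
RPATH_REGEX = re.compile(r"\((?:RPATH|RUNPATH)\)\s+Library (?:rpath|runpath): \[(.*?)\]")


def parse_dynamic_section(readelf_output):
    """Extract NEEDED entries and RPATH/RUNPATH directories from `readelf -d` output."""
    needed = NEEDED_REGEX.findall(readelf_output)
    rpath = []
    for value in RPATH_REGEX.findall(readelf_output):
        rpath.extend(d for d in value.split(":") if d)
    return needed, rpath


def links_without_rpath(readelf_output, libname):
    """True if the object needs a library matching libname but has no RPATH/RUNPATH."""
    needed, rpath = parse_dynamic_section(readelf_output)
    return any(libname in entry for entry in needed) and not rpath
```

Feeding it the output of `readelf -d` for each `*.so*` file would flag the same kind of library as the `find` command, for example a `libnetcdf.so.19` that needs `libhdf5_hl` but has no RPATH.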

jfgrimm commented Mar 11, 2024

As an aside, a fast and easy way of detecting a mixed stack would be to grep the latest easybuild/*.test_report.md of each installation for \-\-disable-rpath, rather than running readelf.
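
A rough sketch of that shortcut follows. It relies on the premise of the comment above, namely that the test report mentions `--disable-rpath` when an installation was built without RPATH; the function names are hypothetical.

```python
import glob
import os


def mentions_disable_rpath(report_text):
    """True if the report text contains the --disable-rpath option."""
    return "--disable-rpath" in report_text


def latest_test_report(install_dir):
    """Return the most recently modified easybuild/*.test_report.md, or None."""
    reports = glob.glob(os.path.join(install_dir, "easybuild", "*.test_report.md"))
    return max(reports, key=os.path.getmtime) if reports else None


def built_without_rpath(install_dir):
    """True/False if a report was found, None if there is nothing to inspect."""
    report = latest_test_report(install_dir)
    if report is None:
        return None
    with open(report, encoding="utf-8", errors="replace") as handle:
        return mentions_disable_rpath(handle.read())
```

Returning `None` when no report exists keeps "unknown" distinct from "built with RPATH", which matters when scanning a stack that was only partly installed with EasyBuild.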

jfgrimm commented Mar 11, 2024

Actually, this might be easier than I thought.
If no binary/library in the software installation directory (e.g. LAMMPS) has a NEEDED entry for a particular missing library, then it should be safe to ignore (perhaps with a warning).
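
That heuristic could look roughly like the sketch below. It only illustrates the idea in this comment and is not the behaviour that was eventually merged; the function name and the input structure are assumptions.

```python
def split_missing_libraries(missing, needed_by_file):
    """Split missing libraries into directly needed and transitive-only ones.

    missing: iterable of library names reported as not found
    needed_by_file: dict mapping each file in the installation directory
                    to its list of NEEDED entries
    """
    directly_needed = set()
    for entries in needed_by_file.values():
        directly_needed.update(entries)
    # Directly needed libraries are real failures for this installation.
    direct = sorted(lib for lib in set(missing) if lib in directly_needed)
    # The rest are only reached through dependencies; warn instead of failing.
    transitive = sorted(lib for lib in set(missing) if lib not in directly_needed)
    return direct, transitive
```

A sanity check built on this would fail only when the first list is non-empty and print a warning for the second.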

@boegel boegel changed the title add support for disabling strict RPATH sanity check add support for disabling strict RPATH sanity check + print a warning when mixing of non-RPATH and RPATH installations was detected Mar 28, 2024
@boegel boegel added the EasyBuild-5.0-blocker Blocker for EasyBuild 5.0 label Jun 5, 2024
@boegel boegel changed the title add support for disabling strict RPATH sanity check + print a warning when mixing of non-RPATH and RPATH installations was detected disable strict RPATH sanity check by default, allow re-enabling it via --strict-rpath-sanity-check configuration option Oct 7, 2024

@jfgrimm jfgrimm left a comment

lgtm

@jfgrimm jfgrimm merged commit b5c1dd9 into easybuilders:5.0.x Oct 21, 2024
39 checks passed
@boegel boegel deleted the strict_rpath_sanity_check branch November 4, 2024 15:28
Labels
EasyBuild-5.0-blocker Blocker for EasyBuild 5.0 EasyBuild-5.0 EasyBuild 5.0 enhancement