Java bindings not loading in OS X 10.11 (El Capitan) #1220

Closed
shurickdaryin opened this issue Dec 14, 2015 · 9 comments

@shurickdaryin

OS X 10.11 does not allow the DYLD_LIBRARY_PATH environment variable to be propagated to child processes. As a result, libmpi.dylib cannot be found:

mpirun -np 4 java -jar myapp.jar
Java bindings failed to load libmpi: dlopen(libmpi.dylib, 10): image not found
Java bindings failed to load libmpi: dlopen(libmpi.dylib, 10): image not found
Java bindings failed to load libmpi: dlopen(libmpi.dylib, 10): image not found
Java bindings failed to load libmpi: dlopen(libmpi.dylib, 10): image not found
-------------------------------------------------------
Primary job  terminated normally, but 1 process returned
a non-zero exit code.. Per user-direction, the job has been aborted.
-------------------------------------------------------
--------------------------------------------------------------------------
mpirun detected that one or more processes exited with non-zero status, thus causing
the job to be terminated. The first process to do so was:

  Process name: [[9970,1],3]
  Exit code:    1
--------------------------------------------------------------------------

See oracle/node-oracledb#231, http://apple.stackexchange.com/questions/215030/el-capitan-make-check-dyld-library-path

A workaround is to copy libmpi.dylib to the current directory.

Probably a full path name should be used to load libmpi.dylib to avoid using DYLD_LIBRARY_PATH.
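
A minimal sketch of that idea in C, not the actual Open MPI change: build an absolute path from an install directory and try it first, then fall back to the bare name. The names `SKETCH_LIBDIR`, `build_lib_path` and `open_libmpi_sketch` are hypothetical, and /opt/openmpi/lib is only an example value; a real build would presumably take the directory from configure.

```c
#include <dlfcn.h>
#include <stdio.h>

/* Hypothetical install directory (assumption for this sketch). */
#ifndef SKETCH_LIBDIR
#define SKETCH_LIBDIR "/opt/openmpi/lib"
#endif

/* Join a directory and a library file name into buf.
 * Returns 0 on success, -1 if the result would not fit. */
int build_lib_path(char *buf, size_t len, const char *dir, const char *name)
{
    int n = snprintf(buf, len, "%s/%s", dir, name);
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Try the absolute path first so the load does not depend on
 * DYLD_LIBRARY_PATH; fall back to the bare name so the loader's
 * normal search still applies where it works. */
void *open_libmpi_sketch(const char *dir, const char *name)
{
    char path[1024];
    void *handle = NULL;

    if (build_lib_path(path, sizeof(path), dir, name) == 0) {
        handle = dlopen(path, RTLD_NOW | RTLD_GLOBAL);
    }
    if (handle == NULL) {
        handle = dlopen(name, RTLD_NOW | RTLD_GLOBAL);
    }
    return handle;
}
```

A caller would use it as `open_libmpi_sketch(SKETCH_LIBDIR, "libmpi.dylib")` and report `dlerror()` if the result is NULL.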

@ggouaillardet
Contributor

@shurickdaryin did you configure with --enable-mpirun-prefix-by-default?
IIRC, that should fix the issue.

@shurickdaryin
Author

@ggouaillardet I reconfigured with --enable-mpirun-prefix-by-default and rebuilt openmpi and my app, but still get the same error.

@ggouaillardet
Contributor

Were you able to test with Yosemite? Did you get the same error?
BTW, are you running on only one node (vs. several nodes)?
A possible fix is to update ompi/mpi/java/c/mpi_MPI.c to dlopen with the full path if ompi is configured with --enable-mpirun-prefix-by-default.
I am still running Yosemite, so I might ask you to test a fix tomorrow.

@shurickdaryin
Author

No, it's El Capitan all around me. But I have tried to hard code the path to libmpi.dylib in the indicated file and it works. So the proposed fix should also work.

@shurickdaryin
Author

This has been run on one node.

@jsquyres
Member

@hppritcha Can you confirm? If so, it may be necessary to use installdirs to find libmpi at run time.

@hppritcha
Member

As I recall, one needs to prepend LD_LIBRARY_PATH with the location of the OMPI install lib directory on OS X. This is owing to the dlopen of libmpi in the JNI code. This isn't necessary on Linux. I don't think this is specific to El Capitan. I will update the Java FAQ.

@shurickdaryin
Author

Setting LD_LIBRARY_PATH or DYLD_LIBRARY_PATH does not change anything, as these variables are not exported to child processes. This behaviour is new with El Capitan.

I have both LD_LIBRARY_PATH and DYLD_LIBRARY_PATH set to /opt/openmpi/lib. With this setting, "otool -L libmpi.dylib" returns an error, while "otool -L /opt/openmpi/lib/libmpi.dylib" correctly reports the library and its dependencies.

Another manifestation of this behaviour is that "LD_LIBRARY_PATH=foo env | grep LD" doesn't print anything, while "_LD_LIBRARY_PATH=foo env | grep LD" prints "_LD_LIBRARY_PATH=foo".
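
The same check can be sketched in C: set a variable in the current process, launch `env` as a child, and see whether the variable shows up in its output. This is only an illustration of the test described above; `env_reaches_child` is a hypothetical helper. Note that `popen` starts the child through /bin/sh, so on a system that strips loader variables when launching system binaries the function would be expected to return 0 for those variables.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Returns 1 if NAME=VALUE, set in this process, appears in the
 * environment of a child running "env"; 0 if it does not; -1 on error.
 * Side effect: the variable stays set in the calling process. */
int env_reaches_child(const char *name, const char *value)
{
    char expected[512];
    char line[4096];
    int found = 0;
    int n;
    FILE *child;

    n = snprintf(expected, sizeof(expected), "%s=%s\n", name, value);
    if (n < 0 || (size_t)n >= sizeof(expected)) {
        return -1;
    }
    if (setenv(name, value, 1) != 0) {
        return -1;
    }

    child = popen("env", "r");
    if (child == NULL) {
        return -1;
    }
    /* Compare each output line of the child against "NAME=VALUE\n". */
    while (fgets(line, sizeof(line), child) != NULL) {
        if (strcmp(line, expected) == 0) {
            found = 1;
        }
    }
    if (pclose(child) == -1) {
        return -1;
    }
    return found;
}
```

Calling it once with `"DYLD_LIBRARY_PATH"` and once with a neutral name such as `"_LD_LIBRARY_PATH"` would mirror the two shell commands quoted above.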

@jsquyres
Member

@hppritcha Ouch -- it sounds like El Cap made a Big Change in this area! This code might need to be updated per what I said above (i.e., use installdirs to find libmpi)...

ggouaillardet added a commit to ggouaillardet/ompi that referenced this issue Dec 16, 2015
Since OS X 10.11 (aka El Capitan) DYLD_LIBRARY_PATH is no more
propagated to children, so try to dlopen libmpi with the full path
(set at configure time)

Fixes open-mpi#1220

Thanks Alexander Daryin for reporting this
ggouaillardet added a commit to ggouaillardet/ompi that referenced this issue Dec 17, 2015
Since OS X 10.11 (aka El Capitan) DYLD_LIBRARY_PATH is no more
propagated to children, so try to dlopen libmpi with the full path
using the directory of libmpi_java

Fixes open-mpi#1220

Thanks Alexander Daryin for reporting this
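
A rough sketch of the "directory of libmpi_java" idea, under stated assumptions and not taken from the commit itself: given the absolute path of the already-loaded JNI library, swap its file name for the sibling library's name. How that path is obtained is not shown; `dladdr()` is one way it could be done, which is an assumption here. `sibling_lib_path` is a hypothetical helper, and the resulting string would then be handed to `dlopen`.

```c
#include <stdio.h>
#include <string.h>

/* Replace the file name in self_path with sibling, writing into out.
 * self_path is assumed to be the absolute path of the loaded JNI
 * library (for example /opt/openmpi/lib/libmpi_java.dylib).
 * Returns 0 on success, -1 if self_path has no directory part or
 * out is too small. */
int sibling_lib_path(char *out, size_t len,
                     const char *self_path, const char *sibling)
{
    const char *slash = strrchr(self_path, '/');
    int n;

    if (slash == NULL) {
        return -1;
    }
    /* Copy everything before the last '/', then append "/sibling". */
    n = snprintf(out, len, "%.*s/%s",
                 (int)(slash - self_path), self_path, sibling);
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

Deriving the path at run time avoids baking an install prefix into the binary, so the lookup should keep working if the installation is moved as a whole.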
ggouaillardet added a commit to ggouaillardet/ompi that referenced this issue Dec 18, 2015
Since OS X 10.11 (aka El Capitan) DYLD_LIBRARY_PATH is no more
propagated to children, so try to dlopen libmpi with the full path
using the directory of libmpi_java

Fixes open-mpi#1220

Thanks Alexander Daryin for reporting this
ggouaillardet added a commit to ggouaillardet/ompi that referenced this issue Dec 22, 2015
Since OS X 10.11 (aka El Capitan) DYLD_LIBRARY_PATH is no more
propagated to children, so try to dlopen libmpi with the full path
using the directory of libmpi_java

Fixes open-mpi#1220

Thanks Alexander Daryin for reporting this
ggouaillardet added a commit to ggouaillardet/ompi-release that referenced this issue Dec 22, 2015
Since OS X 10.11 (aka El Capitan) DYLD_LIBRARY_PATH is no more
propagated to children, so try to dlopen libmpi with the full path
using the directory of libmpi_java

Fixes open-mpi/ompi#1220

Thanks Alexander Daryin for reporting this

(cherry picked from commit open-mpi/ompi@e918d75)
ggouaillardet added a commit to ggouaillardet/ompi-release that referenced this issue Dec 22, 2015
Since OS X 10.11 (aka El Capitan) DYLD_LIBRARY_PATH is no more
propagated to children, so try to dlopen libmpi with the full path
using the directory of libmpi_java

Fixes open-mpi/ompi#1220

Thanks Alexander Daryin for reporting this

(back-ported from commit open-mpi/ompi@e918d75)
annu13 added a commit to annu13/ompi that referenced this issue Dec 22, 2015
java: try to dlopen libmpi with the full path

Since OS X 10.11 (aka El Capitan) DYLD_LIBRARY_PATH is no more
propagated to children, so try to dlopen libmpi with the full path
using the directory of libmpi_java

Fixes open-mpi#1220

Thanks Alexander Daryin for reporting this

Update symbol-hiding script

btl/sm: rename file after file descriptor has been closed.

Thanks George for spotting this.
jsquyres pushed a commit to jsquyres/ompi that referenced this issue Aug 23, 2016
…lock-args

v2.x: ompi/datatype: Fix args of HINDEXED_BLOCK