Error: LibloadingError(DlOpen { desc: "libnvidia-ml.so: cannot open shared object file: No such file or directory" }) #1

Open
sangshuduo opened this issue Sep 10, 2024 · 3 comments


@sangshuduo

$ nviwatch
Error: LibloadingError(DlOpen { desc: "libnvidia-ml.so: cannot open shared object file: No such file or directory" })

$ uname -a
Linux sn4622120254 5.15.0-101-generic #111-Ubuntu SMP Tue Mar 5 20:16:58 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux

$ cat /etc/issue
Ubuntu 22.04.4 LTS \n \l

$ rustc --version
rustc 1.81.0 (eeb90cda1 2024-09-04)

$ cargo --version
cargo 1.81.0 (2dbb1af80 2024-08-20)

@msminhas93
Owner

Please check your NVIDIA drivers and verify that the nvidia-smi and nvcc commands work. Also make sure LD_LIBRARY_PATH is exported.
https://docs.nvidia.com/cuda/cuda-installation-guide-linux/

@sangshuduo
Author

$ nvidia-smi
Thu Sep 12 20:37:52 2024
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.54.14 Driver Version: 550.54.14 CUDA Version: 12.4 |
|-----------------------------------------+------------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+========================+======================|
| 0 NVIDIA A100 80GB PCIe Off | 00000000:01:00.0 Off | 0 |

After I exported LD_LIBRARY_PATH with the location of libnvidia-ml.so and ran nviwatch again, it reported the following:

$ nviwatch

!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
WARNING:

You should always run with libnvidia-ml.so that is installed with your
NVIDIA Display Driver. By default it's installed in /usr/lib and /usr/lib64.
libnvidia-ml.so in GDK package is a stub library that is attached only for
build purposes (e.g. machine that you build your application doesn't have
to have Display Driver installed).
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
lsof: WARNING: can't stat() tracefs file system /sys/kernel/debug/tracing
Output information may be incomplete.
lsof: WARNING: can't stat() fuse.gvfsd-fuse file system /run/user/1005/gvfs
Output information may be incomplete.
Linked to libnvidia-ml library at wrong path : /usr/local/cuda-12.4/targets/x86_64-linux/lib/stubs/libnvidia-ml.so

Error: DriverNotLoaded
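
The last log line above shows the loader picking up the stub copy under the CUDA toolkit's `stubs` directory, which is the situation the WARNING banner describes. A minimal sketch of one way to drop such directories from LD_LIBRARY_PATH follows. It assumes a POSIX shell, and `strip_stub_dirs` is a hypothetical helper name, not something shipped with nviwatch or CUDA.

```shell
# Hypothetical helper: print a colon-separated path list with every entry
# that is, or lies under, a "stubs" directory removed. Empty entries are
# dropped as well. The body runs in a subshell, so IFS and the noglob
# setting do not leak into the caller.
strip_stub_dirs() (
    set -f          # no filename expansion while splitting on ':'
    IFS=':'
    out=""
    for dir in $1; do
        case "$dir" in
            ''|*/stubs|*/stubs/*) continue ;;
        esac
        out="${out:+$out:}$dir"
    done
    printf '%s\n' "$out"
)

# Re-export the cleaned value for the current shell session.
LD_LIBRARY_PATH="$(strip_stub_dirs "${LD_LIBRARY_PATH:-}")"
export LD_LIBRARY_PATH
```

This only removes the stub directory. The directory that holds the driver's own libnvidia-ml.so still has to be visible to the loader.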

@msminhas93
Owner

I think I was getting the same error a few weeks ago on WSL, and adding these two lines seemed to fix it.

export PATH=$PATH:/usr/local/cuda-12.2/bin
export LD_LIBRARY_PATH=/usr/lib/wsl/lib:/usr/lib/wsl/drivers/:$LD_LIBRARY_PATH

So in your case it would be 12.4. I suspect that the lib or lib64 directory, whichever your system has, isn't on the loader path.

A rough equivalent for the driver path on Ubuntu is:

export PATH=$PATH:/usr/local/cuda-12.4/bin
export LD_LIBRARY_PATH=/usr/lib:/usr/lib/modules/$(uname -r):$LD_LIBRARY_PATH
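
Since the directories in the export above are a guess, it may be easier to ask the dynamic linker cache where the driver's copy of the library is. This is a sketch only, assuming a glibc system where `ldconfig -p` is available. `nvml_paths_from_cache` is a hypothetical helper name.

```shell
# Hypothetical helper: read `ldconfig -p` output on stdin and print the
# resolved path (the last field) of every libnvidia-ml entry.
nvml_paths_from_cache() {
    awk '/libnvidia-ml\.so/ { print $NF }'
}

# ldconfig often lives in /sbin, which may not be on a regular user's PATH,
# so try a few common locations and use the first one that exists.
for ldc in ldconfig /sbin/ldconfig /usr/sbin/ldconfig; do
    if command -v "$ldc" >/dev/null 2>&1; then
        "$ldc" -p | nvml_paths_from_cache
        break
    fi
done
```

If this prints a path outside any `stubs` directory, that library's directory is the one to put on LD_LIBRARY_PATH. If it prints nothing, the driver's library is not registered with the loader, which would point to a driver installation problem rather than a path problem.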
