(NPP) Shared Library loading issue on Linux #116

Open
hillin opened this issue May 23, 2023 · 5 comments

@hillin

hillin commented May 23, 2023

My program complains that it can't find nppisu64_12 when running on Linux:

Unable to load shared library 'nppisu64_12' or one of its dependencies. In order to help diagnose loading problems, consider setting the LD_DEBUG environment variable: libnppisu64_12: cannot open shared object file: No such file or directory

The program is running inside a Docker container based on the nvidia/cuda:12.1.1-runtime-ubuntu20.04 image. I can find the libnppisu.so.12 file, but not libnppisu.so. Creating a symbolic link named libnppisu.so that points to libnppisu.so.12 solved the problem.

So the question is: should ManagedCuda explicitly load libnppisu.so.12 instead of loading the generic nppisu name? The same goes for the other shared libraries.

Edit: fixed the file names; they don't have the 64 suffix on Linux.
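
To confirm which file names the dynamic loader can actually open inside the container, a small probe along the following lines may help. This is only a minimal sketch, assuming .NET Core 3.0 or later; `NppProbe` and `CanLoad` are hypothetical names and not part of ManagedCuda.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical diagnostic helper; not part of ManagedCuda.
public static class NppProbe
{
    // Returns true if the OS loader can open the given file name.
    // This overload hands the name to the loader as-is, without
    // adding a "lib" prefix or a ".so" suffix.
    public static bool CanLoad(string fileName)
    {
        if (NativeLibrary.TryLoad(fileName, out IntPtr handle))
        {
            // Release the handle again; we only wanted to know if loading works.
            NativeLibrary.Free(handle);
            return true;
        }
        return false;
    }
}
```

Calling `NppProbe.CanLoad("libnppisu.so")` and `NppProbe.CanLoad("libnppisu.so.12")` in the container should show whether only the unversioned name is missing, which is what the symptoms described above suggest.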

@kunzmi
Owner

kunzmi commented May 23, 2023

Hi,

For the moment I can only check on an Ubuntu Linux machine with CUDA 12.0 installed manually, and there the corresponding symbolic links are present, i.e. libnppisu.so points to libnppisu.so.12, which points to libnppisu.so.12.0.0.30.

Note that on Linux the libs are named libnppisu.so* without the 64, which only appears on Windows. You see the 64 in the error message because ManagedCuda falls back to the Windows library name if it doesn't find the Linux variants.

You say that you created a link named libnppisu64.so with the 64, which is why I'm a bit confused about which files were actually present and which files or links are missing.

Could you please post an ls listing of your lib folder and tell me what was there from the beginning?

Cheers,
Michael

@hillin
Author

hillin commented May 24, 2023

Sorry, the 64 part was a typo; it's not there (main post updated). The problem seems to be that the Docker image does not contain those symbolic links.

@kunzmi
Owner

kunzmi commented May 24, 2023

I installed the latest CUDA 12.1.1 on my Linux machine and all symbolic links are there. So I would consider this a bug in the Docker image and not a bug in ManagedCuda.
I also prefer to keep the unversioned lib name in ManagedCuda, as this allows using different CUDA versions as long as the API calls used stay the same. It also keeps maintenance a bit easier...

@hillin
Author

hillin commented May 25, 2023

I'm just thinking: since the 12 part already has to be in the Windows DLL name, it might be easier to handle it in a unified way:

private static readonly HashSet<string> _nppLibraries = new HashSet<string> {
    "nppc",
    "nppial",
    "nppicc",
    "nppidei",
    "nppif",
    "nppig",
    "nppim",
    "nppist",
    "nppisu",
    "nppitc",
    "npps"
};

private const string _libraryVersion = "12";

private static IntPtr ImportResolver(string libraryName, System.Reflection.Assembly assembly, DllImportSearchPath? searchPath)
{
    if (!_nppLibraries.Contains(libraryName))
    {
        return IntPtr.Zero;
    }

    string? libToLoad = null;
    if (RuntimeInformation.IsOSPlatform(OSPlatform.Linux))
    {
        libToLoad = $"lib{libraryName}.so.{_libraryVersion}";
    }
    else if (RuntimeInformation.IsOSPlatform(OSPlatform.Windows))
    {
        libToLoad = $"{libraryName}64_{_libraryVersion}.dll";
    }

    // ...

This also prevents accidentally referencing an incorrect version of the libraries, if that matters.

Anyway, it's a simple fix in the Dockerfile if we keep it as is.

@kunzmi
Owner

kunzmi commented May 25, 2023

Do you know whether all CUDA Docker images are affected, or only the latest CUDA 12.1 one?
If for some reason the Docker images of every version lack the necessary symbolic links, one might consider adapting ManagedCuda. If only this specific version has the missing files, I'd keep it as is.

Cheers,
Michael
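
One possible middle ground between the two positions in this thread would be to keep the unversioned name as the first choice and fall back to the versioned file name only when it is missing. The following is a minimal sketch under stated assumptions, not ManagedCuda's actual loader: `NppResolver`, `Candidates` and `Resolve` are hypothetical names, the version "12" is hard-coded as in the proposal above, and the library names passed in are assumed to be the bare names such as "nppisu".

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;
using System.Runtime.InteropServices;

// Hypothetical sketch; not ManagedCuda's actual loader.
public static class NppResolver
{
    // Assumed CUDA major version, taken from the proposal in this thread.
    private const string LibraryVersion = "12";

    // File names to try for a bare library name, in order of preference.
    public static IReadOnlyList<string> Candidates(string libraryName, OSPlatform platform)
    {
        if (platform == OSPlatform.Linux)
        {
            return new[]
            {
                $"lib{libraryName}.so",                  // unversioned symbolic link
                $"lib{libraryName}.so.{LibraryVersion}", // versioned name, used if the link is missing
            };
        }
        if (platform == OSPlatform.Windows)
        {
            return new[] { $"{libraryName}64_{LibraryVersion}.dll" };
        }
        return Array.Empty<string>();
    }

    // Matches the DllImportResolver delegate signature.
    public static IntPtr Resolve(string libraryName, Assembly assembly, DllImportSearchPath? searchPath)
    {
        OSPlatform platform =
            RuntimeInformation.IsOSPlatform(OSPlatform.Linux) ? OSPlatform.Linux
            : RuntimeInformation.IsOSPlatform(OSPlatform.Windows) ? OSPlatform.Windows
            : OSPlatform.Create("OTHER");

        foreach (string candidate in Candidates(libraryName, platform))
        {
            if (NativeLibrary.TryLoad(candidate, assembly, searchPath, out IntPtr handle))
            {
                return handle;
            }
        }

        // IntPtr.Zero tells the runtime to continue with its default resolution.
        return IntPtr.Zero;
    }
}
```

Such a resolver could be registered once per assembly with `NativeLibrary.SetDllImportResolver(assembly, NppResolver.Resolve)`. Unlike the proposal above, this sketch does not filter by a set of known NPP library names, so a real implementation would probably want to add that check.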
