nvml library is not getting initialized on ubuntu22.04 #116

Open
sujithapallapothu opened this issue May 3, 2024 · 7 comments

Comments

@sujithapallapothu

sujithapallapothu commented May 3, 2024

package main

import (
        "fmt"
        "log"

        "github.com/NVIDIA/go-nvml/pkg/nvml"
)

func main() {
        ret := nvml.Init()
        if ret != nvml.SUCCESS {
                log.Fatalf("Unable to initialize NVML: %v", nvml.ErrorString(ret))
        }
        defer func() {
                ret := nvml.Shutdown()
                if ret != nvml.SUCCESS {
                        log.Fatalf("Unable to shutdown NVML: %v", nvml.ErrorString(ret))
                }
        }()

        count, ret := nvml.DeviceGetCount()
        if ret != nvml.SUCCESS {
                log.Fatalf("Unable to get device count: %v", nvml.ErrorString(ret))
        }
        fmt.Println("count", count)

        for i := 0; i < count; i++ {
                device, ret := nvml.DeviceGetHandleByIndex(i)
                if ret != nvml.SUCCESS {
                        log.Fatalf("Unable to get device at index %d: %v", i, nvml.ErrorString(ret))
                }

                uuid, ret := device.GetUUID()
                if ret != nvml.SUCCESS {
                        log.Fatalf("Unable to get uuid of device at index %d: %v", i, nvml.ErrorString(ret))
                }

                fmt.Printf("%v\n", uuid)

                processInfos, ret := device.GetComputeRunningProcesses()
                if ret != nvml.SUCCESS {
                        log.Fatalf("Unable to get process info for device at index %d: %v", i, nvml.ErrorString(ret))
                }
                fmt.Printf("Found %d processes on device %d\n", len(processInfos), i)
                for pi, processInfo := range processInfos {
                        fmt.Printf("\t[%2d] ProcessInfo: %+v\n", pi, processInfo)
                }
        }
}

When I execute the Go code above, I get the following error on my Linux machine:

Error initializing NVML:ERROR_LIBRARY_NOT_FOUND

Can someone please suggest why NVML fails to initialize even though the nvml package is imported and used in the Go file above?

The specs of my Linux machine are as follows:

Ubuntu version: 22.04
Graphics card: 61:00.0 3D controller: NVIDIA Corporation GA100 [A100 PCIe 40GB] (rev a1)
Nvidia Driver version: 550.54.14
CUDA Version: 12.4
Go version: go1.21.9 linux/amd64

@klueska
Contributor

klueska commented May 3, 2024

Do you have the NVIDIA driver installed? Where is the libnvidia-ml.so.1 library located on your system?

@sujithapallapothu
Author

sujithapallapothu commented May 3, 2024

Yes @klueska,

I have libnvidia-ml.so.1 on my Linux machine (Ubuntu 22.04):

root@ubuntu2204:/tmp# locate libnvidia-ml.so.1

/usr/lib/x86_64-linux-gnu/libnvidia-ml.so.1

@sujithapallapothu
Author

image

@elezar
Member

elezar commented May 3, 2024

@sujithapallapothu the error message: "Error initializing NVML" does not seem to exist in the go-nvml code base and is also not present in the snippet that you pasted above.

Could you give more information about your environment -- including the output of nvidia-smi?

The code you show seems to come from one of the examples included in the repository. Could you check out the latest version of main and run make examples in the root folder? You should then be able to run these examples.

@sujithapallapothu
Author

sujithapallapothu commented May 3, 2024

@elezar yes, you are right. I took the code from the examples and adapted it in my sample.go file, which looks like this:

	if hasNvidiaGPUs() {
		err := nvml.Init()
		if err != nvml.SUCCESS {
			fmt.Println("Error initializing NVML:", err)
			//return err

		}
		defer nvml.Shutdown()

		deviceCount, err := nvml.DeviceGetCount()
		if err != nvml.SUCCESS {
			fmt.Println("Error getting device count:", err)
			//log.Fatalf("Unable to get device count: %v", nvml.ErrorString(err))
		}
		fmt.Println("Number of NVIDIA GPUs:", deviceCount)
	} else {
		fmt.Println("No NVIDIA GPUs")
	}

Here, hasNvidiaGPUs() checks whether an NVIDIA graphics card is present. I built the code above using go build  -tags netgo -ldflags '-s -extldflags "-static"' sample.go and then executed the resulting binary, which prints Error initializing NVML:ERROR_LIBRARY_NOT_FOUND

image

More details about my environment:

Ubuntu version: 22.04
Graphics card: 61:00.0 3D controller: NVIDIA Corporation GA100 [A100 PCIe 40GB] (rev a1)
Nvidia Driver version: 550.54.14
CUDA Version: 12.4
Go version: go1.21.9 linux/amd64

Please help further on this.

Thank you

@sujithapallapothu
Author

image

I'm getting the error above, which comes from the go-nvml code, so it seems the library loading is failing. Do I need to set any Go environment flags while building the binary?

Please suggest @klueska @elezar

@elezar
Member

elezar commented May 3, 2024

Note that when we build applications on Linux that use this library we specify:

-ldflags "-s -w '-extldflags=-Wl,--export-dynamic -Wl,--unresolved-symbols=ignore-in-object-files'"

It could be that the static flag is causing the libnvidia-ml.so.1 library to not be loaded.
