0 devices found after installation #14

Closed
Joel-De opened this issue Feb 12, 2024 · 13 comments

Comments

Joel-De commented Feb 12, 2024

After installing Linux kernel 6.7, XRT, and this driver on a fresh install of Ubuntu 22.04, "xbutil examine" reports 0 devices found. The provided code sample also segfaults when attempting to load device(0).

This is running on a Minisforum PC with a 7940HS (UM790 Pro). Any pointers for further debugging, or solutions, would be appreciated.

System Configuration
  OS Name              : Linux
  Release              : 6.7.0-rc8+
  Version              : #1 SMP PREEMPT_DYNAMIC Sun Feb 11 17:27:55 EST 2024
  Machine              : x86_64
  CPU Cores            : 16
  Memory               : 62085 MB
  Distribution         : Ubuntu 22.04.3 LTS
  GLIBC                : 2.35
  Model                : Venus series
  BIOS vendor          : American Megatrends International, LLC.
  BIOS version         : 1.09

XRT
  Version              : 2.17.0
  Branch               : master
  Hash                 : a395e702b2e79b3ec23c9cdc3ab4ad31a0d84eab
  Hash Date            : 2024-02-12 12:04:51
  XOCL                 : 2.17.0, a395e702b2e79b3ec23c9cdc3ab4ad31a0d84eab
  XCLMGMT              : 2.17.0, a395e702b2e79b3ec23c9cdc3ab4ad31a0d84eab
  AMDXDNA              : 2.17.0_20240212, 317e0c67747cbf88e5b5a3a81ba4bdf7bf5b3fc3

Devices present
  0 devices found
Collaborator

maxzhen commented Feb 12, 2024

Please check the dmesg log to see whether the driver loads properly.
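
A minimal sketch of that check in Python, assuming the module names reported by xbutil above (amdxdna, xocl, xclmgmt) are the ones worth searching for. The helper names `filter_driver_lines` and `read_dmesg` are made up for illustration and are not part of XRT or the driver.

```python
import re
import subprocess

# Module names taken from the xbutil report in this thread (assumption:
# these are the only drivers of interest on this setup).
DRIVER_PATTERN = re.compile(r"amdxdna|xocl|xclmgmt", re.IGNORECASE)


def filter_driver_lines(dmesg_text):
    """Return the dmesg lines that mention any of the driver names."""
    return [line for line in dmesg_text.splitlines()
            if DRIVER_PATTERN.search(line)]


def read_dmesg():
    """Run dmesg and return its text output.

    On systems that restrict dmesg to root, this may return an empty
    string unless the script is run with elevated privileges.
    """
    result = subprocess.run(["dmesg"], capture_output=True, text=True)
    return result.stdout
```

Calling `filter_driver_lines(read_dmesg())` and printing each returned line gives roughly the same view as grepping the log by hand. An empty list suggests that none of these modules logged anything during boot.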

Author

Joel-De commented Feb 12, 2024

Thanks for the quick response.

I can only see the following relevant line:
[ 1.633650] xocl: module verification failed: signature and/or required key missing - tainting kernel

I presume this is due to the module being unsigned. Is the only way to fix this to recompile the kernel with the following flags?

CONFIG_MODULE_SIG=n
CONFIG_MODULE_SIG_ALL=n

Collaborator

maxzhen commented Feb 16, 2024

No, this is harmless. Anything else?

Author

Joel-De commented Feb 16, 2024

Yeah, that's what I figured as well. There are no other mentions of XOCL, XCLMGMT, or AMDXDNA. Is there a way to check whether the device is being recognized on the system at all, independent of driver installation? I'd like to rule out a hardware or motherboard firmware issue. I'm aware that on Windows the NPU/IPU would show up in Device Manager even without the driver installed, but I don't know whether the installed Linux kernel offers that level of visibility.

ROCm isn't required for this driver to work, correct?

Collaborator

maxzhen commented Feb 16, 2024

On a Linux system with an NPU device, you can run "lspci -vd 1022:1502", and you should see the device as shown below:

c3:00.1 Signal processing controller: Advanced Micro Devices, Inc. [AMD] Device 1502
    Subsystem: Advanced Micro Devices, Inc. [AMD] Device 1502
    Flags: bus master, fast devsel, latency 0, IRQ 67, IOMMU group 25
    Memory at b0a00000 (32-bit, non-prefetchable) [size=512K]
    Memory at b0ac0000 (32-bit, non-prefetchable) [size=8K]
    Memory at 7c10800000 (64-bit, prefetchable) [size=256K]
    Memory at b0a80000 (32-bit, non-prefetchable) [size=256K]
    Capabilities:
    Kernel driver in use: amdxdna
    Kernel modules: amdxdna

Could you double check?
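
For anyone scripting this check, here is a rough Python sketch that interprets `lspci -v` style output. It assumes the field labels look like the sample output in this comment ("Kernel driver in use", "Kernel modules"). The names `parse_lspci` and `query_npu` are hypothetical helpers, not part of any AMD tooling.

```python
import subprocess

NPU_ID = "1022:1502"  # vendor:device ID quoted in this thread


def parse_lspci(text):
    """Summarize `lspci -v` output for a single device.

    Returns a dict with:
      present - True if lspci printed anything at all
      driver  - value of "Kernel driver in use", or None
      modules - list from "Kernel modules", possibly empty
    """
    info = {"present": bool(text.strip()), "driver": None, "modules": []}
    for line in text.splitlines():
        key, _, value = line.strip().partition(":")
        if key == "Kernel driver in use":
            info["driver"] = value.strip()
        elif key == "Kernel modules":
            info["modules"] = [m.strip() for m in value.split(",") if m.strip()]
    return info


def query_npu():
    """Run lspci filtered to the NPU ID and parse the result."""
    result = subprocess.run(["lspci", "-v", "-d", NPU_ID],
                            capture_output=True, text=True)
    return parse_lspci(result.stdout)
```

If `query_npu()["present"]` is False, the device is not visible on the PCI bus at all, which points away from the driver. If it is True but `driver` is None, the device is visible and the driver has not bound to it.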

Author

Joel-De commented Feb 16, 2024

Interesting, I don't get any output after running that command. Would you expect this to work on an older kernel? That is, could an incorrect installation of the 6.7 kernel cause this, or is it strictly a firmware issue on the PC vendor's part? I'm assuming the kernel version wouldn't affect this, but I want to be sure before notifying the PC vendor.

@DimitriosKakouris

Not all vendors have enabled the NPU in firmware; please check with your vendor.

Collaborator

maxzhen commented Feb 16, 2024

Yes, most likely the NPU device is not enabled in your BIOS. I have seen this issue before. Please talk to your vendor.
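
If lspci is not installed, the same presence check can be sketched against sysfs. This assumes the standard Linux layout in which each entry under /sys/bus/pci/devices has `vendor` and `device` files holding hex IDs such as 0x1022. The function name `find_pci_devices` and its `sysfs_root` parameter are hypothetical, added so the sketch can be pointed at a test directory.

```python
from pathlib import Path


def find_pci_devices(vendor, device, sysfs_root="/sys/bus/pci/devices"):
    """Return the PCI addresses whose vendor and device IDs match.

    vendor and device are hex strings as they appear in sysfs,
    for example "0x1022" and "0x1502".
    """
    matches = []
    root = Path(sysfs_root)
    if not root.is_dir():
        return matches
    for dev in sorted(root.iterdir()):
        try:
            dev_vendor = (dev / "vendor").read_text().strip().lower()
            dev_device = (dev / "device").read_text().strip().lower()
        except OSError:
            # Skip entries that cannot be read instead of aborting the scan.
            continue
        if dev_vendor == vendor.lower() and dev_device == device.lower():
            matches.append(dev.name)
    return matches
```

Calling `find_pci_devices("0x1022", "0x1502")` on the affected machine would list the NPU's PCI address if the kernel can see it. An empty list is consistent with the device being disabled in firmware, though it does not prove that on its own.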

Author

Joel-De commented Feb 16, 2024

I see, thanks a lot for the help. Feel free to mark this issue as closed. If I'm able to resolve this through a BIOS setting or firmware update later on, I'll add a comment for future reference.

@SamuelBayliss

All of our Minisforum boxes have worked out of the box (since early last year). I can try to get the BIOS version for you.

Author

Joel-De commented Feb 16, 2024

That would be greatly appreciated, thanks!

Our BIOS version is 1.09, which I believe is the latest, so unless something broke in the update last November, I'm not sure what else it could be.

Author

Joel-De commented Feb 16, 2024

Quick update: it was in fact a BIOS setting, and everything works now. Perhaps the IPU was disabled by default in the latest update (ours came preloaded with 1.09)? Not entirely sure though. Regardless, thanks for your help!

Feel free to close.

maxzhen closed this as completed Feb 16, 2024

artulab commented Feb 16, 2024

For future reference, on this PC the IPU setting can be enabled in the BIOS under Advanced -> CPU Configuration.
