Radeon VII #175
Hi, that would be really interesting! Most of the ROCm components seem to have support for the Radeon VII/gfx906 in place by default, and a couple of weeks ago I went through all the common places that typically require patching. I have verified that everything should now build for these cards, but I have not been able to test functionality on a VII. That said, if you have time it would be great if you could try the build and test it. These steps should help you get started.
And once the build has progressed to the point where rocminfo and amd-smi are built, those commands are a good way to start checking the build.
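For example, a quick sanity check could look like this (a minimal sketch; the name of the environment-setup script is an assumption based on the install prefix used elsewhere in this thread):

```bash
# Load the freshly built SDK into the current shell
# (script name assumed; adjust to your install)
source /opt/rocm_sdk_612/bin/env_rocm.sh

# List detected agents and confirm the gfx906 GPU shows up
rocminfo | grep -i gfx

# Show the GPUs that amd-smi can see
amd-smi list
```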
The HIP and OpenCL compiler tests should also be doable pretty soon (no need to wait for the whole build to finish).
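If you want a standalone smoke test for the HIP compiler, something like this should work once hipcc is available (a hand-rolled sketch, not one of the repo's own test apps):

```bash
# Write a tiny HIP program that queries the device
cat > hip_check.hip <<'EOF'
#include <hip/hip_runtime.h>
#include <cstdio>
int main() {
    int count = 0;
    (void)hipGetDeviceCount(&count);
    printf("HIP devices: %d\n", count);
    if (count > 0) {
        hipDeviceProp_t prop;
        (void)hipGetDeviceProperties(&prop, 0);
        // gcnArchName should report gfx906 on a Radeon VII / MI60
        printf("Device 0: %s (%s)\n", prop.name, prop.gcnArchName);
    }
    return 0;
}
EOF
hipcc hip_check.hip -o hip_check && ./hip_check
```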
Once the build has finished, if things work well, then PyTorch should also have support for your GPU.
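A one-liner to confirm that the bundled PyTorch sees the card (on ROCm builds of PyTorch, the torch.cuda namespace is still the correct one to query):

```bash
python -c "import torch; print('GPU available:', torch.cuda.is_available()); print('Device:', torch.cuda.get_device_name(0) if torch.cuda.is_available() else 'none')"
```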
If those work, then you can also build llama_cpp, stable-diffusion-webui and vllm with the command: ./babs.sh -b binfo/extra/ai_tools.blist. All of those also have their own example apps that you can run either on the console or by starting their web server and connecting to it with a browser. (I can help more later if needed)
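For instance, a console test with llama.cpp might look like the following (a sketch; the binary name varies between llama.cpp versions, and the model path is a placeholder for your own GGUF file):

```bash
# Offload all layers to the GPU (-ngl 99) and run a short prompt
./llama-cli -m /path/to/model.gguf -ngl 99 -p "Hello, world"
```
|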
Thanks :-) I'll try that asap |
Hello @lamikr, thank you for your amazing work! I am really glad I found this repo. I have two AMD MI60 cards (gfx906). I will also compile this repo and share test results with you! I am specifically interested in vLLM batch/concurrent inference speeds. So far, I have not been able to compile vLLM with default installations of ROCm 6.2.2 and vLLM. There is also a composable_kernel-based flash attention implementation here - https://github.com/ROCm/flash-attention (v2.6.3). That FA compiles fine with default ROCm 6.2.2 on Ubuntu 22.04, but the exllamav2 backend with Llama 3 8B started generating gibberish text (exllamav2 works fine without FA2, but it is very slow without it). I hope this repo fixes the gibberish text generation problem with FA2. Thanks again! |
Quick update. I did a fresh installation of Ubuntu 24.04.1 today, which takes around 6.5 GB of SSD storage. It installs Nvidia GPU drivers by default. I assumed this repo would install AMD GPU drivers, but it does not. This should probably be mentioned in the README, with a brief description of how to install the GPU drivers. So, I installed the AMD GPU drivers as follows:
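(The exact commands were not captured in the thread; the following is a typical sequence using AMD's amdgpu-install helper package, where the version number and download URL are placeholders you would replace with the current release:)

```bash
# Download and install the amdgpu-install helper (version/URL are placeholders)
wget https://repo.radeon.com/amdgpu-install/6.2.2/ubuntu/noble/amdgpu-install_6.2.60202-1_all.deb
sudo apt install ./amdgpu-install_6.2.60202-1_all.deb

# Install only the kernel driver; the user-space ROCm stack comes from this repo's build
sudo amdgpu-install --usecase=dkms

# Allow the current user to access the GPU device nodes
sudo usermod -aG render,video $USER
```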
Also, several packages were missing on Ubuntu, which I had to install after seeing error messages from ./install_deps.sh.
Only after that was I able to run ./install_deps.sh without errors. Another piece of feedback: could you include a global progress bar in the terminal logs showing how many packages have been built and how many remain? |
OK, I want to report an error that occurred while building the source code.
Attaching the full error output, plus short info about my PC. I ran the following commands and they worked.
rocminfo correctly showed those two MI60 cards, and the hipcc and OpenCL examples worked without errors. Please let me know if I need to install CUDA libraries or, if not, how to fix the error above. Thanks! |
@lamikr, I think the error I am seeing might be related to spack/spack#45411, but I am not sure how to implement the fix here. Let me know, thanks! |
Quick update. The installation works after removing all Nvidia drivers and restarting my PC.
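(For reference, removing the Nvidia stack on Ubuntu typically looks like this; treat it as a sketch and double-check what will be purged on your own system:)

```bash
# Remove all Nvidia driver packages, then reboot so Nouveau takes over
sudo apt purge 'nvidia-*'
sudo reboot
```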
Now, Ubuntu is using X.Org Server Nouveau drivers. |
Finally, the ROCm SDK was installed on my PC after 5 hours. It takes ~90 GB of space in rocm_sdk_builder, 8.5 GB in the triton folder, ~2 GB in the lib/x86_64-linux-gnu folder (mostly LLVM) and ~20 GB in the opt/rocm_sdk_612 folder; 120 GB of files in total! Is there a way to create an installable version of my current setup (all 120 GB)? It is huge and time-consuming. For comparison, a ROCm installation from binaries takes around 30 GB.
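(One low-tech approach, assuming the built SDK is relocatable to the same prefix on a second machine, would be to archive just the install tree and unpack it elsewhere; a sketch, not an officially supported install path:)

```bash
# Package the installed SDK (~20 GB tree; the build trees are not needed at runtime)
tar -czf rocm_sdk_612.tar.gz -C /opt rocm_sdk_612

# On the target machine, unpack to the same prefix
sudo tar -xzf rocm_sdk_612.tar.gz -C /opt
```
|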
Here are the benchmark results. I think the flash attention test failed.
|
The error above is preventing llama.cpp from running any models on the GPU. Let me file a bug. |
@lamikr, I finally got around to doing the testing. Initially the build went smooth-ish after setting HIP_COMPILER=clang.
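(Presumably exported before invoking the build, along these lines; the exact invocation was not shown in the comment:)

```bash
# Force HIP to use the clang compiler, then run the full build
export HIP_COMPILER=clang
./babs.sh -b
```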
Hi, thanks for the reports. Flash attention support for gfx906 would need to be implemented in aotriton. Although I do not have a gfx906, I will start a new build for it on Ubuntu 24.04 and try to reproduce the build errors. If you have some fixes, are you able to make a pull request? |
Hey @lamikr, the build is on LinuxMint Debian Edition. If need be, I can make pull requests.
I have multiple versions of it under the src_projects directory.
I am not sure what is causing it. Maybe the install directory /opt/rocm_sdk_612 should also be removed before starting a clean build. Let's try to reset everything and then start a fresh build, roughly as sketched below.
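(A sketch of a full reset; the builddir name is assumed to be rocm_sdk_builder's default intermediate build location, and as always, double-check paths before running rm -rf:)

```bash
# Remove the installed SDK and the intermediate build trees
sudo rm -rf /opt/rocm_sdk_612
rm -rf builddir

# Then rebuild everything from the already checked-out sources
./babs.sh -b
```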
I have not yet solved the llama.cpp error with gfx906, but I am trying to add more debugging related to it in the next build. |
I can get as far as running the HIP and CL hello worlds, but cannot run the run-and-save-benchmarks script. An excerpt from the log (the error output was truncated when pasted):

```
-- MIGraphX is using hipRTC
...
  ... in its INTERFACE_INCLUDE_DIRECTORIES.  Possible reasons include: ...
CMake Error in src/py/CMakeLists.txt:
  ... in its INTERFACE_INCLUDE_DIRECTORIES.  Possible reasons include: ...
CMake Warning (dev) at /opt/rocm_sdk_612/share/rocmcmakebuildtools/cmake/ROCMTest.cmake:230 (install):
  RPATH entries for target 'test_verify' will not be escaped in the ...
-- Generating done (2.5s)
```
|
Owner of a Radeon VII card here; if I can help with testing code to run well on it, let me know.