does this work with amd #366

Open
s-b-repo opened this issue Mar 19, 2023 · 13 comments

Comments

@s-b-repo

python tortoise/do_tts.py --text "hi" --voice lolitest --preset fast
Traceback (most recent call last):
  File "/usr/lib/python3.10/site-packages/torch-1.13.1-py3.10-linux-x86_64.egg/torch/__init__.py", line 172, in _load_global_deps
    ctypes.CDLL(lib_path, mode=ctypes.RTLD_GLOBAL)
  File "/usr/lib/python3.10/ctypes/__init__.py", line 374, in __init__
    self._handle = _dlopen(self._name, mode)
OSError: libcublas.so.11: cannot open shared object file: No such file or directory

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/run/media/cunny/437cec73-3450-44be-9f57-95eb54003e1e/tortoise-tts-main/tortoise-tts-main/tortoise/do_tts.py", line 4, in <module>
    import torch
  File "/usr/lib/python3.10/site-packages/torch-1.13.1-py3.10-linux-x86_64.egg/torch/__init__.py", line 217, in <module>
    _load_global_deps()
  File "/usr/lib/python3.10/site-packages/torch-1.13.1-py3.10-linux-x86_64.egg/torch/__init__.py", line 178, in _load_global_deps
    _preload_cuda_deps()
  File "/usr/lib/python3.10/site-packages/torch-1.13.1-py3.10-linux-x86_64.egg/torch/__init__.py", line 158, in _preload_cuda_deps
    ctypes.CDLL(cublas_path)
  File "/usr/lib/python3.10/ctypes/__init__.py", line 374, in __init__
    self._handle = _dlopen(self._name, mode)
OSError: /usr/lib/python3.10/site-packages/nvidia_cuda_runtime_cu11-11.7.99-py3.10-linux-x86_64.egg/nvidia/cublas/lib/libcublas.so.11: cannot open shared object file: No such file or directory

@michaelnew

Been running fine for me with rocm. Try this:

pip uninstall torch torchaudio -y
pip install torch torchaudio --index-url https://download.pytorch.org/whl/rocm5.4.2

@pciazynski

@michaelnew It's great to hear it works with ROCm!

Would you mind sharing a bit more about how you installed it? Which AMD GPU exactly do you have?

I ask because when I try to run it, it does something with my GPU (I see SCLK going up in rocm-smi), but after a while I get a segmentation fault:

python tortoise/do_tts.py --text "hi" --voice random --preset fast
Fatal Python error: Segmentation fault

Current thread 0x00007fdd9dc2a300 (most recent call first):
  File "/home/piotr/repos/tortoise-tts/tortoise/api.py", line 390 in tts
  File "/home/piotr/repos/tortoise-tts/tortoise/api.py", line 331 in tts_with_preset
  File "tortoise/do_tts.py", line 37 in <module>
[1]    43982 segmentation fault (core dumped)  python tortoise/do_tts.py --text "hi" --voice random --preset fast

It would be super nice to figure out exactly what needs to be done to run it on AMD GPUs, and then we can update the README if it works well :)

@pciazynski

pciazynski commented Apr 14, 2023

Oh, I've found the solution: ROCm/ROCm#1698

It works for me on my AMD Radeon RX 6700 XT with the environment variable HSA_OVERRIDE_GFX_VERSION=10.3.0 set, for example:

HSA_OVERRIDE_GFX_VERSION=10.3.0 python tortoise/do_tts.py --text "hi" --voice random --preset fast
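
If typing the variable every time gets tedious, the same thing can be done from a small Python wrapper. This is only a sketch: the helper names `rocm_env` and `run_tts` are made up for illustration, and 10.3.0 is simply the value reported above for the RX 6700 XT, so other cards may need a different value or none at all.

```python
import os
import subprocess
import sys


def rocm_env(gfx_version="10.3.0", base=None):
    """Return a copy of the environment with HSA_OVERRIDE_GFX_VERSION set.

    A value that is already present is kept, so an export in the shell
    still takes priority over the default used here.
    """
    env = dict(os.environ if base is None else base)
    env.setdefault("HSA_OVERRIDE_GFX_VERSION", gfx_version)
    return env


def run_tts(args, gfx_version="10.3.0"):
    """Run tortoise/do_tts.py in a child process with the override applied.

    The script path is relative, so this assumes the current directory
    is the root of the tortoise-tts checkout.
    """
    cmd = [sys.executable, "tortoise/do_tts.py", *args]
    return subprocess.run(cmd, env=rocm_env(gfx_version), check=False)
```

Calling `run_tts(["--text", "hi", "--voice", "random", "--preset", "fast"])` from the repository root should then behave like the one-liner above. The variable is passed to a child process rather than set after `import torch`, because it has to be in place before the ROCm runtime starts up.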

@michaelnew

@pciazynski nice, glad you got it working. I'm using a Radeon VII and I didn't run into that issue. What I posted above was all I had to do, at least from what I recall.

As a side note though, I did have to fiddle around with requirements.txt quite a bit to get it to install cleanly into a new virtual environment, so I'll share that here if anyone needs it:

tqdm
rotary_embedding_torch
transformers==4.19
tokenizers
inflect
progressbar
einops==0.4.1
unidecode
scipy==1.10.1
librosa==0.9.1
ffmpeg
threadpoolctl
appdirs
--extra-index-url https://download.pytorch.org/whl/rocm5.4.2
torchaudio
--extra-index-url https://download.pytorch.org/whl/rocm5.4.2
torch

@s-b-repo
Author

broke my linux install

@karabaralex

Thanks a lot for this thread. I'm trying to run this on an AMD Radeon Pro 580X 8 GB (Mac Pro).
I used @michaelnew's requirements file from above and it seems to be working, but I get this warning:

torch/amp/autocast_mode.py:204: UserWarning: User provided device_type of 'cuda', but CUDA is not available. Disabling
  warnings.warn('User provided device_type of \'cuda\', but CUDA is not available. Disabling')

and I can see that the GPU is not actively working. Do you get the same warning?

@michaelnew

@karabaralex no, that warning almost certainly means you aren't running on the GPU. For me it's about a 50x speedup on the GPU vs CPU, so it should be pretty obvious too when you run inference.

You can check for cuda in the python interpreter with this:

import torch
torch.cuda.is_available() # should return True

Just make sure you're using python from your virtual environment (if you're using one) rather than system python.
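
A slightly fuller version of that check, as a rough sketch, also reports which interpreter is running so that the system-python mix-up is easy to spot. The function name `interpreter_report` is just an illustrative choice, and the virtual environment test only recognizes venv/virtualenv, not conda environments.

```python
import importlib.util
import sys


def interpreter_report():
    """Summarize which Python is running and whether it can use a GPU via torch."""
    report = {
        "executable": sys.executable,
        # True inside a venv/virtualenv; conda envs are not detected this way.
        "in_virtualenv": sys.prefix != sys.base_prefix,
        "torch_installed": importlib.util.find_spec("torch") is not None,
        "gpu_available": None,  # stays None if torch is missing or fails to load
    }
    if report["torch_installed"]:
        try:
            import torch
            report["gpu_available"] = torch.cuda.is_available()
        except (ImportError, OSError) as exc:
            # A broken install (e.g. a missing shared library) ends up here.
            report["import_error"] = str(exc)
    return report


print(interpreter_report())
```

If `executable` points at the system Python, or `gpu_available` is anything other than True, inference is most likely running on the CPU.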

@manmay-nakhashi
Collaborator

For AMD you have to look at ROCm; as far as I know, CUDA only works on NVIDIA.

@michaelnew

It is ROCm. PyTorch's ROCm builds expose the GPU through the same torch.cuda API, so it still shows up as "cuda".
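
To see which backend an installed build actually targets, the version strings can be inspected. A minimal sketch follows; `classify_backend` is a hypothetical helper written for this example, not part of PyTorch, while `torch.version.hip` and `torch.version.cuda` are the real attributes it reads.

```python
def classify_backend(hip_version, cuda_version):
    """Name the GPU backend a PyTorch build targets from its version strings."""
    if hip_version:
        return "rocm"
    if cuda_version:
        return "cuda"
    return "cpu"


try:
    import torch

    # torch.version.hip is a version string on ROCm builds and None otherwise;
    # torch.version.cuda is a version string on CUDA builds and None otherwise.
    backend = classify_backend(
        getattr(torch.version, "hip", None),
        getattr(torch.version, "cuda", None),
    )
    print("build backend:", backend)
    print("torch.cuda.is_available():", torch.cuda.is_available())
except (ImportError, OSError):
    print("torch could not be imported in this environment")
```

On a ROCm build the first line should report rocm even though the availability check still goes through `torch.cuda`.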

@thatguy4194

I cannot get this to work for the life of me. I have a 6700XT like @pciazynski, and using HSA_OVERRIDE_GFX_VERSION=10.3.0 got me through my segmentation fault bug, but I have run into a new problem. Upon running my command I am presented with:

(tortoise) [ddadude@Spire tortoise-tts]$ HSA_OVERRIDE_GFX_VERSION=10.3.0 python tortoise/do_tts.py --text "hi" --voice random --preset fast
Generating autoregressive samples..
100%|██████████| 12/12 [00:07<00:00, 1.69it/s]
Computing best candidates using CLVP
100%|██████████| 12/12 [00:02<00:00, 5.93it/s]
Transforming autoregressive outputs into audio..
MIOpen(HIP): Error [Compile] 'hiprtcCompileProgram(prog.get(), c_options.size(), c_options.data())' naive_conv.cpp: HIPRTC_ERROR_COMPILATION (6)
MIOpen(HIP): Error [BuildHip] HIPRTC status = HIPRTC_ERROR_COMPILATION (6), source file: naive_conv.cpp
MIOpen(HIP): Warning [BuildHip] /tmp/comgr-7e62c6/input/naive_conv.cpp:39:10: fatal error: 'limits' file not found
#include <limits> // std::numeric_limits
         ^~~~~~~~
1 error generated when compiling for gfx1030.
terminate called after throwing an instance of 'miopen::Exception'
  what(): /long_pathname_so_that_rpms_can_package_the_debug_info/data/driver/MLOpen/src/hipoc/hipoc_program.cpp:304: Code object build failed. Source: naive_conv.cpp
Aborted (core dumped)

Specs:
CPU: Ryzen 7 3800X
16GB RAM
GPU: AMD Radeon RX 6700XT
OS: Linux Mint 21.1

I know very little about ROCm, HIP, and AMDGPU in general, so I fear I am missing something very obvious here. Reading the errors, I think I am just missing a library, but I have no idea where to get it or how to install it, and the AMD docs have led me in circles. Any help is greatly appreciated, thank you all.

@thatguy4194

An update: the version of ROCm I was using was the wrong one. I fixed this by switching to the PyTorch and torchaudio builds for ROCm 6.1.

@claydegruchy

This is working for me but I'm only getting 6.1it/s (half the speed of a macbook laptop).

@fakerybakery
Contributor

@claydegruchy please check that it's actually running on the GPU. I ran into this issue and realized I was using a version of torch that didn't support ROCm. You can check GPU usage with the radeontop tool.
