RealtimeSTT on AMD Guide #107

Open
TheTrustedComputer opened this issue Aug 25, 2024 · 0 comments

Below is a guide to running RealtimeSTT on AMD GPUs. Replacing PyTorch and ONNX Runtime with their ROCm builds is the usual first step and works most of the time. For RealtimeSTT it is not enough on its own, because CTranslate2 also needs to be rebuilt for ROCm. CTranslate2 does not officially support ROCm, but someone has forked it to support these cards: https://github.com/arlo-phoenix/CTranslate2-rocm

Follow the build steps from the link above. You can optionally disable OpenMP and use another BLAS library such as OpenBLAS. Then install the ROCm build of CTranslate2 with pip and test RealtimeSTT. On my 8GB 5500 XT it ran, but it was unusable in practice: I got many out-of-memory errors, even with the tiny model and the beam size set to 1.

2024-08-25 07:44:44,201 root [ERROR] - Unhandled exeption in _realtime_worker: CUDA failed with error out of memory
RealTimeSTT: root - ERROR - Unhandled exeption in _realtime_worker: CUDA failed with error out of memory
Exception in thread Thread-5 (_realtime_worker):
Traceback (most recent call last):
  File "/home/thetrustedcontainer/.python-3.11/lib/python3.11/threading.py", line 1045, in _bootstrap_inner
    self.run()
  File "/home/thetrustedcontainer/.python-3.11/lib/python3.11/threading.py", line 982, in run
    self._target(*self._args, **self._kwargs)
  File "/home/thetrustedcontainer/software/.RealtimeSTT-venv/lib/python3.11/site-packages/RealtimeSTT/audio_recorder.py", line 1496, in _realtime_worker
    segments, info = self.realtime_model_type.transcribe(
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/thetrustedcontainer/software/.RealtimeSTT-venv/lib/python3.11/site-packages/faster_whisper/transcribe.py", line 397, in transcribe
    encoder_output = self.encode(segment)
                     ^^^^^^^^^^^^^^^^^^^^
  File "/home/thetrustedcontainer/software/.RealtimeSTT-venv/lib/python3.11/site-packages/faster_whisper/transcribe.py", line 838, in encode
    return self.model.encode(features, to_cpu=to_cpu)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: CUDA failed with error out of memory

I had to stick with the CPU version, which is slower but sufficient for my use case. Nevertheless, I hope this guide helps other users with AMD GPUs get RealtimeSTT running on their cards. As with any unsupported hardware, your mileage may vary.
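For anyone who wants to automate the "try the GPU, fall back to the CPU" decision, here is a minimal sketch. It is not part of RealtimeSTT: `load_with_fallback`, `probe_faster_whisper`, and the `CANDIDATES` list are hypothetical names I made up for illustration. It assumes the ROCm build of CTranslate2 still presents the GPU as the `"cuda"` device (the error message in the log above suggests this, but I have not verified it on every card) and that an out-of-memory failure surfaces as a `RuntimeError`, as in the traceback.

```python
from typing import Any, Callable, Optional, Sequence, Tuple

# (device, compute_type) pairs to try in order. The ordering is an assumption;
# adjust it to whatever your card and build actually support.
CANDIDATES = [("cuda", "float16"), ("cuda", "int8"), ("cpu", "int8")]


def load_with_fallback(
    probe: Callable[[str, str], Any],
    candidates: Sequence[Tuple[str, str]] = CANDIDATES,
) -> Tuple[Any, str, str]:
    """Return (model, device, compute_type) for the first candidate that works.

    `probe` must load the model and run a small workload, raising on failure.
    """
    last_error: Optional[Exception] = None
    for device, compute_type in candidates:
        try:
            return probe(device, compute_type), device, compute_type
        except (RuntimeError, ValueError) as err:
            # RuntimeError covers "CUDA failed with error out of memory";
            # ValueError covers an unsupported compute type.
            print(f"{device}/{compute_type} failed: {err}")
            last_error = err
    raise RuntimeError("no candidate device worked") from last_error


def probe_faster_whisper(device: str, compute_type: str) -> Any:
    """Hypothetical probe: load the tiny model and transcribe 1 s of silence."""
    # Imported lazily so the fallback logic above can be used without them.
    import numpy as np
    from faster_whisper import WhisperModel

    model = WhisperModel("tiny", device=device, compute_type=compute_type)
    silence = np.zeros(16000, dtype=np.float32)  # 1 second at 16 kHz
    segments, _info = model.transcribe(silence, beam_size=1, language="en")
    list(segments)  # the generator is lazy; consume it so the encoder runs
    return model
```

Calling `load_with_fallback(probe_faster_whisper)` returns the first combination that survives the probe. The resulting device and compute type can then be passed to RealtimeSTT's recorder. Note that a one-second probe succeeding does not guarantee that longer real-time sessions will fit in memory.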

Related: #7 (comment)
