Run with onnxruntime-gpu not working for faster_whisper #493
You should not have both installed; the caveat with …
But can silero vad run with onnxruntime-gpu? To do that, I believe I might need to change the requirements of faster_whisper so it does not install onnxruntime, right? I'm running the application in Docker with the following image: nvidia/cuda:11.7.1-cudnn8-runtime-ubuntu20.04, so CUDA and cuDNN are properly installed.
It's possible to run silero vad with it; I don't know which version of … you are using.
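For context, a minimal sketch of what loading the silero VAD model through onnxruntime-gpu could look like; the model path is an assumption (faster_whisper bundles its own copy of the model), and the provider list falls back to CPU when no GPU is available:

```python
import onnxruntime as ort

opts = ort.SessionOptions()
opts.inter_op_num_threads = 1
opts.intra_op_num_threads = 1

# "silero_vad.onnx" is a placeholder path; point it at the model shipped with
# faster_whisper or downloaded from the silero-vad repo.
session = ort.InferenceSession(
    "silero_vad.onnx",
    sess_options=opts,
    providers=["CUDAExecutionProvider", "CPUExecutionProvider"],
)
print(session.get_providers())  # shows which provider was actually selected
```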
Thanks for that, phineas! Let me ask you something else. Faster_whisper's transcribe is already taking up 99% of my GPU; if I run VAD on the GPU as well, would that be a problem, or would it take longer because of that? I read through transcribe.py and I see that SileroVAD is only used within the transcribe function and the segments are a generator, so it should not overload the GPU. Am I correct?
I implemented this code of yours from #364 (comment) and it actually increased the transcribe function time, going from 2 to 7 seconds for an audio file I'm testing. Do you know why that happened? Analyzing it further, I believe it happens because a session is created every time we call the transcribe function, so, since it is using the GPU, session creation time adds to the total.
Hmm, seems like I misread your previous comment. Silero vad should work with it. It always creates a new ONNX session, no matter whether GPU or CPU, but it takes more time to load onto the GPU, I guess (loading time > processing time). You may need a longer audio to test for an actual speedup.
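A rough way to check the "loading time > processing time" hunch is to time the session creation separately from the rest of the pipeline; the model path below is again a placeholder:

```python
import time
import onnxruntime as ort

t0 = time.perf_counter()
session = ort.InferenceSession(
    "silero_vad.onnx",  # placeholder path to the silero VAD model
    providers=["CUDAExecutionProvider", "CPUExecutionProvider"],
)
elapsed = time.perf_counter() - t0
print(f"session creation took {elapsed:.2f}s on {session.get_providers()[0]}")
```

If that number alone accounts for most of the 2 s to 7 s jump, the slowdown is session setup rather than inference.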
Yes, at first I wanted to run with the onnxruntime-gpu library while keeping silero VAD on the CPU, but since you posted the code, I tried running it on the GPU. Session creation time increases the total too much for short audios, so it's not worth it in most cases; it's better to use the CPU with more threads active. I'm trying to run this code along with pyannote's 3.0 diarization pipeline, which requires onnxruntime-gpu, so faster_whisper's requirements were causing a conflict. I'm using a Docker container in a pod with a GPU orchestrated by Kubernetes; there I'm building an image based on …
You should have shared the config info from the beginning to avoid talking past each other 😅 So the actual problem is a Jupyter kernel crash. Do you have logs?
I'll be running some tests and I'll come back here with the results. For now, I don't have any logs; I killed the pod before accessing them. Edit: I'll only be able to get back to this issue next week. When I have the results, I will post them here.
Hi, I am having the same issue: I need to run …
I created a pull request that fixes this issue: #499. You can try it by importing …
Your PR is very likely to be rejected; it only works with NVIDIA GPUs, meanwhile …
Thanks for the heads up |
I don't recommend running silero vad on GPU either, since instantiating a session takes longer than the CPU version. For shorter audios, it increases the overall time significantly: I've had 2 s on CPU versus 7 s on GPU for certain audios. Perhaps we could add an option for the user to select GPU or CPU for silero vad through the parameters class.
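Purely as a hypothetical sketch of that suggestion (faster_whisper's real VadOptions does not have a device field, and the fields shown here are abbreviated for illustration):

```python
from dataclasses import dataclass
from typing import List


@dataclass
class VadOptions:
    threshold: float = 0.5
    min_silence_duration_ms: int = 2000
    device: str = "cpu"  # hypothetical field: "cpu" or "cuda"


def vad_providers(options: VadOptions) -> List[str]:
    # Map the hypothetical device field onto onnxruntime execution providers.
    if options.device == "cuda":
        return ["CUDAExecutionProvider", "CPUExecutionProvider"]
    return ["CPUExecutionProvider"]
```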
So, for this issue, @phineas-pta, I fixed it by installing only onnxruntime-gpu; the Jupyter notebook is working properly and everything is running as it should. To do this, I cloned the faster_whisper repo, created a build that depends only on onnxruntime-gpu, and installed it; now everything runs normally. Thanks for the help.
@guilhermehge Yes, I did the same and it works! Maybe we could create a fork.
It seems that the current pyannote version (3.0.1) is not working with the current faster_whisper version. Any ideas for a solution?
It is. I am using it at the moment. How are you running it? Docker? Colab? Locally without Docker?
Locally, which is probably the problem, I guess. I wanted to update whisperX on that matter.
Did you create a virtual environment to do that? Can you explain your problem further so we can debug it?
It is a local env made with conda, on M2 silicon.
@remic33 pyannote doesn't officially support macOS; there are already many issues about that on the pyannote repo.
It worked previously; I know because I was using it and was part of those discussions. You just needed to add some packages. But maybe with onnxruntime-gpu it doesn't work anymore.
I am trying to use faster_whisper together with pyannote for speech overlap detection and speaker diarization, but with pyannote's new 3.0.0 update, onnxruntime-gpu is needed to run the diarization pipeline with the new embedding model.
Installing both onnxruntime (from faster_whisper) and onnxruntime-gpu (from pyannote) causes a conflict, and ONNX Runtime falls back to CPU only.
I tried uninstalling onnxruntime and forcing a reinstall of onnxruntime-gpu, but then faster_whisper no longer works.
Is it possible to use onnxruntime-gpu for faster_whisper?
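As a quick sanity check (a standard onnxruntime call, not something from the thread), you can list which execution providers the installed package exposes; if only the CPU provider shows up, the onnxruntime/onnxruntime-gpu mix-up or a missing CUDA/cuDNN runtime is the likely cause:

```python
import onnxruntime as ort

print(ort.__version__)
print(ort.get_available_providers())
# A working onnxruntime-gpu install typically lists CUDAExecutionProvider
# (and often TensorrtExecutionProvider) in addition to CPUExecutionProvider.
```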