New transcription implementation using Whisper #1823

Closed
lfcnassif opened this issue Aug 18, 2023 · 9 comments · Fixed by #2165
@lfcnassif
Member

As requested in #1335, we can offer Whisper to users and let them decide whether to pay the performance cost. I'm still not sure which would be better: Faster-Whisper or Whisper-JAX.

@lfcnassif
Member Author

I have some sad news regarding Whisper-JAX. I managed to run it on Linux, but it took a bit more than 4h to transcribe my 29h test data set using the Whisper medium model on one RTX 3090. It also used a lot of GPU memory: about 19GB to load the medium model, while standard Whisper uses about 11GB and Faster-Whisper about 5GB, both for the larger model. Faster-Whisper took about 3h to do the same job using the medium model.

So, given the much higher memory usage and somewhat slower performance of Whisper-JAX, at least on the hardware we have, Faster-Whisper seems the better option.
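
To put the two timings on a common scale, here is a minimal sketch that converts them into hours of audio transcribed per hour of wall-clock time. The inputs are the approximate figures quoted above ("a bit more than 4h", "about 3h"), so the results are rough estimates; the function and variable names are hypothetical and only for illustration.

```python
# Approximate figures from the comment above (medium model, one RTX 3090).
AUDIO_HOURS = 29.0
REPORTED_WALL_HOURS = {
    "Whisper-JAX": 4.0,      # "a bit more than 4h", so this is a lower bound
    "Faster-Whisper": 3.0,   # "about 3h"
}


def realtime_factor(audio_hours: float, wall_hours: float) -> float:
    """Return hours of audio processed per hour of wall-clock time."""
    if wall_hours <= 0:
        raise ValueError("wall_hours must be positive")
    return audio_hours / wall_hours


def summarize(audio_hours: float, runs: dict) -> dict:
    """Map each implementation name to its rounded real-time factor."""
    return {
        name: round(realtime_factor(audio_hours, hours), 2)
        for name, hours in runs.items()
    }


if __name__ == "__main__":
    for name, factor in summarize(AUDIO_HOURS, REPORTED_WALL_HOURS).items():
        print(f"{name}: ~{factor}x real time")
```

Under these rounded inputs, Faster-Whisper comes out at roughly a third more throughput than Whisper-JAX. The memory figures are not compared here because they were reported for different model sizes.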

@lfcnassif
Member Author

PS: JAX support on Windows is also experimental and CPU only.

@joasource

You probably already know, but Whisper runs very smoothly with PyTorch using CUDA 11.6. In fact, the best GUI implementation I've seen is this one: https://grisk.itch.io/whisper-gui.

I'm eagerly awaiting Whisper on IPED.

@lfcnassif
Member Author

lfcnassif commented Sep 29, 2023

We plan to integrate Whisper in version 4.2.0, to be released in a few months. If you can't wait, there is an initial draft of the code here:
#1335 (comment)

@rafael844

I tested this whisper-gui and it's surprisingly fast, but I don't think it is open source.

lfcnassif self-assigned this Apr 12, 2024
@lfcnassif
Member Author

Starting to work on this...

lfcnassif added a commit that referenced this issue Apr 13, 2024
lfcnassif added a commit that referenced this issue Apr 13, 2024
@joasource

> Starting to work on this...

Wonderful. Any release date forecast?

@lfcnassif
Member Author

> Wonderful. Any release date forecast?

Hopefully next month.

@lfcnassif
Member Author

For those interested, a snapshot with this feature will be created here in a few minutes:
https://github.com/sepinf-inc/IPED/actions/runs/9238362650
