
Why does it take so long to process a 1 minute video? #110

Closed
Root-FTW opened this issue Aug 16, 2023 · 3 comments

@Root-FTW

How can I make it run faster? It takes too long for a 1-minute video.

(screenshot attached)

I am using this CLI command:
whisper_timestamped --accurate video.mp4 --model large-v1 --output_format srt --vad False --device "cuda:0" --output_dir .

My PC:

Windows 11 PRO
GPU: GTX 1080 TI
CPU: i9-9900k
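
For what it's worth, most of the runtime here likely comes from the --accurate shortcut, which (per the project README) turns on beam search and best-of sampling. A minimal sketch of the same job through the Python API with plain greedy decoding, assuming the load_audio/load_model/transcribe interface documented in the whisper_timestamped README and that the beam_size/best_of options can be passed through:

    import whisper_timestamped as whisper

    # Decode the audio track (ffmpeg handles the mp4 container) and load the model on the GPU.
    audio = whisper.load_audio("video.mp4")
    model = whisper.load_model("large-v1", device="cuda:0")

    # Greedy decoding: no beam search and no best-of sampling, which is the
    # main speed lever compared to the --accurate shortcut.
    result = whisper.transcribe(model, audio, beam_size=None, best_of=None)

    print(result["text"])

Switching to a smaller checkpoint (medium or small) cuts the runtime further if the accuracy is still acceptable.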

@Jeronymous
Member

How long does it take? Dozens of minutes?

That is weird. Is there a chance you can share the video?
Is the transcription (almost) correct?

@IntendedConsequence

Something doesn't feel right. When I run with --model large-v3, VRAM shoots up to 10GB and everything is very slow. And yet, running the large-v3 model in whisper.cpp, or using the vanilla transformers pipeline (as in this notebook: https://huggingface.co/spaces/hf-audio/whisper-large-v3/blob/main/whisper_notebook.ipynb), both only use about 4GB of VRAM and are noticeably faster.
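
For reference, a minimal sketch of that transformers pipeline path, roughly what the linked notebook does, assuming a CUDA device and float16 weights:

    import torch
    from transformers import pipeline

    # Whisper large-v3 through the vanilla transformers ASR pipeline,
    # loaded in float16 on the GPU.
    pipe = pipeline(
        "automatic-speech-recognition",
        model="openai/whisper-large-v3",
        torch_dtype=torch.float16,
        device="cuda:0",
    )

    # Long-form audio is processed in 30-second chunks; timestamps are
    # returned per chunk.
    result = pipe("video.mp4", chunk_length_s=30, return_timestamps=True)
    print(result["text"])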

@Jeronymous
Member

Thanks for spotting that, @IntendedConsequence.

Unfortunately, the high VRAM consumption comes from the openai-whisper package itself 😬
In openai/whisper#1670 it is reported starting from version 20230918, but I tried earlier versions (back to 20230124) and saw the same thing: openai-whisper always hits ~10GB of VRAM for the large models.
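
A quick way to confirm that the peak usage comes from openai-whisper itself, without the whisper_timestamped wrapper, is to measure it directly; a minimal sketch, assuming a CUDA device and an openai-whisper version that ships the large-v3 checkpoint:

    import torch
    import whisper  # plain openai-whisper, no whisper_timestamped wrapper

    torch.cuda.reset_peak_memory_stats()

    model = whisper.load_model("large-v3", device="cuda")
    result = model.transcribe("video.mp4")

    # Peak allocated VRAM in GB; the reports above put this around ~10 GB
    # for the large models.
    print(torch.cuda.max_memory_allocated() / 1e9)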
