Benchmarking Results #530
-
Are you using beam search or greedy search? It has an impact on performance. I have an old 1070 Ti GPU, and a 2-hour podcast takes around 60 minutes to transcribe (medium.en model). I am considering an upgrade, so I did some research on GPU performance. As far as I know, FP16 throughput is what determines transcription speed. @jongwook, please correct me if I am wrong. These are my theoretical values; the 4090 looks like a beast for machine learning. I would be happy if someone could share benchmarking values for other GPUs as well.
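For anyone who wants to check or switch between the two, the decoding strategy is just a parameter of `transcribe()` in the openai-whisper Python API. A minimal comparison sketch (the audio file name is a placeholder; `beam_size=5` is simply a common choice, not necessarily what you are running):

```python
import whisper

model = whisper.load_model("medium.en")
audio = "podcast.mp3"  # placeholder file name

# Greedy decoding: used when beam_size is not set; the fastest option.
greedy = model.transcribe(audio, fp16=True)

# Beam search with 5 hypotheses per segment: usually a bit more accurate,
# but noticeably slower because several candidates are decoded per segment.
beam = model.transcribe(audio, beam_size=5, fp16=True)

print(greedy["text"][:200])
print(beam["text"][:200])
```

`fp16=True` is the default on GPU, which is why a card's FP16 throughput matters so much here.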
-
I have an RTX 3060 12 GB and it can transcribe with the large model. You can watch my video and read the comments here: How Good is RTX 3060 for ML AI Deep Learning Tasks and Comparison With GTX 1050 Ti and i7 10700F CPU: https://www.youtube.com/watch?v=q8Q8CCDdSKo
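As a side note, if anyone wants to verify what their own card holds, PyTorch's peak-memory counter gives a quick answer (the file name below is a placeholder; the repository's README lists the large model at roughly 10 GB of VRAM):

```python
import torch
import whisper

model = whisper.load_model("large", device="cuda")
model.transcribe("sample.mp3")  # placeholder audio file
print(f"Peak VRAM used: {torch.cuda.max_memory_allocated() / 2**30:.1f} GiB")
```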
-
I am wondering approximately how many parallel audio streams an Nvidia A100, H100, or GeForce 4090 could transcribe in real time (tiny and small models).
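A rough way to estimate this yourself: measure the real-time factor (transcription time divided by audio duration) for a single stream, then take its inverse as an upper bound on the number of real-time streams. A sketch with the openai-whisper API (file name and model choice are placeholders, and the estimate ignores contention between concurrent jobs):

```python
import time
import whisper

model = whisper.load_model("small")                 # or "tiny"
audio = whisper.load_audio("test_podcast.mp3")      # placeholder input file
duration = len(audio) / whisper.audio.SAMPLE_RATE   # seconds of audio (16 kHz mono)

start = time.perf_counter()
model.transcribe(audio)
elapsed = time.perf_counter() - start

rtf = elapsed / duration
print(f"Real-time factor {rtf:.3f} -> roughly {1 / rtf:.1f} real-time streams, "
      "before accounting for overhead when running them concurrently")
```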
-
A100 (PCIe version, 250 W cap)

Inputs (format: podcasts):

- uab-warzone_mixdown.mp3 --language en --model … # 1:22:5h
- swr2wissen-20221230-tee-in-der-weltgeschichte-22-teekriege-und-die-macht-der-tee-nationen.m.mp3 --language de --model … # 0:30:58h
-
@commonism So an A100 could potentially transcribe 5 or 6 audio sources in parallel, in real time (medium model)?
-
Using the A100 80 GB cards, very likely.
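One way to confirm instead of extrapolating from single-stream speed: launch several CLI transcriptions at once and check that each still finishes faster than real time. A sketch (file names are placeholders; each process loads its own copy of the model, roughly 5 GB for medium, so six copies should fit comfortably in 80 GB):

```python
import subprocess
import time

files = [f"stream{i}.mp3" for i in range(1, 7)]  # placeholder input files

start = time.perf_counter()
procs = [
    subprocess.Popen(["whisper", f, "--model", "medium", "--language", "en"])
    for f in files
]
for p in procs:
    p.wait()
print(f"6 parallel transcriptions finished in {time.perf_counter() - start:.0f} s")
```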
-
@MetaAnomie What are the units on the chart for transcription time?
-
Any tests for the AMD Radeon PRO W7000 Series (https://www.amd.com/en/graphics/workstations) and ROCm 7?
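I haven't tried those cards, but Whisper is plain PyTorch, so in principle it runs wherever a ROCm build of PyTorch works; PyTorch's ROCm backend is exposed through the usual torch.cuda API. A quick sanity check, assuming a ROCm build of PyTorch is installed (the audio file name is a placeholder):

```python
import torch
import whisper

# On ROCm builds, AMD GPUs appear through the regular CUDA API,
# and torch.version.hip is set instead of torch.version.cuda.
print("GPU available:", torch.cuda.is_available())
print("HIP version:", torch.version.hip)

device = "cuda" if torch.cuda.is_available() else "cpu"
model = whisper.load_model("small", device=device)
print(model.transcribe("sample.mp3")["text"][:200])  # placeholder audio file
```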
-
I've been experimenting myself with a 3060 Ti (8 GB VRAM, "medium" model) for inference and an i7-13700K for preprocessing, as follows. I needed it to do better, and there must be a way to make it better. I have no idea why inference took longer after ffmpeg preprocessing, even though the preprocessing step itself was 5x faster. Edit: without preprocessing, the inference time is 627.51 s for that 1:30:00 file.
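The original preprocessing command isn't shown above, so this is only a guess at the usual step: resampling to 16 kHz mono PCM, which is the format Whisper decodes to internally anyway (its audio loader invokes ffmpeg on whatever file it receives). A timing sketch for comparing the two paths (file names and the medium model are assumptions):

```python
import subprocess
import time
import whisper

src = "podcast_90min.mp3"        # placeholder for the 1:30:00 file
pre = "podcast_90min_16k.wav"

# Typical preprocessing: resample to 16 kHz mono PCM.
t0 = time.perf_counter()
subprocess.run(["ffmpeg", "-y", "-i", src, "-ar", "16000", "-ac", "1", pre], check=True)
print(f"ffmpeg: {time.perf_counter() - t0:.1f} s")

model = whisper.load_model("medium")
for path in (src, pre):
    t0 = time.perf_counter()
    model.transcribe(path)
    print(f"{path}: {time.perf_counter() - t0:.1f} s inference")
```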
-
I've been doing some performance benchmarking recently, so I'm attaching the results here in case they are a useful reference for anyone. I'll include more in the future as I go.