-
There are plenty of free translation models on Hugging Face. Also, there is no single best setting for Whisper; if you are a beginner, the default settings work fine. To improve audio quality, you can try SileroVAD to remove silence and Demucs to extract the vocals; you can take inspiration from https://github.com/EtienneAb3d/WhisperHallu. As for timestamps, for now you should fix them manually.
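For intuition, here is a toy sketch of what "removing silence" means. This is NOT SileroVAD itself (that is a neural model, typically loaded via `torch.hub` from `snakers4/silero-vad`); it is just a naive energy-threshold version, with the frame length and threshold values chosen arbitrarily for illustration:

```python
# Toy illustration of silence removal (NOT SileroVAD, which is a neural
# VAD model). We simply drop fixed-size frames whose RMS energy is below
# a threshold; frame_len and threshold are arbitrary example values.

def rms(frame):
    """Root-mean-square energy of a list of float samples."""
    return (sum(s * s for s in frame) / len(frame)) ** 0.5

def drop_silent_frames(samples, frame_len=400, threshold=0.01):
    """Keep only frames whose RMS energy exceeds the threshold."""
    kept = []
    for i in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[i:i + frame_len]
        if rms(frame) > threshold:
            kept.extend(frame)
    return kept
```

A real VAD makes much smarter decisions at speech boundaries, which is why a trained model like SileroVAD is worth using instead of a threshold.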
-
Thank you for your answer. I did a test and it worked fine. To add an SRT output, I added:
But I'm not sure this is the right way to do it, because it didn't output an .srt file; I copied and pasted the output instead. I would also like to know what you think about the model. From what I understand, the script uses the large model. To transcribe American English, isn't it better to use a .en model? And which one? Since I'm using Google Colab, I have no RAM or GPU worries. Can you tell me where I have to make the changes: in the 30-line script or in the file transcribeHallu.py? And is it possible to translate the .srt file directly into French? How do I translate an .srt file without losing the timecodes?
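On getting a real .srt file: Whisper's `transcribe()` returns a result dict whose `"segments"` list carries `start`, `end` (in seconds) and `text`, so one common approach is to format those segments yourself. A minimal sketch (the output filename is just an example):

```python
# Minimal sketch: turn Whisper-style segments into SRT text.
# Each segment dict has "start" and "end" in seconds, plus "text".

def srt_timestamp(seconds):
    """Format seconds as the SRT timecode HH:MM:SS,mmm."""
    ms = int(round(seconds * 1000))
    h, ms = divmod(ms, 3600_000)
    m, ms = divmod(ms, 60_000)
    s, ms = divmod(ms, 1000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"

def segments_to_srt(segments):
    """Build the full SRT document from a list of segment dicts."""
    blocks = []
    for i, seg in enumerate(segments, start=1):
        blocks.append(
            f"{i}\n"
            f"{srt_timestamp(seg['start'])} --> {srt_timestamp(seg['end'])}\n"
            f"{seg['text'].strip()}\n"
        )
    return "\n".join(blocks)

# Usage, assuming `result` came from model.transcribe(...):
# with open("output.srt", "w", encoding="utf-8") as f:
#     f.write(segments_to_srt(result["segments"]))
```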
-
Hi, I'm a user and tinkerer, but not particularly experienced.
Thank you for reading this.
I started using Whisper because I was following a training course on YouTube in American English, and I'm French with a simple background. So I first tried Whisper on my PC, but it was long, very long... each video lasts an hour on average.
That's when I discovered Google Colab, with the GPU.
(Tell me if that's a good choice, or whether I should look into Jupyter Notebook or something else...)
I then found a ready-made notebook (whisper_youtube.ipynb from ArthurFDLR on GitHub). I started using it, then tried to modify it by adding yt-dlp, then ffmpeg, etc., but my skills are limited, as I said.
My project is to translate each video and then add the new subtitles in French. I guess .SRT is the right format? Any other ideas?
The second part would be to use a TTS engine to replace the soundtrack. What would be great is if I could train a voice of my choice.
I know that one of the longest parts of the project is the translation; I don't want to use the translation provided by Google or DeepL without proofreading it.
So I figured that to get the best possible translation, the video's soundtrack must be of the best quality and the Whisper settings must be as good as possible. I tried with FP16 and without, but saw no difference; when I increased "Number of beams [...] is zero", differences appeared, not in the transcription but in the timestamps.
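For reference, the settings mentioned above (FP16, beam size) are passed as keyword arguments to `model.transcribe()` in openai-whisper. The values below are illustrative assumptions, not recommendations; the point is only to show where each knob lives:

```python
# Sketch of where Whisper decoding settings go (values are illustrative,
# not tuned recommendations). In openai-whisper, model.transcribe()
# forwards keyword arguments such as fp16 and beam_size to the decoder.

options = {
    "language": "en",     # source language; skips auto-detection
    "task": "transcribe", # use "translate" for Whisper's built-in to-English mode
    "fp16": True,         # half precision; mainly a speed/memory choice on GPU
    "beam_size": 5,       # beam search width; values > 1 enable beam search
}

# import whisper                      # pip install openai-whisper
# model = whisper.load_model("large") # or e.g. "medium.en" for English-only audio
# result = model.transcribe("audio.mp3", **options)
```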
So I would like to ask where you would advise me to read to understand, in simple language, how each setting works.
If you have any work or ideas to share, I'm interested.
I have other questions about timestamps, silences, etc., which may not be the same in each language. If this project interests you, don't hesitate to contact me privately; I have a semi-private Discord where I try to store as much info as possible and work collaboratively.
I also tried using DeepL's API, but it costs me €1 per document translation, which isn't cheap. That's a lot of questions, but I'm motivated.
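One way around per-document pricing, and around losing timecodes, is to translate only the text lines of the .srt yourself and leave the index and timecode lines untouched. In the sketch below, `translate` is a hypothetical placeholder for whatever text-translation call you choose (DeepL's text endpoint, for example, is billed per character rather than per document):

```python
# Sketch: translate an .srt while preserving block numbers and timecodes.
# `translate` is a placeholder callable (str -> str); plug in any
# text-translation API of your choice.

def translate_srt(srt_text, translate):
    out_lines = []
    for line in srt_text.splitlines():
        stripped = line.strip()
        # Keep block numbers, timecode lines, and blank lines as-is.
        if not stripped or stripped.isdigit() or "-->" in stripped:
            out_lines.append(line)
        else:
            out_lines.append(translate(line))
    return "\n".join(out_lines)
```

Because the timecode lines pass through untouched, the subtitles stay synchronized regardless of what the translation does to the text.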