Add transcriber tool #851

Merged

nshmyrev merged 18 commits into alphacep:master from vadimdddd:add_transcriber

Apr 20, 2022

Contributor

vadimdddd commented Feb 11, 2022 •

edited

The transcriber lets you transcribe audio files. It includes an ffmpeg conversion step, so any format that ffmpeg can read can be transcribed. The result data also reports the transcription time and xRT. In addition, setup.py now makes it possible to run the aligner from any directory, not only from the folder that contains it.
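For readers unfamiliar with xRT: it is commonly defined as processing time divided by audio duration, so a value below 1 means faster than real time. The sketch below illustrates that calculation under this common definition; the description above does not spell out the exact formula used, and the names `real_time_factor` and `timed` are hypothetical helpers, not part of the tool.

```python
import time


def real_time_factor(processing_seconds: float, audio_seconds: float) -> float:
    """Return processing time divided by audio duration (assumed xRT definition)."""
    if audio_seconds <= 0:
        raise ValueError("audio duration must be positive")
    return processing_seconds / audio_seconds


def timed(func, *args, **kwargs):
    """Call func and return (result, elapsed wall-clock seconds)."""
    start = time.perf_counter()
    result = func(*args, **kwargs)
    return result, time.perf_counter() - start
```

For example, transcribing a 10 second recording in 5 seconds gives `real_time_factor(5.0, 10.0)`, which is 0.5.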
How to use:

Transcribe a single file: python3 vosk_transcriber.py moon.wav
By default the result is printed to the terminal. To write it to a txt or srt file instead, pass an output path:
python3 vosk_transcriber.py moon.wav -output moon.txt
and select the format with -otype srt (the default -otype is txt).
Transcribe a folder of files:
python3 vosk_transcriber.py ~/file_folder -output ~/results
This produces an output folder containing the transcribed files in txt or srt format.
To see the available models, use -list_models.
The default model is vosk-model-small-en-us-0.15 (https://alphacephei.com/vosk/models/vosk-model-small-en-us-0.15.zip).
To choose another model, there are two ways:
a) -lang 'language' (the available languages are the 'lang' values listed in https://alphacephei.com/vosk/models/model-list.json). This loads the smallest model for the given language.
b) -model_name 'name', where the name is taken from the JSON file above. Example: -model_name vosk-model-small-tr-0.3
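The "smallest model for the given language" rule in option a) can be sketched as follows. This is only an illustration, not the code from this pull request: it assumes each entry of model-list.json is an object with `lang`, `name`, `size` and `obsolete` keys, and the sample sizes and the `hypothetical-*` names are invented for the example.

```python
from typing import Optional

# Invented sample entries; the real list would be loaded from model-list.json.
SAMPLE_MODELS = [
    {"lang": "tr", "name": "vosk-model-small-tr-0.3", "size": 100, "obsolete": "false"},
    {"lang": "tr", "name": "hypothetical-large-tr", "size": 900, "obsolete": "false"},
    {"lang": "tr", "name": "hypothetical-old-tr", "size": 50, "obsolete": "true"},
    {"lang": "en-us", "name": "vosk-model-small-en-us-0.15", "size": 120, "obsolete": "false"},
]


def is_obsolete(model: dict) -> bool:
    """Treat both the boolean True and the string 'true' as obsolete (assumed field)."""
    return str(model.get("obsolete", "false")).lower() == "true"


def smallest_model_for_lang(models: list, lang: str) -> Optional[str]:
    """Return the name of the smallest non-obsolete model for lang, or None."""
    candidates = [m for m in models if m.get("lang") == lang and not is_obsolete(m)]
    if not candidates:
        return None
    return min(candidates, key=lambda m: m["size"])["name"]
```

With the sample data, `smallest_model_for_lang(SAMPLE_MODELS, "tr")` skips the obsolete entry and picks the small Turkish model; an unknown language yields `None`, which a caller could turn into an error message.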

vadimdddd added 2 commits

February 11, 2022 21:27


          add_transcriber

f5653cc


          directory processing was added: method process_dirs in transcriber.py and isdir condition with multiple processing in vosk_transcriber.py

35219ac

ls-milkyway commented Feb 22, 2022

Can the command python3 vosk_transcriber.py . -i cats.wav -o -o cats.txt generate a subtitle file instead of a text file (i.e. cats.srt rather than cats.txt)? If not, please convert the pull request to a draft and include SRT file generation code. Thanks.

Collaborator

nshmyrev commented Feb 22, 2022

@ls-milkyway yes, an srt option is going to be there.
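As a rough illustration of what SRT generation involves, the sketch below turns a list of timed words into SRT text. It is not the implementation from this pull request: it assumes each word is a dict with `word`, `start` and `end` keys (times in seconds), and the function names and the `words_per_line` parameter are hypothetical.

```python
def format_timestamp(seconds: float) -> str:
    """Format seconds as an SRT timestamp, HH:MM:SS,mmm."""
    millis = int(round(seconds * 1000))
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def words_to_srt(words: list, words_per_line: int = 7) -> str:
    """Group timed words into numbered SRT cues of at most words_per_line words."""
    blocks = []
    starts = range(0, len(words), words_per_line)
    for index, i in enumerate(starts, start=1):
        chunk = words[i:i + words_per_line]
        start = format_timestamp(chunk[0]["start"])
        end = format_timestamp(chunk[-1]["end"])
        text = " ".join(w["word"] for w in chunk)
        blocks.append(f"{index}\n{start} --> {end}\n{text}\n")
    # Cues are separated by a blank line.
    return "\n".join(blocks)
```

Each cue spans from the start of its first word to the end of its last word; a fixed word count per cue is just the simplest grouping choice for the sketch.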

nshmyrev changed the title ~~add_transcriber~~ Add transcriber tool

nshmyrev mentioned this pull request

Support for model downloader UI #871

Open

nshmyrev reviewed

python/transcriber/vosk_transcriber.py: 4 review threads (outdated)

python/transcriber/transcriber.py: 1 review thread (outdated)


          added new example moon.wav instead of cats.wav; added srt transcribe

354afa3

Contributor Author

vadimdddd commented Mar 9, 2022

@ls-milkyway, hello, please test the program. I need feedback :)

vadimdddd added 3 commits

March 14, 2022 17:14


          added models_list arg; added auto models check(download and launch)

da7d9f9


          added method get_model for 2 optional args model_name and lang

32b9832


          changed os to Path methods; rewrote list loops

af6aa62

nshmyrev requested changes

python/transcriber/vosk_transcriber.py: 4 review threads (outdated)

python/transcriber/transcriber.py: 6 review threads (5 outdated)


          corrected remarks: naming, changed scope of variables, declared some variables as constants, moved the calculate (old) / get_result (new) method inside main, changed the type of args passed between scripts and methods, switched from print to log for script info output (execution time and xRT)

1f05e70

nshmyrev reviewed

python/transcriber/transcriber.py: 6 review threads (5 outdated)

python/transcriber/vosk_transcriber.py: 3 review threads (2 outdated)


          changed names of some variables and methods; vosk_transcriber.py was added into bin script with setup.py machinery; some algorithm mistakes were fixed

522a7e4

nshmyrev requested changes

python/transcriber/transcriber.py: 5 review threads (outdated)

python/transcriber/vosk_transcriber.py: 2 review threads (outdated)

vadimdddd added 3 commits

April 11, 2022 23:13


          fixed mistakes; changed get_model method algorithm

91e8a27


          fixed range(len()) in some loops

fbd4902


          fixed range(len()) in rest loops; reworked part of the program with processing args.lang and args.model_name

f27f834

nshmyrev reviewed

python/setup.py: 1 review thread (outdated)

python/transcriber/vosk_transcriber.py: 3 review threads (outdated)

python/transcriber/transcriber.py: 2 review threads (outdated)

nshmyrev reviewed

python/transcriber/vosk_transcriber.py: 1 review thread

nshmyrev reviewed

python/transcriber/transcriber.py: 1 review thread

nshmyrev reviewed

python/transcriber/transcriber.py: 1 review thread (outdated)

vadimdddd added 3 commits

April 18, 2022 16:13


          renamed some values and fixed name mistakes; simplified get_file_list method; reworked input args types for files or folders simultaneously

c0a5a00


          delete moon.wav

19fb408


          OOP correction; changed log msgs; added log msg if I/O paths do not exist

3f8ec0a

nshmyrev requested changes

python/transcriber/vosk_transcriber.py: 3 review threads (outdated)

python/transcriber/transcriber.py: 7 review threads (outdated)

nshmyrev reviewed

python/transcriber/vosk_transcriber.py: 1 review thread


          avoided embedded methods; renamed values and methods; changed get_file_list and get_list_languages methods to use set; changed log msgs; moved download_model method inside get_model method; added error msgs for non-existent args

e24c10d


          fix mistake

c1f8995

nshmyrev requested changes

python/transcriber/transcriber.py: 5 review threads (4 outdated)

python/transcriber/vosk_transcriber.py: 5 review threads (outdated)

vadimdddd added 2 commits

April 20, 2022 10:09


          var names changed; method names changed; corrected grammatical errors

ee3a850


          refactoring log msg; renamed methods

143f24c

nshmyrev merged commit 9d94746 into alphacep:master

vadimdddd deleted the add_transcriber branch

July 6, 2022 13:28
