GitHub - DhivehiAI/DV-Subs: [demo] Simple automated subtitle generator for Dhivehi

DV Subs - Simple video subtitle generator for Dhivehi

This is a simple demonstration of a use case for an ASR toolchain, such as the Hugging Face wav2vec2 model mentioned on https://dhivehi.ai/docs/technologies/stt/

The tutorial is inspired by an article published on towardsdatascience.com, and this demo borrows a lot of code from it. For more in-depth reading about the process, please refer to that article.

Additionally, for a manual walk-through, a tutorial notebook is included.

The process follows a few basic steps:

Extract audio from the video
Download the pretrained STT model and set up the inference pipeline
Run STT on the audio to transcribe it
Generate a .srt file containing subtitles with timestamps
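
The last step can be illustrated with a minimal sketch of SRT generation. It assumes the transcription step yields (start_seconds, end_seconds, text) tuples; the function names srt_timestamp and build_srt are hypothetical and are not taken from dv_subs.py or utils.py.

```python
def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = int(round(seconds * 1000))
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def build_srt(segments) -> str:
    """Build SRT text from (start_seconds, end_seconds, text) tuples.

    Hypothetical helper: each cue is a 1-based index, a time range line,
    and the transcribed text, with a blank line between cues.
    """
    blocks = []
    for index, (start, end, text) in enumerate(segments, start=1):
        blocks.append(
            f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}\n"
        )
    return "\n".join(blocks)
```

When writing the result to disk, open the file with encoding="utf-8" so that Thaana script in the Dhivehi transcript is preserved.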

Setup

Clone the repo
Install requirements: pip install -r requirements.txt

Usage

The transcriber script requires an audio file as input.
You can use the provided audio_extract.py to extract audio from an input video file, and then run further pre-processing on the extracted audio, for example by passing it through RNNoise or Spleeter.
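
For readers who want to see what the extraction step involves, here is a rough sketch that shells out to ffmpeg. It is not the code in audio_extract.py, which may work differently. It assumes ffmpeg is installed and on the PATH, and it assumes 16 kHz mono output because wav2vec2 checkpoints commonly expect that input. The function names are hypothetical.

```python
import subprocess


def ffmpeg_extract_command(video_path, audio_path, sample_rate=16000):
    """Return an ffmpeg argument list that writes mono 16-bit PCM WAV audio."""
    return [
        "ffmpeg",
        "-y",                      # overwrite the output file if it exists
        "-i", video_path,          # input video file
        "-vn",                     # drop the video stream
        "-ac", "1",                # downmix to a single (mono) channel
        "-ar", str(sample_rate),   # resample; 16 kHz is an assumption
        "-acodec", "pcm_s16le",    # 16-bit little-endian PCM
        audio_path,
    ]


def extract_audio(video_path, audio_path, sample_rate=16000):
    """Run ffmpeg; raises subprocess.CalledProcessError if ffmpeg fails."""
    command = ffmpeg_extract_command(video_path, audio_path, sample_rate)
    subprocess.run(command, check=True)
```

Keeping the command construction separate from the subprocess call makes it easy to print and inspect the command before running it.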

Afterwards, run dv_subs.py with the arguments listed in the usage text below.

The script uses pyAudioAnalysis to segment audio into more manageable lengths by running a silence detection routine. For better results, you might want to experiment with the silence_window and silence_weight options until the segmented audio looks good.
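
To build some intuition for what a smoothing window and a weight do during silence detection, here is a simplified, standard-library-only sketch. It swaps in a plain energy threshold and is not pyAudioAnalysis's routine, which is more involved. The function name voiced_segments and the frame_sec parameter are hypothetical. The smooth_window and weight parameters are only assumed to play a role similar to the script's silence_window and silence_weight options.

```python
def voiced_segments(samples, sample_rate, frame_sec=0.05,
                    smooth_window=0.5, weight=0.5):
    """Return (start_sec, end_sec) spans whose smoothed energy exceeds a threshold.

    Simplified illustration only: a fixed energy threshold, not the
    method used by pyAudioAnalysis.
    """
    frame_len = max(1, round(frame_sec * sample_rate))
    frame_dur = frame_len / sample_rate

    # Mean energy of each non-overlapping frame.
    energies = []
    for i in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[i:i + frame_len]
        energies.append(sum(s * s for s in frame) / frame_len)
    if not energies:
        return []

    # Moving average over roughly `smooth_window` seconds of frames.
    half = max(0, round(smooth_window / frame_sec) // 2)
    smoothed = []
    for i in range(len(energies)):
        window = energies[max(0, i - half):i + half + 1]
        smoothed.append(sum(window) / len(window))

    # A larger weight raises the threshold, so fewer frames count as speech.
    low, high = min(smoothed), max(smoothed)
    threshold = low + weight * (high - low)

    segments, start = [], None
    for i, energy in enumerate(smoothed):
        if energy > threshold and start is None:
            start = i
        elif energy <= threshold and start is not None:
            segments.append((start * frame_dur, i * frame_dur))
            start = None
    if start is not None:
        segments.append((start * frame_dur, len(smoothed) * frame_dur))
    return segments
```

In this sketch, a longer smoothing window blurs short pauses and bursts together, while the weight decides how loud a frame must be, relative to the quietest and loudest frames, to count as speech.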

usage: dv_subs.py [-h] [--model_dir MODEL_DIR] [--temp_dir TEMP_DIR] [--silence_window SILENCE_WINDOW] [--silence_weight SILENCE_WEIGHT] input output

positional arguments:
  input                 Input audio file name
  output                Output file name

optional arguments:
  -h, --help            show this help message and exit
  --model_dir MODEL_DIR
                        STT model files directory
  --temp_dir TEMP_DIR   Temp files directory
  --silence_window SILENCE_WINDOW
                        Audio smoothing window
  --silence_weight SILENCE_WEIGHT
                        Audio silence probabilistic weight
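
Since tuning the silence options usually takes a few attempts, a small helper that assembles the command line can be handy. The sketch below uses only the flags shown in the usage text above. The helper name dv_subs_command, the file names, and the numeric values are hypothetical, and it assumes dv_subs.py is in the current working directory.

```python
import sys


def dv_subs_command(input_audio, output_srt, model_dir=None, temp_dir=None,
                    silence_window=None, silence_weight=None):
    """Build an argument list for dv_subs.py; options left as None are omitted."""
    command = [sys.executable, "dv_subs.py"]
    options = {
        "--model_dir": model_dir,
        "--temp_dir": temp_dir,
        "--silence_window": silence_window,
        "--silence_weight": silence_weight,
    }
    for flag, value in options.items():
        if value is not None:
            command += [flag, str(value)]
    # Positional arguments come last: input audio, then output file.
    command += [input_audio, output_srt]
    return command


# Example: print one command per candidate weight (values are illustrative only).
for weight in (0.3, 0.5, 0.7):
    print(" ".join(dv_subs_command("talk.wav", f"talk_{weight}.srt",
                                   silence_weight=weight)))
```

Each printed command can be run by hand, or passed to subprocess.run, and the resulting .srt files compared to see which setting segments the audio best.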
