ELSubs

The elsubs.sh script extracts subtitles from videos in the Easy Languages format.

Installation

Make sure you have the following tools installed and available on your PATH:

  • yt-dlp
  • ffmpeg
  • convert (part of ImageMagick)
  • tesseract
  • python3

Afterwards, place elsubs.sh, dedupe.py, and sentencify.py in a directory on your PATH.
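
To confirm that the tools and scripts are reachable, a check along these lines may help. This is only a sketch: the check_tools function is a hypothetical helper and not part of ELSubs.

```shell
# Hypothetical helper (not part of ELSubs): print every command that
# cannot be found on PATH and return non-zero if any is missing.
check_tools() {
    missing=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

# Check the dependencies listed above plus the three ELSubs scripts.
check_tools yt-dlp ffmpeg convert tesseract python3 \
    elsubs.sh dedupe.py sentencify.py \
    || echo "Install or relocate the items listed above before running elsubs.sh."
```

If nothing is printed, every command was found.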

Usage

To extract subtitles from an Easy Languages video, you'll first need its ID. The ID is part of the video's URL: https://www.youtube.com/watch?v=ID. Next, find the language code for the video's primary language (the language the video is teaching), such as tur for Turkish.
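
If you'd rather not copy the ID by hand, it can be pulled out of the URL with shell parameter expansion. The sketch below assumes a URL of the watch?v=ID form shown above; video_id is a hypothetical helper name, not something elsubs.sh provides.

```shell
# Hypothetical helper: print the value of the v= query parameter.
# Assumes the URL contains "?v=" or "&v="; any other input is
# printed back unchanged (apart from anything after the first "&").
video_id() {
    id=${1#*[?&]v=}   # drop everything up to and including "v="
    id=${id%%&*}      # drop any further query parameters
    printf '%s\n' "$id"
}

video_id 'https://www.youtube.com/watch?v=AdYiOoBzhHI'   # prints AdYiOoBzhHI
```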

Finally, run elsubs.sh with the ID passed as the first argument and the language code as the second. An example run is shown below:

$ elsubs.sh AdYiOoBzhHI tur
...

After the script finishes, you'll find a new directory containing two text files: {ID}.top.txt, with the text in the primary language, and {ID}.bottom.txt, with the text in English.
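
The name of the new directory isn't stated here, so one way to locate the two files is to search for them by name. This is a sketch, with find_outputs as a hypothetical helper; it searches the current directory unless a second argument names another starting point.

```shell
# Hypothetical helper: list {ID}.top.txt and {ID}.bottom.txt found
# anywhere under the given directory (default: current directory).
find_outputs() {
    find "${2:-.}" -type f \
        \( -name "$1.top.txt" -o -name "$1.bottom.txt" \) | sort
}

find_outputs AdYiOoBzhHI
```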