A collection of scripts to help me manage my media library and maintain the file structure required by Jellyfin (and possibly also Emby/Plex).
In addition to the modules in requirements.txt, these should be installed and added to your PATH:
- ffmpeg
- flac
For ripping image-based subtitles with pgsrip (optional):
- mkvtoolnix-gui
- tesseract
- tesseract-data for the languages you want to use

Note that pgsrip is not yet integrated into these scripts, but it can be run directly in the venv terminal:
pgsrip -all -f -l en -l fr ~/medias/
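As a quick sanity check for the PATH requirement above, the following sketch reports which external tools cannot be found. It is not part of this repository: the function name `missing_tools` and the two tuples are hypothetical, and it assumes that MKVToolNix exposes an `mkvmerge` executable and Tesseract a `tesseract` executable.

```python
import shutil

# Executable names are assumptions; adjust them to match your installation.
REQUIRED_TOOLS = ("ffmpeg", "flac")
OPTIONAL_TOOLS = ("mkvmerge", "tesseract")  # only needed for pgsrip


def missing_tools(names, which=shutil.which):
    """Return the names that cannot be resolved to an executable on PATH."""
    return [name for name in names if which(name) is None]


def report():
    """Print a short summary of missing required and optional tools."""
    for label, names in (("required", REQUIRED_TOOLS), ("optional", OPTIONAL_TOOLS)):
        missing = missing_tools(names)
        status = "all found" if not missing else "missing: " + ", ".join(missing)
        print(f"{label} tools: {status}")
```

Calling `report()` in the venv terminal prints one line per group, which makes it easy to spot a tool that is installed but not on PATH.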
Some scripts use spaCy for natural language processing. Models must be downloaded separately for each language (see https://spacy.io/models):
python -m spacy download en_core_web_trf # English
python -m spacy download fr_dep_news_trf # French
Note that transformer (trf) models are highly recommended to get acceptable results. These models require PyTorch, which can take a while to become available for the latest version of Python. If installation fails, fall back to an earlier version of Python.
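To check whether the models are already installed before running a script, something like the sketch below may help. It relies on the fact that `spacy download` installs each model as a regular Python package, so it can be looked up by name without loading spaCy or PyTorch. The `MODELS` mapping and the function name `missing_models` are illustrative and not part of this repository.

```python
import importlib.util

# Language code -> spaCy model package, taken from the download commands above.
MODELS = {"en": "en_core_web_trf", "fr": "fr_dep_news_trf"}


def missing_models(models=MODELS):
    """Return the subset of models whose package cannot be found."""
    return {
        lang: name
        for lang, name in models.items()
        if importlib.util.find_spec(name) is None
    }


def print_download_commands(models=MODELS):
    """Print the download command for every model that is not installed."""
    for lang, name in missing_models(models).items():
        print(f"python -m spacy download {name}  # {lang}")
```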
The scripts are sorted into folders depending on the type of files they act on:
- audio
  - reencode_flac: Re-encodes FLAC files in place if they were encoded with an older version of FLAC.
- fs (file system)
  - FileList: Class for easily getting files of certain types from the file system and performing tasks on them.
  - FileSet: Class for quick, reusable, in-memory name-based file search, and folder comparison using set operations.
  - find_empty_dirs: Finds directories that are empty and optionally removes them. Can also ignore small or hidden files.
- video
  - clean_subs: Removes common advertisement strings as well as formatting tags from subtitle files. Also attempts to detect ALL-CAPS subtitles and to apply grammatically correct capitalization.
  - extract_subs: Extracts text-based subtitles from video files and saves them as .srt.
  - match_subs: Renames and relocates subtitles to match their associated video file and comply with tagging standards. Also finds dangling subtitle files.
- shared: functions meant to be used by other scripts and not by the end user
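To illustrate the idea behind find_empty_dirs, here is a minimal standalone sketch of detecting empty directories while optionally ignoring hidden or small files. It is not the implementation in this repository: the function name, its parameters, and the rule that "hidden" means a dot-prefixed file name are all assumptions made for the example, and it only reports directories without removing anything.

```python
import os
from pathlib import Path


def find_empty_dirs_sketch(root, ignore_hidden=False, ignore_smaller_than=0):
    """Return directories below root that contain no meaningful files.

    ignore_hidden: treat dot-prefixed files as if they were absent.
    ignore_smaller_than: treat files smaller than this many bytes as absent.
    """
    root = Path(root)
    empty = set()
    # Walk bottom-up so every subdirectory is classified before its parent.
    for current, dirnames, filenames in os.walk(root, topdown=False):
        current = Path(current)

        def counts(name):
            """True if the file should keep its directory from being empty."""
            if ignore_hidden and name.startswith("."):
                return False
            return (current / name).lstat().st_size >= ignore_smaller_than

        has_files = any(counts(name) for name in filenames)
        has_content_dirs = any((current / d) not in empty for d in dirnames)
        if not has_files and not has_content_dirs:
            empty.add(current)
    # The root itself is never reported, only directories below it.
    empty.discard(root)
    return sorted(empty)
```

A directory that contains only empty subdirectories is itself reported as empty, which is why the walk goes from the leaves upward.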
The modules can be imported as follows:
from video.extract_subs import extract_subs
On Windows, add the path to media-library-helper to the PYTHONPATH environment variable. You can then run a script as a module, for example:
python -m audio.reencode_flac arguments
Set the following environment variable to see output in real time:
PYTHONUNBUFFERED=1
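If you prefer not to change the environment globally, both variables can be set for a single run from Python. The sketch below is only an illustration: `build_env` and `run_module` are hypothetical helpers, not part of this repository, and the repository path is whatever location you cloned media-library-helper to.

```python
import os
import subprocess
import sys


def build_env(repo_path, base_env=None):
    """Copy the environment, prepend repo_path to PYTHONPATH, disable buffering."""
    env = dict(os.environ if base_env is None else base_env)
    existing = env.get("PYTHONPATH")
    env["PYTHONPATH"] = repo_path if not existing else repo_path + os.pathsep + existing
    env["PYTHONUNBUFFERED"] = "1"
    return env


def run_module(module, args=(), repo_path="."):
    """Run `python -m module args...` and echo its output line by line."""
    cmd = [sys.executable, "-m", module, *args]
    with subprocess.Popen(
        cmd,
        env=build_env(repo_path),
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        text=True,
    ) as proc:
        for line in proc.stdout:
            print(line, end="")
    # Leaving the with-block waits for the process, so returncode is set.
    return proc.returncode


# Example call (placeholder arguments and path):
# run_module("audio.reencode_flac", ["arguments"], repo_path="path/to/media-library-helper")
```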