WhisperCLI: Audio Transcription Tool

WhisperCLI is a command-line tool that transcribes audio and video files using OpenAI's Whisper model. It converts speech to text, supports multiple languages, and can transcribe just a selected segment of a file.

Features

Support for various audio and video file formats (MP3, MP4, WAV, etc.)
Choice of Whisper model (tiny, base, small, medium, large)
Transcription of specific time intervals
Support for multiple languages
Quiet output by default, with optional detailed logs (-v)
Save results to a file
Simple command-line interface

Requirements

Python 3.7+
PyTorch
Transformers
Pydub
NumPy

Installation

Clone the repository:

git clone https://github.com/Hole-code/WhisperCLI.git

cd WhisperCLI

Install the dependencies:

pip install -r requirements.txt

System Installation

To use WhisperCLI as a system utility, follow these steps:

Create a file named whispercli with the following content:

#!/bin/bash
python3 /full/path/to/whisper_cli.py "$@"

Replace /full/path/to/whisper_cli.py with the actual path to your script.

Make the file executable:

chmod +x whispercli

Move the file to a directory that is in the system PATH:

sudo mv whispercli /usr/local/bin/

Now you can run the utility from any directory by simply typing whispercli:

whispercli -n path/to/audio_file.mp3

Note: Moving the file into /usr/local/bin requires administrative rights, which is why the command above uses sudo.

Usage

Basic usage:

whispercli -n path/to/your/audio_file.mp3

Parameters

-n, --name: Path to the audio or video file (required)
-s, --start: Start time for transcription (format: SS, MM:SS, or HH:MM:SS)
-e, --end: End time for transcription (format: SS, MM:SS, or HH:MM:SS)
-l, --language: Expected language of the audio (e.g., 'en' for English)
-m, --model: Whisper model size (tiny, base, small, medium, large)
-o, --output: Output file name to save the transcription
-v, --verbose: Output detailed logs
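
The flags listed above map naturally onto Python's argparse module. The following is only a sketch of how such a parser might be declared; it is not taken from whisper_cli.py. The function name build_parser, the help strings, and the default model size ("base") are all assumptions made for illustration.

```python
import argparse

# Model sizes listed in this README.
MODEL_SIZES = ("tiny", "base", "small", "medium", "large")


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the documented flags."""
    parser = argparse.ArgumentParser(
        prog="whispercli",
        description="Transcribe audio and video files with Whisper.",
    )
    parser.add_argument("-n", "--name", required=True,
                        help="path to the audio or video file")
    parser.add_argument("-s", "--start",
                        help="start time (SS, MM:SS, or HH:MM:SS)")
    parser.add_argument("-e", "--end",
                        help="end time (SS, MM:SS, or HH:MM:SS)")
    parser.add_argument("-l", "--language",
                        help="expected language, e.g. 'en'")
    # The default below is an assumption; the README does not state one.
    parser.add_argument("-m", "--model", choices=MODEL_SIZES, default="base",
                        help="Whisper model size")
    parser.add_argument("-o", "--output",
                        help="file to save the transcription to")
    parser.add_argument("-v", "--verbose", action="store_true",
                        help="print detailed logs")
    return parser


# Parse an explicit argument list instead of sys.argv for demonstration.
args = build_parser().parse_args(["-n", "video.mp4", "-s", "5:30", "-e", "10:00"])
print(args.name, args.start, args.end, args.model, args.verbose)
```

Restricting --model with choices makes argparse reject an unsupported size before any model is loaded.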

Examples

Transcribe the entire file:

whispercli -n audio.mp3

Transcribe a specific segment:

whispercli -n video.mp4 -s 5:30 -e 10:00
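
The -s and -e values accept SS, MM:SS, or HH:MM:SS. As a rough illustration of how such values can be turned into seconds, here is a small sketch. The helper name parse_timestamp is hypothetical, and the actual script may parse times differently.

```python
def parse_timestamp(value: str) -> int:
    """Convert 'SS', 'MM:SS', or 'HH:MM:SS' into a number of seconds."""
    parts = value.strip().split(":")
    if not 1 <= len(parts) <= 3:
        raise ValueError(f"expected SS, MM:SS, or HH:MM:SS, got {value!r}")
    seconds = 0
    for part in parts:
        if not part.isdecimal():
            raise ValueError(f"non-numeric time field in {value!r}")
        # Each additional field shifts the running total up by a factor of 60.
        seconds = seconds * 60 + int(part)
    return seconds


start = parse_timestamp("5:30")
end = parse_timestamp("10:00")
print(start, end, end - start)
```

If the audio is cut with Pydub, note that AudioSegment slices are indexed in milliseconds, so the second counts would need to be multiplied by 1000 first.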

Specify language and model:

whispercli -n audio.wav -l en -m medium
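
Since Transformers is listed in the requirements, the -m value is presumably resolved to a checkpoint on the Hugging Face Hub. The sketch below shows one way that mapping could look. It is an assumption about the implementation, not a description of it: the helper name model_id is hypothetical, and the only external fact relied on is that OpenAI publishes checkpoints named openai/whisper-tiny through openai/whisper-large.

```python
# Model sizes listed in this README.
MODEL_SIZES = ("tiny", "base", "small", "medium", "large")


def model_id(size: str) -> str:
    """Map a model size to an assumed Hugging Face checkpoint name."""
    if size not in MODEL_SIZES:
        raise ValueError(f"unknown model size {size!r}; choose from {MODEL_SIZES}")
    return f"openai/whisper-{size}"


print(model_id("medium"))
```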

Save the result to a file:

whispercli -n audio.mp3 -o transcription.txt

Use all options with detailed logs:

whispercli -n long_video.mp4 -s 1:00:00 -e 1:30:00 -l en -m large -o result.txt -v

Notes

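Pydub relies on ffmpeg to decode compressed formats such as MP3 and MP4, so ffmpeg may need to be installed to work with them.
Larger Whisper models (medium, large) need considerably more memory and compute than the smaller ones and can be slow without a GPU.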
On the first run, the script will download the selected Whisper model, which may take some time.
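
Regarding the ffmpeg note above, a quick way to find out whether ffmpeg is reachable before starting a transcription is to look it up on the PATH. This is a minimal sketch using only the standard library; the function name ffmpeg_available is hypothetical and the script itself may not perform such a check.

```python
import shutil


def ffmpeg_available() -> bool:
    """Return True if an ffmpeg executable can be found on the PATH."""
    return shutil.which("ffmpeg") is not None


if not ffmpeg_available():
    print("ffmpeg was not found on PATH; MP3 and MP4 input may fail to load.")
```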

Contributing

We welcome contributions! If you have ideas for improvements or have found a bug, please create an issue or submit a pull request.

License

This project is licensed under the MIT License. For more details, see the LICENSE file.

Acknowledgments

OpenAI for creating the Whisper model
The developers of PyTorch, Transformers, and Pydub libraries
