Klingon Transcribe is a Python library for transcribing, diarizing, and enhancing audio files. It provides a CLI, a Python API, and a FastAPI server interface.
- Storage Read Modules: Local File, HTTP URL, S3 URL
- Storage Write Modules: Local File, HTTP URL, S3 URL
- Output Formats: Plain Text, Plain Text (Timecoded), Plain Text (Timecoded + Speaker Attributed), SRT, SRT (Speaker Attributed)
- Preprocessing: Audio Enhancement, Audio Super Enhancement, Noise Removal, Noise Removal (HiFi)
- Core: Speech to Text, Diarization, Diarization (Telephony)
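As an illustration of the SRT output formats listed above, here is a minimal sketch of how a transcript segment could be rendered as an SRT cue. The `HH:MM:SS,mmm --> HH:MM:SS,mmm` timestamp line is standard SRT; the helper names `srt_timestamp` and `srt_cue` and the `SPEAKER: text` prefix for speaker attribution are assumptions made for this example, not the library's documented behavior.

```python
from typing import Optional


def srt_timestamp(seconds: float) -> str:
    """Convert a time offset in seconds to the SRT form HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def srt_cue(index: int, start: float, end: float, text: str,
            speaker: Optional[str] = None) -> str:
    """Render one SRT cue; the speaker prefix is a hypothetical convention."""
    line = f"{speaker}: {text}" if speaker else text
    return f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{line}\n"


if __name__ == "__main__":
    # Cues in an SRT file are separated by a blank line.
    print(srt_cue(1, 0.0, 2.5, "Hello there.", speaker="SPEAKER_00"))
    print(srt_cue(2, 2.5, 4.0, "Hi."))
```

Passing `speaker=None` gives the plain SRT variant, so one function can cover both formats in this sketch.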
Install the dependencies:
pip install -r requirements.txt
Command-line usage:
klingon_transcribe run --input s3://bucket/input.wav --output local://output/
klingon_transcribe run --input http://example.com/input.wav --output s3://bucket/output/ --asr-model Citrinet-1024 --diarization-model speakerdiar_telephony --preprocess audio_enhancement noise_removal
klingon_transcribe server
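When many files need processing, the `run` invocations above can be assembled programmatically. The sketch below builds the argument list using only the flags that appear in those examples (`--input`, `--output`, `--asr-model`, `--diarization-model`, `--preprocess`). The helper name `build_run_command` is hypothetical, and the sketch assumes `klingon_transcribe` is on the `PATH`.

```python
import shlex
from typing import List, Optional, Sequence


def build_run_command(
    input_uri: str,
    output_uri: str,
    asr_model: Optional[str] = None,
    diarization_model: Optional[str] = None,
    preprocess: Sequence[str] = (),
) -> List[str]:
    """Assemble an argv list for `klingon_transcribe run` (hypothetical helper)."""
    cmd = ["klingon_transcribe", "run", "--input", input_uri, "--output", output_uri]
    if asr_model:
        cmd += ["--asr-model", asr_model]
    if diarization_model:
        cmd += ["--diarization-model", diarization_model]
    if preprocess:
        # The CLI example passes preprocessing steps as space-separated
        # values after a single --preprocess flag.
        cmd += ["--preprocess", *preprocess]
    return cmd


if __name__ == "__main__":
    cmd = build_run_command(
        "http://example.com/input.wav",
        "s3://bucket/output/",
        asr_model="Citrinet-1024",
        diarization_model="speakerdiar_telephony",
        preprocess=["audio_enhancement", "noise_removal"],
    )
    # Print the shell-quoted command; to execute it, pass `cmd` to
    # subprocess.run(cmd, check=True) instead.
    print(shlex.join(cmd))
```

Building a list rather than a single string avoids shell-quoting problems when URIs contain special characters.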
Start the server:
klingon_transcribe server
Send a POST request to `http://localhost:8000/transcribe/` with the audio file.
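The request can be sketched with the Python standard library alone. This example assumes the endpoint accepts a `multipart/form-data` upload under a form field named `file`, which is a common FastAPI pattern but is not confirmed here; the helper names `build_multipart_body` and `transcribe_file` are hypothetical. The response format is not documented in this section, so the sketch returns the raw response text.

```python
import urllib.request
import uuid
from pathlib import Path
from typing import Tuple


def build_multipart_body(field_name: str, filename: str, data: bytes) -> Tuple[bytes, str]:
    """Encode one file as multipart/form-data; returns (body, Content-Type header)."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field_name}"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + data + tail, f"multipart/form-data; boundary={boundary}"


def transcribe_file(
    path: str,
    url: str = "http://localhost:8000/transcribe/",
    field_name: str = "file",  # assumed field name; check the server's API docs
) -> str:
    """POST an audio file to the server and return the raw response body."""
    audio = Path(path)
    body, content_type = build_multipart_body(field_name, audio.name, audio.read_bytes())
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": content_type}, method="POST"
    )
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")
```

With the server running, `transcribe_file("input.wav")` would upload the file. If the server expects a different field name or a JSON body, adjust `field_name` or the encoding accordingly.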
Development tasks are available as make targets:
make test
make lint
make clean
make build
make docs