Releases: KoljaB/RealtimeTTS
v0.4.10
- new stream2sentence version 0.2.7
  - bugfix for #5 (a whitespace between words was sometimes lost)
  - upgrade to the latest NLTK and Stanza versions, including the new "punkt_tab" model
  - allow offline environments for Stanza
  - adds support for async streams (preparation for async in RealtimeTTS)
- dependency upgrades to latest version (coqui tts 0.24.2 ➡️ 0.24.3, elevenlabs 1.11.0 ➡️ 1.12.1, openai 1.52.2 ➡️ 1.54.3)
- added load_balancing parameter to coqui engine
  - on a fast machine with a realtime factor well below 1, inference runs much faster than it needs to
  - this parameter lets you infer with a realtime factor closer to 1, so you still get streaming voice inference, but your GPU load drops to the minimum needed to produce chunks in realtime
  - LLM inference running in parallel will be faster now because TTS takes less load
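The idea behind load_balancing can be illustrated with a small sketch that does not depend on the library. It is not the RealtimeTTS implementation; the function name `throttle_delay`, its arguments, and the target value of 0.9 are hypothetical. Given how long a chunk took to synthesize and how much audio it contains, it computes how long a producer could idle so that the effective realtime factor approaches a target just under 1.

```python
def throttle_delay(synthesis_seconds: float, audio_seconds: float,
                   target_rt_factor: float = 0.9) -> float:
    """Return how long to pause after synthesizing one chunk.

    The realtime factor is synthesis time divided by audio duration.
    A value below 1 means audio is produced faster than it is played.
    All names and the default target are illustrative assumptions.
    """
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    # Time budget for this chunk at the target realtime factor.
    budget = audio_seconds * target_rt_factor
    # Idle for whatever is left of the budget; never return a negative delay.
    return max(0.0, budget - synthesis_seconds)


if __name__ == "__main__":
    # A fast GPU: 0.2 s to synthesize 1.0 s of audio (realtime factor 0.2).
    print(throttle_delay(0.2, 1.0))
    # A machine already slower than the target gets no extra pause.
    print(throttle_delay(1.1, 1.0))
```

Keeping the target slightly below 1 leaves a small safety margin, so playback does not run out of audio while the GPU is idling.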
v0.4.9
v0.4.8
- added ParlerEngine. It needs flash attention, and even then barely runs fast enough for realtime inference on a 4090.
Parler Installation for Windows (after installing RealtimeTTS):
pip install git+https://github.com/huggingface/parler-tts.git
pip install torch==2.3.1+cu121 torchaudio==2.3.1 --index-url https://download.pytorch.org/whl/cu121
pip install https://github.com/oobabooga/flash-attention/releases/download/v2.6.3/flash_attn-2.6.3+cu122torch2.3.1cxx11abiFALSE-cp310-cp310-win_amd64.whl
pip install "numpy<2"
v0.4.7
v0.4.6
v0.4.5
- upgrade to use the latest libraries of all TTS provider engines
- added some parameters to play method:
context_size_look_overhead (int)
- Default: 12
- Description: Additional context size for looking ahead when detecting sentence boundaries.
fast_sentence_fragment_allsentences (bool)
- Default: False
- Description: When set to True, applies the fast sentence fragment processing to all sentences, not just the first one.
fast_sentence_fragment_allsentences_multiple (bool)
- Default: False
- Description: When set to True, allows yielding multiple sentence fragments instead of just a single one.
v0.4.4
Added:
tokenize_sentences (callable)
- Default: None
- Description: A custom function that tokenizes sentences from the input text. You can provide your own lightweight tokenizer if you are unhappy with nltk and stanza. It should take text as a string and return split sentences as a list of strings.
and
before_sentence_synthesized (callable)
- Default: None
- Description: Callback function that gets called before a single sentence fragment gets synthesized.
Merged PR #109 (async server to serve multiple requests in parallel, thx to Raj Hammeer Singh Hada for providing this)
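A custom tokenizer of the kind tokenize_sentences expects (text in as a string, sentences out as a list of strings) could look like the sketch below. The function name `simple_tokenize_sentences` and the regular expression are illustrative assumptions, not part of RealtimeTTS. The splitting rule is deliberately naive and will mis-split abbreviations such as "e.g.".

```python
import re
from typing import List

# Split on whitespace that directly follows ., ! or ?
_SENTENCE_END = re.compile(r"(?<=[.!?])\s+")


def simple_tokenize_sentences(text: str) -> List[str]:
    """Split text into sentences with a plain regular expression.

    Hypothetical lightweight replacement for nltk or stanza:
    takes a string, returns a list of sentence strings.
    """
    parts = _SENTENCE_END.split(text.strip())
    # Drop empty pieces, e.g. when the input is an empty string.
    return [part for part in parts if part]


if __name__ == "__main__":
    print(simple_tokenize_sentences("Hello there. How are you? Fine!"))
```

The function would then be handed over as the tokenize_sentences argument, presumably on the play method alongside the other parameters listed in these notes; that call site is an assumption and is not shown here.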
v0.4.2
- support for customized installations
- raised dependent libraries to their latest versions
- fixed #96: expanded python_requires to include Python 3.12 (now '>=3.9, <3.13')
❗️ Please use pip install realtimetts[all] instead of pip install realtimetts now.
To install RealtimeTTS with support for all TTS engines:
pip install -U realtimetts[all]
Custom Installation
RealtimeTTS allows for custom installation with minimal library installations. Here are the options available:
- all: Full installation with every engine supported.
- system: Includes system-specific TTS capabilities (e.g., pyttsx3).
- azure: Adds Azure Cognitive Services Speech support.
- elevenlabs: Includes integration with ElevenLabs API.
- openai: For OpenAI voice services.
- gtts: Google Text-to-Speech support.
- coqui: Installs the Coqui TTS engine.
- minimal: Installs only the base requirements with no engine (only needed if you want to develop your own engine).
Say you want to install RealtimeTTS only for local neural Coqui TTS usage, then you should use:
pip install realtimetts[coqui]
If, for example, you want to install RealtimeTTS with only Azure Cognitive Services Speech, ElevenLabs, and OpenAI support:
pip install realtimetts[azure,elevenlabs,openai]
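For scripted setups, the extras list can be assembled programmatically. The helper below is a hypothetical sketch, not part of RealtimeTTS: `pip_install_command` and `KNOWN_EXTRAS` are made-up names, and the set of extras is copied from the list above. It only builds the command string and does not run pip.

```python
from typing import Iterable

# Extras named in the release notes above.
KNOWN_EXTRAS = {"all", "system", "azure", "elevenlabs",
                "openai", "gtts", "coqui", "minimal"}


def pip_install_command(extras: Iterable[str]) -> str:
    """Build a pip command for the given RealtimeTTS extras."""
    # Drop duplicates while keeping the caller's order.
    selected = list(dict.fromkeys(extras))
    if not selected:
        raise ValueError("at least one extra is required")
    unknown = [extra for extra in selected if extra not in KNOWN_EXTRAS]
    if unknown:
        raise ValueError("unknown extras: " + ", ".join(unknown))
    joined = ",".join(selected)
    # Quote the requirement so shells such as zsh do not expand the brackets.
    return f'pip install -U "realtimetts[{joined}]"'


if __name__ == "__main__":
    print(pip_install_command(["azure", "elevenlabs", "openai"]))
```

Rejecting unknown names early catches typos before pip is invoked.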
Virtual Environment Installation
For those who want to perform a full installation within a virtual environment, follow these steps:
python -m venv env_realtimetts
env_realtimetts\Scripts\activate.bat
python.exe -m pip install --upgrade pip
pip install -U realtimetts[all]
v0.4.1
- added emotions to azure engine
- added speed parameter to GTTSEngine (usage see gtts_test.py)
- switched CoquiEngine to the fork of Coqui TTS maintained by the Idiap Research Institute (thanks!)
- bugfix for ElevenlabsEngine get_voices method