Releases: KoljaB/RealtimeTTS

v0.4.10

07 Nov 14:17
  • new stream2sentence version 0.2.7
    • bugfix for #5 (a space between words was sometimes lost)
    • upgrade to latest NLTK and Stanza versions including new "punkt-tab" model
    • allow offline environment for stanza
    • adds support for async streams (preparations for async in RealtimeTTS)
  • dependency upgrades to the latest versions (coqui tts 0.24.2 ➡️ 0.24.3, elevenlabs 1.11.0 ➡️ 1.12.1, openai 1.52.2 ➡️ 1.54.3)
  • added load_balancing parameter to coqui engine
    • on a fast machine with a realtime factor well below 1, inference runs much faster than it needs to
    • this parameter lets you infer with a realtime factor closer to 1: voice inference still streams, but GPU load drops to the minimum needed to produce chunks in realtime
    • LLM inference running in parallel gets faster because TTS takes less load
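
The idea behind load_balancing can be pictured with a small sketch. This is not the CoquiEngine implementation. It assumes the realtime factor is synthesis time divided by the duration of the audio produced, and the names `realtime_factor`, `throttle_delay` and `target_rtf` are hypothetical, chosen only for illustration.

```python
def realtime_factor(synth_seconds: float, audio_seconds: float) -> float:
    """Synthesis time divided by the duration of the audio produced."""
    return synth_seconds / audio_seconds


def throttle_delay(synth_seconds: float, audio_seconds: float,
                   target_rtf: float = 0.9) -> float:
    """Seconds to idle after a chunk so the effective factor approaches target_rtf.

    Hypothetical helper: a chunk that was synthesized faster than the time
    budget (target_rtf * audio_seconds) leaves the remainder as idle time.
    """
    budget = target_rtf * audio_seconds
    return max(0.0, budget - synth_seconds)
```

Under these assumptions, a 1.0 s chunk synthesized in 0.2 s has a realtime factor of 0.2, and with a target of 0.9 the sketch would idle for about 0.7 s, time in which the GPU is free for other work such as LLM inference.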

v0.4.9

01 Nov 16:17
  • added print_realtime_factor to CoquiEngine
  • removed a debug message that somehow made it to PyPI

v0.4.8

29 Oct 17:00
  • added ParlerEngine. It needs flash attention, and even then it is barely fast enough for realtime inference on a 4090.

    Parler Installation for Windows (after installing RealtimeTTS):

    pip install git+https://github.com/huggingface/parler-tts.git
    pip install torch==2.3.1+cu121 torchaudio==2.3.1 --index-url https://download.pytorch.org/whl/cu121
    pip install https://github.com/oobabooga/flash-attention/releases/download/v2.6.3/flash_attn-2.6.3+cu122torch2.3.1cxx11abiFALSE-cp310-cp310-win_amd64.whl
    pip install "numpy<2"

v0.4.7

10 Oct 09:36
9ee7aea
  • updated requirements.txt, minor Readme updates

v0.4.6

04 Oct 15:12

v0.4.5

21 Jul 15:45
  • upgraded to the latest libraries of all TTS provider engines
  • added some parameters to play method:
    context_size_look_overhead (int)
    • Default: 12
    • Description: Additional context size for looking ahead when detecting sentence boundaries.
    fast_sentence_fragment_allsentences (bool)
    • Default: False
    • Description: When set to True, applies the fast sentence fragment processing to all sentences, not just the first one.
    fast_sentence_fragment_allsentences_multiple (bool)
    • Default: False
    • Description: When set to True, allows yielding multiple sentence fragments instead of just a single one.
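
As a rough illustration of how these parameters might be passed, the sketch below collects the defaults listed above in a dictionary and forwards them to `play`. The helper `speak` and the name `PLAY_OPTIONS` are hypothetical; the sketch assumes the usual pattern of creating a `TextToAudioStream` from an engine, feeding it text, and calling `play`.

```python
# Defaults as listed in these release notes.
PLAY_OPTIONS = {
    "context_size_look_overhead": 12,
    "fast_sentence_fragment_allsentences": False,
    "fast_sentence_fragment_allsentences_multiple": False,
}


def speak(text, engine, **overrides):
    """Feed text to a stream and play it (hypothetical helper, not part of the library)."""
    # Imported lazily so the sketch can be loaded without RealtimeTTS installed.
    from RealtimeTTS import TextToAudioStream

    # Per-call overrides take precedence over the listed defaults.
    options = {**PLAY_OPTIONS, **overrides}
    stream = TextToAudioStream(engine)
    stream.feed(text)
    stream.play(**options)
```

For example, `speak("Hello world.", engine, fast_sentence_fragment_allsentences=True)` would apply the fast fragment processing to every sentence, given an already constructed engine.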

v0.4.4

18 Jul 14:49

Added:

tokenize_sentences (callable)
- Default: None
- Description: A custom function that tokenizes sentences from the input text. You can provide your own lightweight tokenizer if you are unhappy with nltk and stanza. It should take text as a string and return split sentences as a list of strings.

and

before_sentence_synthesized (callable)
- Default: None
- Description: Callback function that gets called before a single sentence fragment gets synthesized.
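
A minimal sketch of what such callables could look like is shown below. The tokenizer follows the contract described above (a string in, a list of sentence strings out) using a simple regular expression, so it will mis-split abbreviations such as "e.g. this". The callback assumes the sentence fragment is passed as its only argument, which these notes do not confirm. Both function names are hypothetical.

```python
import re

# Split on whitespace that follows a sentence-ending ., ! or ?
_SENTENCE_END = re.compile(r"(?<=[.!?])\s+")


def simple_tokenizer(text: str) -> list[str]:
    """Take text as a string and return the split sentences as a list of strings."""
    return [part.strip() for part in _SENTENCE_END.split(text) if part.strip()]


def log_sentence(sentence):
    """Hypothetical callback; assumes the fragment is passed as the only argument."""
    print(f"About to synthesize: {sentence!r}")
```

These would then be supplied as the `tokenize_sentences` and `before_sentence_synthesized` arguments; check the project documentation for where exactly they are accepted.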

Merged PR #109 (async server to serve multiple requests in parallel; thanks to Raj Hammeer Singh Hada for providing this).

v0.4.2

04 Jul 14:04
  • support for customized installations
  • upgraded dependent libraries to their latest versions
  • fixed #96: expanded python_requires to include Python 3.12 (now '>=3.9, <3.13')

❗️ Please use pip install realtimetts[all] instead of pip install realtimetts now.

To install RealtimeTTS with support for all TTS engines:

pip install -U realtimetts[all]

Custom Installation

RealtimeTTS allows for custom installation with minimal library installations. Here are the options available:

  • all: Full installation with every engine supported.
  • system: Includes system-specific TTS capabilities (e.g., pyttsx3).
  • azure: Adds Azure Cognitive Services Speech support.
  • elevenlabs: Includes integration with ElevenLabs API.
  • openai: For OpenAI voice services.
  • gtts: Google Text-to-Speech support.
  • coqui: Installs the Coqui TTS engine.
  • minimal: Installs only the base requirements with no engine (only needed if you want to develop your own engine).

If you want to install RealtimeTTS only for local neural Coqui TTS usage, use:

pip install realtimetts[coqui]

If, for example, you want to install RealtimeTTS with only Azure Cognitive Services Speech, ElevenLabs, and OpenAI support:

pip install realtimetts[azure,elevenlabs,openai]
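
If the install command is generated by a setup script, a small helper can validate the chosen extras against the list above before building it. This is only a sketch; `install_command` and `KNOWN_EXTRAS` are hypothetical names and not part of RealtimeTTS.

```python
# Extras as listed in these release notes.
KNOWN_EXTRAS = {"all", "system", "azure", "elevenlabs",
                "openai", "gtts", "coqui", "minimal"}


def install_command(*extras: str) -> str:
    """Build the pip command for the chosen extras (hypothetical helper)."""
    if not extras:
        raise ValueError("choose at least one extra")
    unknown = [name for name in extras if name not in KNOWN_EXTRAS]
    if unknown:
        raise ValueError(f"unknown extras: {unknown}")
    return f"pip install realtimetts[{','.join(extras)}]"
```

Calling `install_command("azure", "elevenlabs", "openai")` reproduces the command shown above, while a misspelled extra raises a `ValueError` instead of producing a command that pip would reject or silently ignore.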

Virtual Environment Installation

For a full installation within a virtual environment, follow these steps (the commands shown are for Windows):

python -m venv env_realtimetts
env_realtimetts\Scripts\activate.bat
python.exe -m pip install --upgrade pip
pip install -U realtimetts[all]

v0.4.1

01 Jun 17:29

v0.4.0

17 May 22:56
  • New engine: GTTSEngine

    Offers realtime support for Google Translate's text-to-speech API (GTTS): free TTS that needs no GPU and sounds better than system voices.

    💖 Shoutout to Pierre Nicolas Durette for making this possible