TTS #27

Abhrant · 2024-10-28T10:39:41Z

This PR adds a TTS engine for the e2e voice assistant on the Pi.

The TTS engine works as follows:

A separate thread running the TTS process is started before the LLM:

  tts_processing_thread = threading.Thread(
      target=create_tts_wav,
      args=(stop_event, tts_processing_queue),
  )
  tts_processing_thread.start()

A queue of word outputs from the LLM is maintained. Once a delimiter ['.', '!', '?', ':', ';'] is reached, the queue is emptied and the sentence is sent to the TTS engine.
TTS used: https://github.com/rhasspy/piper
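
The delimiter-based buffering described above could be sketched roughly as follows. This is an illustrative sketch, not the code in this PR: the function name `buffer_sentences`, the `None` end-of-stream sentinel, and joining words with single spaces are all assumptions made for the example.

```python
import queue
import threading

# Delimiters listed in the PR description.
DELIMITERS = {".", "!", "?", ":", ";"}


def buffer_sentences(
    token_queue: queue.Queue,
    sentence_queue: queue.Queue,
    stop_event: threading.Event,
) -> None:
    """Collect words from the LLM and emit one sentence per delimiter.

    Hypothetical helper: a `None` item on token_queue is assumed to mark
    the end of the LLM output.
    """
    words = []
    while not stop_event.is_set():
        try:
            token = token_queue.get(timeout=0.1)
        except queue.Empty:
            continue  # nothing yet, check stop_event again
        if token is None:
            break
        words.append(token)
        stripped = token.strip()
        if stripped and stripped[-1] in DELIMITERS:
            # Sentence boundary reached: hand the sentence to the TTS side.
            sentence_queue.put(" ".join(words))
            words = []
    # Flush a trailing sentence that never reached a delimiter.
    if words:
        sentence_queue.put(" ".join(words))
```

The final flush is there so that the last sentence still reaches the TTS side when the LLM output ends without a delimiter.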


Arnav0400

Can we clean up the code (print statements, formatting, etc.) and also add e2e benchmarks for this? Additionally, can we modify the config file to have audio out as a flag that triggers Piper when set to true?
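
One possible shape for such a flag, as a rough sketch only: the config key `audio_out` and the helper `maybe_start_tts` are hypothetical names chosen for illustration, and the config is assumed to be an already-parsed dict.

```python
import threading
from typing import Callable, Optional


def maybe_start_tts(
    config: dict,
    target: Callable,
    args: tuple = (),
) -> Optional[threading.Thread]:
    """Start the TTS thread only when the (assumed) `audio_out` flag is true."""
    if not config.get("audio_out", False):
        return None  # audio output disabled: skip Piper entirely
    thread = threading.Thread(target=target, args=args, daemon=True)
    thread.start()
    return thread
```

With something like this, `main.py` could call `maybe_start_tts(config, create_tts_wav, (stop_event, tts_processing_queue))` and fall back to text-only output when it returns `None`.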

Arnav0400 · 2024-10-28T14:36:32Z

examples/experimentals/voice_engine/main.py


 LOGGER = None
-DEFAULT_CONFIG = "recipe/default.yaml"
+DEFAULT_CONFIG = "/home/piuser/voice/nyuntam/examples/experimentals/voice_engine/recipe/rpi5.yaml"


Can we have a relative path?

Arnav0400 · 2024-10-28T14:37:20Z

examples/experimentals/voice_engine/main.py

+def create_tts_wav(
+    stop_event: threading.Event,
+    tts_processing_queue: queue.Queue,
+    output_dir: str = "/home/piuser/voice/core/test-output"


Let's have a relative path as the default.
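
A minimal sketch of how both defaults could be made relative to the script location rather than to a user's home directory. The helper name `resolve_relative` and the `test-output` directory name are assumptions for illustration.

```python
from pathlib import Path


def resolve_relative(relative: str, anchor: str) -> Path:
    """Resolve `relative` against the directory that contains `anchor`.

    Hypothetical helper: in main.py, `anchor` would typically be `__file__`.
    """
    return Path(anchor).resolve().parent / relative


# Possible usage inside main.py (names assumed):
#   DEFAULT_CONFIG = resolve_relative("recipe/default.yaml", __file__)
#   DEFAULT_OUTPUT_DIR = resolve_relative("test-output", __file__)
```

Anchoring on the script's own directory keeps the defaults working regardless of the current working directory or the machine's username.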


Abhrant added 2 commits October 23, 2024 18:59

✨feat : async tts using piper, 20 tokens per phrase, ffmpeg to stream…

4fd96ec

… audio via ssh on local system.

✨feat: send sentences to TTS based on delimiters

c87b592

Abhrant added the enhancement New feature or request label Oct 28, 2024

Abhrant requested a review from Arnav0400 October 28, 2024 10:39

Abhrant self-assigned this Oct 28, 2024

Arnav0400 requested changes Oct 28, 2024


Abhrant and others added 22 commits November 1, 2024 18:35

✨feat: voice as a config param from yaml, thread monitoring for tts

158daad

update: add tts params to config.yaml

e71d989

📝time compare plots

b88ede8

🐛bugfix: last sentence not being sent to tts bug fix

7c9bb92

✨feat: run e2e on android

cc7e416

📝android recipe and running instrctions

6fe934e

📝docs: detailed instructions to run on android

79b8c6f

Update run_on_android.md

014ac85

✨feat: continously look for audio file and run the code once it finds…

aa9ec8c

… the audio file and then delete it and continue

✨feat: receive audio via wifi and save it, then run the wisper-llama …

b32cc8c

…workflow on that path

📝update: llama -> qwen model path update

da49521

updated paths

399dac3

fixed audio saving

ad5e93a

receive_audio updated to take save output dir and return the audio bytes

fb8ee80

fixed blocking calls

e09842d

fixed path for audio dump

c0df6d0

fixed speed of audio

eaa9b4b

Revert "fixed path for audio dump"

787fd8a

This reverts commit c0df6d0.

🔥remove: reduce delimiter list to only coordinating conjunctions

cda12b7

added device comp and remmoed processing

b576cda

✨update: adding qwen prompt to default prompt

a7f7d5e

✨feat: stopping LLM when STT sends empty string

694eb98
