TTS re-write #72
Co-authored-by: David de la Iglesia Castro <[email protected]>
@Kostis-S-Z Did some testing on this, summarized below:
Thanks for testing @stefanfrench !
Good that you documented this! However, I didn't manage to reproduce it...
💯
Yeah, it's Olmo not being able to generate Hindi script that is the limitation here. I would still keep the model in case someone else experiments with another LLM that does work with Hindi, though.
Right, you had to use the complete model id:
What's changing
The goal of this PR is to rewrite the TTS component to enable easily adding support for more models.
In the process, the following changes / additions were made:
- TTSModel
- TTS_LOADERS
- TTS_INFERENCE
- config.yaml's text_to_speech_model with validate_text_to_speech_model
- test_load_tts_model to use parametrization
- parler and bark models

Closes #29
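To make the shape of this change easier to picture, here is a minimal sketch of a loader/inference registry keyed by model ID. Only the names TTSModel, TTS_LOADERS and TTS_INFERENCE come from this PR; the fields, the function signatures, the `dummy/model` ID and the placeholder functions are assumptions for illustration and may not match the actual code.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List


@dataclass
class TTSModel:
    """Hypothetical container for a loaded model and its inference settings."""
    model: Any
    model_id: str
    sample_rate: int
    custom_args: Dict[str, Any] = field(default_factory=dict)


def _load_dummy(model_id: str, **kwargs: Any) -> TTSModel:
    # Placeholder loader (assumption): a real loader would fetch model weights.
    return TTSModel(model=object(), model_id=model_id,
                    sample_rate=24_000, custom_args=dict(kwargs))


def _infer_dummy(text: str, tts: TTSModel, voice_profile: str) -> List[float]:
    # Placeholder inference (assumption): returns one silent sample per character.
    return [0.0] * len(text)


# Adding support for a new model means adding one entry to each registry.
TTS_LOADERS: Dict[str, Callable[..., TTSModel]] = {"dummy/model": _load_dummy}
TTS_INFERENCE: Dict[str, Callable[[str, TTSModel, str], List[float]]] = {
    "dummy/model": _infer_dummy,
}


def load_tts_model(model_id: str, **kwargs: Any) -> TTSModel:
    """Look up the loader registered for model_id and call it."""
    if model_id not in TTS_LOADERS:
        raise ValueError(f"No loader registered for model id {model_id!r}")
    return TTS_LOADERS[model_id](model_id, **kwargs)


def text_to_speech(text: str, tts: TTSModel, voice_profile: str) -> List[float]:
    """Dispatch to the inference function registered for the loaded model."""
    return TTS_INFERENCE[tts.model_id](text, tts, voice_profile)
```

With this layout, calling code only needs `load_tts_model` and `text_to_speech`, and model-specific details stay inside the registered functions.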
How to test it
Steps to test the changes:
git clone https://github.com/Kostis-S-Z/document-to-podcast.git
cd document-to-podcast
git checkout multilingual-support
pip install -e .
Edit example_data/config.yaml (or create a copy) and change the following (see "Model IDs / Languages tested:" below):
- input_file: Use a file that has text in a language of your choice
- text_to_speech_model: Use one of the model ids, defined below, based on the language you are testing
- text_to_text_prompt: Rewrite it / translate it into the testing language
- speaker/description: Rewrite it / translate it into the testing language
- voice_profile: Use one of the pre-defined profiles based on the testing language, from the list below

Then run:
document-to-podcast --from_config example_data/config.yaml

Check the resulting podcast.txt and podcast.wav.
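As a rough illustration of what validating the edited config could look like, the sketch below checks a config dictionary before running. The function name validate_text_to_speech_model and the config keys come from this PR; the body, the REQUIRED_KEYS tuple, the validate_config helper, and the assumption that the model IDs listed in this description are exactly the accepted values are all hypothetical.

```python
from typing import Any, Dict

# Model IDs as listed in this PR description (assumption: the code may
# expect longer or differently formatted IDs for some models).
SUPPORTED_TTS_MODELS = {
    "parler-tts/parler-tts-mini-multilingual-v1.1",
    "ai4bharat/indic-parler-tts",
    "suno/bark",
    "OuteTTS-0.2-500M",
}

# Hypothetical subset of the top-level keys edited in the steps above.
REQUIRED_KEYS = ("input_file", "text_to_speech_model", "text_to_text_prompt")


def validate_text_to_speech_model(value: str) -> str:
    """Return value unchanged if it is a known model ID, else raise."""
    if value not in SUPPORTED_TTS_MODELS:
        raise ValueError(
            f"Model {value!r} is not supported. "
            f"Choose one of: {sorted(SUPPORTED_TTS_MODELS)}"
        )
    return value


def validate_config(config: Dict[str, Any]) -> Dict[str, Any]:
    """Check required keys are present and the TTS model ID is known."""
    missing = [key for key in REQUIRED_KEYS if key not in config]
    if missing:
        raise ValueError(f"Missing config keys: {missing}")
    validate_text_to_speech_model(config["text_to_speech_model"])
    return config
```

Failing early on an unknown model ID gives a clearer error than a failed model download later in the pipeline.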
Model IDs / Languages tested:

parler-tts/parler-tts-mini-multilingual-v1.1
- Sophia & Nicholas
- Mark & Jessica
- Daniel & Christine
- Nicole & Michelle
- Julia & Richard
- Alex & Natalie
- Steven & Olivia

ai4bharat/indic-parler-tts
- Rohit & Divya
- Prakash & Lalitha

suno/bark
- v2/es_speaker_0 & v2/es_speaker_8

OuteTTS-0.2-500M
- female_1 & male_1
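For quick experimentation, the pairs above can be kept in a small lookup table and turned into speaker entries. This is only a sketch: the voice profile names are copied from the list above, while the TESTED_VOICE_PROFILES name, the build_speakers helper and the "id" field are hypothetical and not taken from the project.

```python
from typing import Dict, List, Tuple

# Voice profile pairs copied from the list in this PR description.
TESTED_VOICE_PROFILES: Dict[str, List[Tuple[str, str]]] = {
    "parler-tts/parler-tts-mini-multilingual-v1.1": [
        ("Sophia", "Nicholas"), ("Mark", "Jessica"), ("Daniel", "Christine"),
        ("Nicole", "Michelle"), ("Julia", "Richard"), ("Alex", "Natalie"),
        ("Steven", "Olivia"),
    ],
    "ai4bharat/indic-parler-tts": [("Rohit", "Divya"), ("Prakash", "Lalitha")],
    "suno/bark": [("v2/es_speaker_0", "v2/es_speaker_8")],
    "OuteTTS-0.2-500M": [("female_1", "male_1")],
}


def build_speakers(model_id: str, pair_index: int = 0) -> List[dict]:
    """Return two speaker entries using one of the tested voice profile pairs."""
    pairs = TESTED_VOICE_PROFILES.get(model_id)
    if pairs is None:
        raise KeyError(f"No tested voice profiles for model id {model_id!r}")
    first, second = pairs[pair_index]
    # "id" is a hypothetical field; only voice_profile is named in this PR.
    return [
        {"id": 1, "voice_profile": first},
        {"id": 2, "voice_profile": second},
    ]
```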
Additional notes for reviewers
It's expected that some languages will work better than others. It's also a common issue that the voice might not be consistent across speaker turns (for example, Speaker 1 may sound one way at first, and then their voice might change).
Full list of languages supported:
OuteTTS-0.2-500M
suno/bark
ai4bharat/indic-parler-tts
parler-tts/parler-tts-mini-multilingual-v1.1