Speech engines
Many of the original text-to-speech engines are no longer maintained, as the technology has moved almost completely to cloud implementations. The instructions provided here are for guidance only, as few people have installed these engines recently. If you find any errors, please let us know.
- See here for notes on Microsoft voice
- See here for sample Microsoft Mary mh.ini parameters.
- See here for an introduction to the Microsoft speech API.
The simplest and fastest TTS engine is flite, available from here. After compiling, point to where the flite binary is with the voice_text_file and voice_text = flite parameters in mh.ini. See here for how to install flite on Raspberry Pi.
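Before pointing mh.ini at the binary, it can help to confirm that flite actually speaks from the command line. The following is a minimal sketch, not part of MisterHouse: the function name flite_smoke_test and the default path /usr/local/bin/flite are assumptions, so substitute the location of your own compiled binary. It relies only on flite's -t option, which speaks the text given on the command line.

```shell
# Hypothetical default location; adjust to wherever your flite binary lives.
FLITE_BIN="${FLITE_BIN:-/usr/local/bin/flite}"

# Check that the given path is an executable file, then ask flite to speak.
flite_smoke_test() {
    bin=$1
    if [ ! -x "$bin" ]; then
        echo "flite binary not found or not executable: $bin" >&2
        return 1
    fi
    # -t tells flite to speak the following text string.
    "$bin" -t "Hello from flite"
}
```

Run it as flite_smoke_test "$FLITE_BIN". If you hear the greeting, the same path should be usable in mh.ini.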
The Festival speech engine is available from here. There are various voices and languages available. Instructions for installing Festival on Ubuntu and Raspberry Pi can be found here. Compiling Festival can be a little tricky, so if you can, you will probably want to use the RPM files.
Once you have downloaded and installed or compiled Festival, you can test it with the following commands:
echo 'Hello from Festival' | ./festival --tts
./festival --tts ../examples/example.sable
./festival --server &
echo '(SayText "Hello from the festival client")' | ./festival_client
You can also run the client, or a simple telnet session, from a different box, but you first have to create a /usr/lib/festival/lib/siteinit.scm file listing the boxes that you want to give authority to, e.g. (set! server_access_list '("localhost" "house\\.isl\\.net")). See the Festival documentation for more details.
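Talking to the server from another box could look roughly like the sketch below. The function names saytext_expr and festival_say_remote and the FESTIVAL_PORT variable are hypothetical helpers, not part of Festival or MisterHouse. The sketch assumes festival_client is on your PATH, that the server listens on Festival's default port 1314, and that the calling host is listed in server_access_list.

```shell
# Build a (SayText "...") expression, escaping backslashes and double
# quotes so the text stays a valid Scheme string literal.
saytext_expr() {
    escaped=$(printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g')
    printf '(SayText "%s")\n' "$escaped"
}

# Send the expression to a Festival server on another host.
# Usage: festival_say_remote HOST TEXT...
festival_say_remote() {
    host=$1
    shift
    saytext_expr "$*" |
        festival_client --server "$host" --port "${FESTIVAL_PORT:-1314}"
}
```

For example, festival_say_remote house.isl.net "Hello from the festival client" sends the same expression as the local test above, only to a remote server.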
Once you have the Festival server running, you can enable MisterHouse to use it with the mh.ini parameters voice_text and festival_port.
To set the default voices, find your siteinit.scm file and have fun with the following:
(set! voice_default 'voice_us1_mbrola)
(set! voice_male1 'voice_kal_diphone)
(set! voice_male2 'voice_us2_mbrola)
(set! voice_female 'voice_kal_diphone)
This depends upon which voices you have installed on your system. Some voices are don_diphone, kal_diphone, ked_diphone, rab_diphone, us1_mbrola, us2_mbrola, and us3_mbrola.
AT&T Natural Voices is now distributed by wizzardsoftware. If you have the Linux binary, use the voice_text_naturalvoice parameter to point to where you have it installed and set voice_text=naturalvoice.
If you only have the Windows binary, you can use Wine to run it from Linux. On a 1.2 GHz Celeron, time-to-speech is about 1 second, versus about 0.4 seconds for the native Linux binary. See bin/mh.ini for examples of these parameters:
voice_text=NaturalVoiceWine
voice_text_naturalvoice=path_to_windows_voices
wine_path=path_to_wine
- MBROLA has very nice US English male and female voices.
- Festival voices
- Cepstral: see this Cepstral install guide.
- IBM ViaVoice is no longer available; however, instructions for legacy implementations can be found here.
Ricky Buchanan reports that ESD and festival --server do not work together, and suggests instead editing /etc/festival.scm and adding these lines to the top:
(Parameter.set 'Audio_Method 'Audio_Command)
(Parameter.set 'Audio_Command "/usr/bin/esdcat -m -r $SR $FILE")
Make sure that /usr/bin/esdcat points to the right location for the esdcat program.
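If esdcat lives somewhere other than /usr/bin, one way to avoid a mistyped path is to generate the two lines from the location the shell actually finds. This is only a sketch: esd_audio_config is a hypothetical helper name, and it assumes esdcat is on your PATH unless you pass a path explicitly.

```shell
# Print the two festival.scm lines, using the given esdcat path or,
# if none is given, whatever "command -v esdcat" finds on PATH.
esd_audio_config() {
    esdcat_path=$1
    if [ -z "$esdcat_path" ]; then
        esdcat_path=$(command -v esdcat) || {
            echo "esdcat not found on PATH" >&2
            return 1
        }
    fi
    printf "(Parameter.set 'Audio_Method 'Audio_Command)\n"
    # \$SR and \$FILE are left literal; Festival expands them itself.
    printf "(Parameter.set 'Audio_Command \"%s -m -r \$SR \$FILE\")\n" "$esdcat_path"
}
```

Running esd_audio_config with no arguments prints the lines for review; paste them at the top of /etc/festival.scm once the path looks right.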