FYI: Download links about Android APKs for piper models #257
Comments
@csukuangfj Thanks for your work. To evaluate models, it would be great to implement the Android TTS engine API. I would do it myself once I finish my work-related tasks, but I am mentioning it in case someone can do it sooner. |
@csukuangfj Sorry for my noob question. When I convert a fine-tuned coqui tts vits-ljs model to ONNX using the code here, I only get the ONNX export. The Android code expects tokens.txt and a lexicon. How can I get those too? And will this work with coqui tts models? |
Can you find the code about how to convert a word to phonemes for vits models from coqui? If you can provide that, I can provide scripts to generate lexicon.txt and tokens.txt and also model.onnx from coqui. |
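As a rough illustration of what such a script could look like, here is a minimal sketch. It assumes a lexicon.txt line has the form `word phoneme1 phoneme2 ...` and a tokens.txt line has the form `symbol id`; both formats are assumptions made for illustration. `word_to_phonemes` and `symbol_to_id` are hypothetical stand-ins for the model's own phonemizer and symbol table, not functions from coqui or sherpa-onnx.

```python
from typing import Callable, Dict, Iterable, List


def build_lexicon(
    words: Iterable[str],
    word_to_phonemes: Callable[[str], List[str]],
    symbol_to_id: Dict[str, int],
) -> List[str]:
    """Return lexicon lines of the assumed form 'word p1 p2 ...'."""
    lines: List[str] = []
    seen = set()
    for raw in words:
        word = raw.strip().lower()
        if not word or word in seen:
            continue  # skip blanks and duplicates
        seen.add(word)
        phonemes = word_to_phonemes(word)
        # Skip words whose phonemes the model's symbol table does not know.
        if not phonemes or any(p not in symbol_to_id for p in phonemes):
            continue
        lines.append(word + " " + " ".join(phonemes))
    return lines


def build_tokens(symbol_to_id: Dict[str, int]) -> List[str]:
    """Return token lines of the assumed form 'symbol id', ordered by id."""
    ordered = sorted(symbol_to_id.items(), key=lambda kv: kv[1])
    return [f"{symbol} {idx}" for symbol, idx in ordered]


# Toy data standing in for a real phonemizer and a real symbol table.
TOY_PHONEMES = {"hello": ["h", "e", "l", "o"], "zoo": ["z", "u"]}
TOY_SYMBOLS = {"h": 0, "e": 1, "l": 2, "o": 3}

lexicon_lines = build_lexicon(
    ["Hello", "hello", "zoo"], lambda w: TOY_PHONEMES.get(w, []), TOY_SYMBOLS
)
token_lines = build_tokens(TOY_SYMBOLS)
```

The token IDs are taken from the model's symbol table rather than assigned by the script, because the exported model only understands the IDs it was trained with.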
I just managed to convert vits models from coqui to sherpa-onnx. Will post a colab notebook to show you how to do that. You can use the converted model in sherpa-onnx, e.g., build an Android App with sherpa-onnx. |
I just created a colab notebook to show how to export the VITS models from Please see |
For those of you who are interested in converting piper models to sherpa-onnx, please have a look at the following https://colab.research.google.com/drive/1PScLJV3sbUUAOiptLO7Ixlzh9XnWWoYZ?usp=sharing |
@csukuangfj Thanks for sharing how to export. I ran your colab notebook without using my model. Everything is generated, but when I use it in the sherpa-onnx TTS Android demo app, it crashes. The app loads the model, lexicon, and tokens all right, but when you tap "Generate" the app crashes with:
These are the model.onnx, lexicon.txt, and tokens.txt generated from your notebook - https://drive.google.com/file/d/1ndZ5MSyS8482Eht6IP1jqxsMwIvB9Ln6/view?usp=sharing When running your notebook, this was the only error I encountered, but I'm not sure it matters: %%shell pip install -q TTS:
This Android crash also occurs when I export and use my fine-tuned coqui tts model. |
Could you show more logs, something like below:
I need to see the model meta data output. |
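For anyone who wants to inspect the metadata before posting logs, a minimal sketch follows. It reads the custom metadata map of an ONNX file with onnxruntime and reports which expected keys are absent. The key names in `EXPECTED_KEYS` are assumptions chosen for illustration; the actual set depends on the export script.

```python
from typing import Dict, Iterable, List

# Hypothetical key names; adjust them to whatever your export script writes.
EXPECTED_KEYS = ("comment", "language", "n_speakers", "sample_rate")


def load_metadata(model_path: str) -> Dict[str, str]:
    """Read the custom metadata map from an ONNX file (needs onnxruntime)."""
    # Imported lazily so that missing_keys() below works without onnxruntime.
    import onnxruntime as ort

    session = ort.InferenceSession(model_path, providers=["CPUExecutionProvider"])
    return dict(session.get_modelmeta().custom_metadata_map)


def missing_keys(
    meta: Dict[str, str], expected: Iterable[str] = EXPECTED_KEYS
) -> List[str]:
    """Return the expected keys that are absent from the metadata."""
    return [key for key in expected if key not in meta]


# Example with a hand-written dict instead of a real model file.
example_meta = {"comment": "coqui", "n_speakers": "0"}
absent = missing_keys(example_meta)
```

With a real file, `missing_keys(load_metadata("model.onnx"))` would give the same kind of report.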
Could you post some logcat output for the crash? |
From your error log,
It crashes in the function
which does not look right. I think there is something wrong in the |
This is my log. Comment field says coqui but when I compare to yours I can see the numofspeakers is 0 for mine:
|
Are you using the latest master of sherpa-onnx or the version >= v1.8.9? |
I am using version 1.8.7. Same version used in the apks you originally shared |
I just tried v1.8.9 and it worked :) |
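Since the fix here came down to the library version, a small sketch of the version check may help. It compares dotted version strings numerically, so that for example 1.10.0 counts as newer than 1.8.9. The helper names are hypothetical, and it assumes purely numeric versions with an optional leading `v`.

```python
from typing import Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn 'v1.8.9' or '1.8.9' into (1, 8, 9)."""
    return tuple(int(part) for part in text.strip().lstrip("vV").split("."))


def is_at_least(current: str, minimum: str = "1.8.9") -> bool:
    """True if `current` is the same as or newer than `minimum`."""
    return parse_version(current) >= parse_version(minimum)


old_ok = is_at_least("1.8.7")
new_ok = is_at_least("v1.8.9")
```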
Glad to hear that it works for you. By the way, we have pre-built Android APKs for the VITS English models from Coqui. |
Thank you very much for your amazing support. It's incredible! Now the Android demo works with my fine-tuned coqui models too. By the way, if I want to use Java instead of Kotlin for Android, would I need to build from source? There seems to be only Kotlin support for Android at the moment. Also, are there plans to release the same on-device TTS for iOS? |
Sorry, we only have a Kotlin demo for Android, but the JNI interface can also be used from Java. You can reuse the You can build sherpa-onnx from source for Android
Yes, it is in the plan. We already support speech-to-text on iOS. Adding text-to-speech to iOS is very easy. |
@csukuangfj That's great to hear! I have another noob question: I have a fine-tuned coqui tts vits model that contains non-English words. I use a custom CMU.in.IPA.txt and a custom all-english-words.txt file for the non-English words (plus a few English words). When I synthesise using the cell that contains this code:
Everything works. The pronunciation is good. However, when I use:
I get unexpected results: the pronunciations are wrong, even for the few English words. This is a sample of my ipa.txt file, which contains both non-English words and a few English words:
all-english-words.txt contains all the words. I followed the same format used in the original English list and I used a phoneme to IPA converter. What do I need to do to make this work? Thank you |
Please add your words to The pronunciations in Please don't use an IPA converter to generate pronunciations for your new words. The code uses |
@csukuangfj You are right! It's better now, though there is still some difference (a slight loss in pronunciation quality compared to the original coqui model). |
Hi @csukuangfj, |
We try to support as many models as possible, including models not from piper, which may not use piper_phonemize.
We can add OOV words to the lexicon.txt manually. By the way, the lexicon.txt already covers lots of words. |
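To see which words would need to be added by hand, one option is to list the out-of-vocabulary words for a given text. The following is a minimal sketch, assuming each lexicon.txt line starts with the word followed by its phonemes and that matching is case-insensitive; the function names are hypothetical.

```python
import re
from typing import Iterable, List, Set


def lexicon_words(lines: Iterable[str]) -> Set[str]:
    """Collect the first field of every non-empty lexicon line."""
    words = set()
    for line in lines:
        fields = line.split()
        if fields:
            words.add(fields[0].lower())
    return words


def find_oov(text: str, known: Set[str]) -> List[str]:
    """Return words in `text` missing from `known`, in order of first use."""
    oov: List[str] = []
    # Runs of letters, optionally with one inner apostrophe (e.g. "don't").
    for word in re.findall(r"[^\W\d_]+(?:'[^\W\d_]+)?", text.lower()):
        if word not in known and word not in oov:
            oov.append(word)
    return oov


known_words = lexicon_words(["hello h e l o", "world w o r l d"])
oov_words = find_oov("Hello, brave new world!", known_words)
```

Each word in `oov_words` would then need a pronunciation line appended to lexicon.txt.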
The order does not matter as long as you don't add duplicate words. If you add duplicate words, then the first one has a higher priority and later ones are ignored.
I realized that a word that appears in a sentence with other words can have a different pronunciation from when it appears standalone. I am afraid it is hard, if not impossible, to improve the pronunciations with the current approach. |
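The duplicate rule described above (the first entry wins and later ones are ignored) can be sketched as follows. This is an illustrative loader rather than the sherpa-onnx implementation, and it assumes the `word phoneme1 phoneme2 ...` line format and case-insensitive words.

```python
from typing import Dict, Iterable, List


def load_lexicon(lines: Iterable[str]) -> Dict[str, List[str]]:
    """Map each word to its phonemes, keeping only the first entry per word."""
    lexicon: Dict[str, List[str]] = {}
    for line in lines:
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank lines and words without a pronunciation
        word, phonemes = fields[0].lower(), fields[1:]
        # setdefault keeps the existing entry, so later duplicates are ignored.
        lexicon.setdefault(word, phonemes)
    return lexicon


sample = load_lexicon(["read r i d", "book b u k", "read r e d"])
```

Because entries are keyed by word, the order of distinct words has no effect on the result; only the relative order of duplicates matters.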
I am not sure you can cover everything.
It might be better to add a condition and use piper_phonemize for piper models.
Yes, this will mean adding espeak-ng, but it will be much better than adding words manually.
…On 11/14/23, Fangjun Kuang ***@***.***> wrote:
> Why not make an Android port of piper_phonemize and use it in next gen TTS
> instead of a lexicon?
We try to support as many models as possible, including models not from
piper, which may not use piper_phonemize.
---
> there will be many words will try to read that may not be in that lexicon
We can add OOV words to the lexicon.txt manually. By the way, the
lexicon.txt already covers lots of words.
|
Maybe a better approach is to change the modeling unit from phones to other units that can be entirely derived from words. |
@csukuangfj True, single words seem to be pronounced worse than the same words in phrases. However, the fact that this TTS solution works offline is amazing. By the way, can inference be done on the GPU? And importantly, has the iOS version been released? |
@csukuangfj wow can't wait :) your work is amazing |
The iOS demo is ready now. You can run text-to-speech with Next-gen Kaldi on iOS using I have recorded a video showing how to use it. Please see
Don't worry. I will try to fix it. |
@csukuangfj Incredible, that was quick! The video demo is fantastic. Great job! But it seems I will have to wait, since my project uses only UIKit, not SwiftUI. I can see you have this to help us build for Android from source. Can you add documentation for iOS too, so we can build for iOS from source as well? |
Thank you |
Is it possible to add my custom model? Here is the link: If you want to check it out, it can be unpacked with:
I've been wanting to use this voice, but the original application is 32-bit and nearly 15 years old at this point. It's being held together by threads. |
Is it a VITS model? Could you show the extracted files? |
I just looked at the model and found that it is piper-based, so it is definitely supported by sherpa-onnx. I have converted the model. You can find it at I have also added it to For the following text:
It generates the following audio: cee2f734-4dcb-4494-bd29-27e0c7786c3a.mov |
By the way, you can also find the exported model at The above repo also contains the script for exporting. |
The audio sounds like British English, but the JSON config file says the voice is |
Thank you so much! That's my mistake, more than likely. I believe I used en_US-amy as a base to fine-tune from because it sounded better, so I was unsure whether it should be marked as en_US or en_GB. It's definitely a British voice. |
I am changing it to |
I appreciate it. |
FYI: No lexicon.txt is required any longer. We are also using piper-phonemize in sherpa-onnx. You can find all models from piper in sherpa-onnx at |
I'm very new to working with Android, let alone with TTS APIs. Could you point me in the right direction for developing an Android TTS engine API? I have found some examples of offline TTS, but I'm still trying to figure out how to convert them into an engine API. I was thinking some of these would be useful (TTS service wrapper, TTS-engine). I think I just need some guidance, as I am lost in a sea of knowledge. Let me know what you think. |
Please wait for a moment. I have managed to create a tts engine service that you can use to replace the system tts engine. I will create a PR soon. |
|
I am also working on that.
I already have a working prototype of a piper TTS engine for Android, with downloading and installing of voices.
I will make the repo public soon.
…On 12/31/23, ekul_ ***@***.***> wrote:
> I will create a PR soon.
@csukuangfj
Woah that is legendary. I'd love to explore what you did to do this, I'm
interested in learning. I'll hold tight. Thanks
|
PR is short for pull request. I just created one at k2-fsa/sherpa-onnx#508. You can find a YouTube video at |
By the way, you can download pre-built text-to-speech engine APKs at |
@csukuangfj Awesome repo, Fangjun Kuang! Do you know if anybody is working on a Dart package for Next-gen Kaldi to be used on Android and iOS? |
Yes, please see |
thanks ! |
@Web3Kev Yes, we have one now. Please see We have a Flutter TTS demo for Piper. Please see https://github.com/k2-fsa/sherpa-onnx/tree/master/flutter-examples/tts |
@csukuangfj @beqabeqa473 or anyone (help, please): I am not advanced enough to get into the coding of this model. I am just trying to get it to work on Android for use in my accessibility TTS readers. I see that APKs have been created at https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html . When I install them on my Google Pixel, only one can be used at a time (there is no input or other way to switch to a different voice's APK within the app). Each new voice installs over the previous APK. Is there an easier way to do this, such as an F-Droid app where you can just select the voice you want from the ONNX voices? Thanks, |
Now you can try piper models on your Android phones.
The following languages are supported:
Community help is appreciated to convert more models from piper to sherpa-onnx.
Please see
https://k2-fsa.github.io/sherpa/onnx/tts/apk.html
Note:
You can try the models in your browser by visiting the following huggingface space
https://huggingface.co/spaces/k2-fsa/text-to-speech