Document how to build speech synthesis system for new languages #18

Open
r9y9 opened this issue Aug 17, 2017 · 11 comments
@r9y9
Owner

r9y9 commented Aug 17, 2017

All you need is the following:

  • Wav files
  • Full-context labels
  • HTS-style question file

With those all prepared, it should be very straightforward to implement.
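To make the three inputs concrete, here is a minimal standard-library sketch of how HTS-style label lines and a question file relate to each other. It is only an illustration, not nnmnkwii's implementation: the function names, the toy quinphone contexts, and the three questions are all made up for the example, and real full-context labels carry many more fields.

```python
import re
from fnmatch import fnmatchcase


def parse_label_line(line):
    """Split one HTS label line into (start, end, context).

    Start and end times are integers in HTS units of 100 ns.
    """
    start, end, context = line.split(None, 2)
    return int(start), int(end), context.strip()


def parse_question_file(text):
    """Return a list of (name, patterns) from lines like: QS "name" {pat1,pat2}"""
    questions = []
    for line in text.splitlines():
        m = re.match(r'\s*QS\s+"([^"]+)"\s+\{(.*)\}\s*$', line)
        if m:
            questions.append((m.group(1), m.group(2).split(",")))
    return questions


def binary_features(context, questions):
    """One 0/1 answer per question: 1 if any wildcard pattern matches the context."""
    return [int(any(fnmatchcase(context, p) for p in pats)) for _, pats in questions]


# Toy question file (hypothetical questions, for illustration only).
QUESTIONS = '''
QS "C-Vowel" {*-a+*,*-i+*,*-o+*}
QS "C-Silence" {*-sil+*}
QS "R-Vowel" {*+a=*,*+i=*,*+o=*}
'''

# Toy labels: "start end context", with a quinphone-only context.
LABELS = [
    "0 2500000 xx^xx-sil+k=o",
    "2500000 3200000 xx^sil-k+o=N",
    "3200000 4100000 sil^k-o+N=xx",
]

questions = parse_question_file(QUESTIONS)
features = [binary_features(parse_label_line(l)[2], questions) for l in LABELS]
```

Each label line then yields one binary vector with one entry per question, which is the kind of numeric input the acoustic and duration models are trained on.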

@r9y9 r9y9 added the doc label Aug 17, 2017
@ruohoruotsi

Hi, thank you for this work.
About this issue: is the above recipe complete for tonal languages, especially those with rising and falling tones (pitch) on vowels and nasal consonants?

@r9y9
Owner Author

r9y9 commented Dec 29, 2017

Yes. I think you will need to annotate the accent information (rising/falling tones) in the HTS-style labels.
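As a rough sketch of what that annotation could look like: one option is to append a tone field to each full-context label and add matching questions to the question file. The `/T:` field, the tone codes, and the question names below are hypothetical and not part of any standard label layout; adapt them to the tone inventory of the target language.

```python
from fnmatch import fnmatchcase

# Hypothetical tone codes grouped by contour; replace with the real inventory.
TONE_GROUPS = {
    "Rising": ["R"],
    "Falling": ["F"],
    "Level": ["H", "L"],
}


def with_tone(context, tone):
    """Append a hypothetical /T:<tone> field to a full-context label."""
    return f"{context}/T:{tone}"


def tone_questions(groups):
    """Build one HTS-style QS line per tone group."""
    lines = []
    for name, codes in groups.items():
        patterns = ",".join(f"*/T:{c}" for c in codes)
        lines.append(f'QS "C-Tone_{name}" {{{patterns}}}')
    return lines


label = with_tone("sil^m-a+N=xx", "R")   # toy context carrying a rising tone
qs_lines = tone_questions(TONE_GROUPS)   # lines to append to the question file
is_rising = fnmatchcase(label, "*/T:R")  # the wildcard pattern matches the label
```

The same idea applies to tones carried by nasal consonants: the tone is attached to whichever segment bears it, and the questions ask about that field.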

@attitudechunfeng

Hi, Yamamoto-san. I'm trying to synthesize Mandarin using your tool. As I understand it, I first need to do forced alignment myself, and then write a frontend adapted to the language to extract linguistic features. So does that mean I only need to replace the frontend part? And could I use another forced-alignment tool such as "montreal" at an alignment level that is neither 'state' nor 'phone', for example 'syllable'?

@r9y9
Owner Author

r9y9 commented Jan 3, 2018

Hello, @attitudechunfeng!

So does that mean I only need to replace the frontend part?

Yes, you can reuse the other parts. You can also reuse part of the frontend (https://r9y9.github.io/nnmnkwii/latest/references/frontend.html#frontend) to convert your linguistic features to their numeric representation at the phone, state, or frame level, provided you use the HTS-style label format.
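For intuition, going from phone-level to frame-level features amounts to repeating each phone's vector once per frame of its duration. The sketch below uses only the standard library and is not the nnmnkwii frontend: the 5 ms frame shift, the function names, and the extra position-in-phone feature are assumptions made for the example.

```python
FRAME_SHIFT_100NS = 50000  # assumed 5 ms frame shift, in HTS units of 100 ns


def num_frames(start, end, frame_shift=FRAME_SHIFT_100NS):
    """Number of whole frames covered by a segment."""
    return (end - start) // frame_shift


def expand_to_frames(segments, frame_shift=FRAME_SHIFT_100NS):
    """Repeat each phone-level vector per frame and append a position feature.

    segments: list of (start, end, feature_vector) with times in 100 ns units.
    The appended value is the relative position of the frame within its phone.
    """
    frames = []
    for start, end, feats in segments:
        n = num_frames(start, end, frame_shift)
        for i in range(n):
            frames.append(list(feats) + [(i + 1) / n])
    return frames


# Two toy phones lasting 10 ms and 15 ms, i.e. 2 and 3 frames.
segments = [(0, 100000, [1, 0]), (100000, 250000, [0, 1])]
frames = expand_to_frames(segments)
```

This is why the alignment matters: the start and end times in the labels determine how many frames each linguistic vector is stretched over.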

And could I use another forced-alignment tool such as "montreal" at an alignment level that is neither 'state' nor 'phone', for example 'syllable'?

You could, but then you cannot reuse https://r9y9.github.io/nnmnkwii/latest/references/frontend.html#frontend, since it assumes state- or phone-level alignment.
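If the external aligner can give you phone-level intervals in seconds, the timing part of an HTS-style label is easy to produce, since HTS labels store times as integers in units of 100 ns. A minimal sketch follows, assuming the intervals have already been read from the aligner's output; the function names and the toy intervals are made up for the example, and the context here is just the phone name, not a full-context label.

```python
def seconds_to_hts(t):
    """Convert seconds to HTS time units (100 ns), rounded to the nearest unit."""
    return int(round(t * 1e7))


def intervals_to_label_lines(intervals):
    """Turn (start_sec, end_sec, phone) tuples into 'start end phone' label lines."""
    return [
        f"{seconds_to_hts(start)} {seconds_to_hts(end)} {phone}"
        for start, end, phone in intervals
    ]


# Toy phone-level alignment in seconds.
intervals = [(0.0, 0.25, "sil"), (0.25, 0.32, "k"), (0.32, 0.41, "o")]
label_lines = intervals_to_label_lines(intervals)
```

The phone names would still have to be replaced by full-context strings from a language-specific frontend before the question file can be applied.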

@attitudechunfeng

Thank you so much! I'll try it.

@r9y9
Owner Author

r9y9 commented Jan 3, 2018

Alternatively, you could consider an end-to-end approach, which requires neither alignment nor linguistic feature extraction (the hard part of TTS!). See https://github.com/r9y9/deepvoice3_pytorch if you are interested.

@attitudechunfeng

Thank you. In fact, I'm also following your other excellent TTS projects. However, I'm currently working on offline usage: end-to-end models are not easy to port to mobile devices, and their speed on CPU can't be guaranteed. So I have to use the traditional method.

@r9y9
Owner Author

r9y9 commented Jan 3, 2018

I see. I hope you find something useful. Let me know if you find anything that should be improved.

@attitudechunfeng

Okay, if I find anything interesting, I'll report it.

@stale

stale bot commented May 30, 2019

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the wontfix label May 30, 2019
@r9y9 r9y9 removed the wontfix label May 30, 2019
@HarmanGhawaddi

All you need is the following:

  • Wav files
  • Full-context labels
  • HTS-style question file

With those all prepared, it should be very straightforward to implement.

I have wav files in the Punjabi language. Please guide me on how to generate the full-context labels and the HTS-style question file.
