[Feature request] #4062

Open
kunge98 opened this issue Nov 27, 2024 · 1 comment
Labels
feature request feature requests for making TTS better.

Comments


kunge98 commented Nov 27, 2024

[Image: 8503b77be5d18a6ca7ac9990519a105 (screenshot of the code)]
My code is shown in the figure. I am using a Chinese speech synthesis model, and my test text is "你好". The output audio is 4 seconds long, which is far longer than the two characters "你好" need, and the speech is followed by noise that sounds like howling. What do I need to change? Do I need to modify the config.json file?

@kunge98 kunge98 added the feature request feature requests for making TTS better. label Nov 27, 2024

kunge98 commented Nov 27, 2024

I then tried changing the max_decoder_steps value in config.json and confirmed that this shortens the output audio. However, I now want to generate speech for a batch of texts whose lengths differ, so a single fixed value will not fit all of them. How can I do that?
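
For the batch case, here is a minimal sketch of one possible approach, not a confirmed fix. It relies on two assumptions that nothing in this thread verifies: that max_decoder_steps only caps how long the decoder may run rather than setting the length, and that the trailing noise appears because the model never predicts a stop for text without sentence-final punctuation. Under those assumptions, each text gets a terminal mark before synthesis and the model decides the length itself. The names `ensure_terminal_mark`, `synthesize_batch`, and `TERMINAL_MARKS` are hypothetical helpers, and `synthesize` is whatever callable wraps your own model.

```python
from pathlib import Path
from typing import Callable, Iterable, List

# Assumption: these marks count as "sentence-final" for the model in use.
TERMINAL_MARKS = "。！？.!?"


def ensure_terminal_mark(text: str, mark: str = "。") -> str:
    """Strip whitespace and append `mark` if the text lacks a sentence-final mark."""
    text = text.strip()
    if text and text[-1] not in TERMINAL_MARKS:
        text += mark
    return text


def synthesize_batch(
    texts: Iterable[str],
    synthesize: Callable[[str, Path], None],
    out_dir: str,
) -> List[Path]:
    """Synthesize each non-empty text to its own WAV file and return the paths.

    `synthesize(text, path)` is supplied by the caller (hypothetical hook);
    this function only normalizes the text and names the output files.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    paths: List[Path] = []
    for index, raw in enumerate(texts):
        text = ensure_terminal_mark(raw)
        if not text:
            continue  # skip blank entries; the index still advances
        path = out / f"utt_{index:04d}.wav"
        synthesize(text, path)
        paths.append(path)
    return paths
```

If the model is loaded through the `TTS.api.TTS` Python interface (an assumption, since the original code is only available as a screenshot), `synthesize` could be something like `lambda text, path: tts.tts_to_file(text=text, file_path=str(path))`. If the noise persists even with punctuation, max_decoder_steps can stay at a generous value as a safety cap and the tail can be trimmed in post-processing instead.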
