Controlling the silence length of generated audio #346

Open
a122760 opened this issue Jan 3, 2025 · 3 comments
Labels
question Further information is requested

Comments

a122760 commented Jan 3, 2025

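How can I control the length of silence in the generated audio? Some outputs contain very long silences, in some cases as much as 50% of the clip. How can this be controlled, especially at the beginning and end?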

dosubot bot commented Jan 3, 2025

Hey @a122760, I'm here to help you with any bugs, questions, or contributions. Let's tackle this issue together!

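In the OpenVoice codebase, the parameter that controls the silence length of generated audio can be found in the split_audio_vad function. Specifically, the min_silence_duration parameter is set to 1 second: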

segments = get_vad_segments(
    audio_vad,
    output_sample=True,
    min_speech_duration=0.1,
    min_silence_duration=1,
    method="silero",
)

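This parameter determines the minimum silence duration required for a span to be treated as a separate audio segment. By adjusting the value of min_silence_duration, you can control the silence length in the generated audio, especially at the beginning and end [1].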
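As a rough illustration of what a minimum-silence threshold usually does during VAD segmentation, the sketch below merges detected speech spans whose gap is shorter than the threshold. It is a toy model, not the implementation behind get_vad_segments; the helper name merge_by_min_silence and the sample spans are hypothetical.

```python
from typing import List, Tuple

Segment = Tuple[float, float]  # (start_sec, end_sec) of detected speech


def merge_by_min_silence(speech: List[Segment],
                         min_silence_duration: float) -> List[Segment]:
    """Merge neighbouring speech spans whose gap is shorter than the threshold.

    Hypothetical helper for illustration only.
    """
    merged: List[Segment] = []
    for start, end in sorted(speech):
        if merged and start - merged[-1][1] < min_silence_duration:
            # Gap too short to count as silence: extend the previous span.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


# Made-up speech spans, in seconds.
spans = [(0.5, 1.2), (1.5, 2.0), (3.4, 4.0)]
merged_1s = merge_by_min_silence(spans, min_silence_duration=1.0)
merged_short = merge_by_min_silence(spans, min_silence_duration=0.1)
```

With a 1-second threshold the 0.3-second gap is absorbed and two segments remain; with a 0.1-second threshold all three spans stay separate. Under this model the threshold changes how input audio is split, not how long pauses are in newly produced audio.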

@dosubot dosubot bot added the question Further information is requested label Jan 3, 2025

a122760 commented Jan 3, 2025

No, that parameter controls the silence threshold used during VAD segmentation; it is not a parameter for the silence duration in TTS-synthesized audio. Is there a way to control the silence duration in the synthesized audio?

dosubot bot commented Jan 3, 2025

To control the silence duration in TTS synthesized audio in the OpenVoice codebase, you can modify the audio_numpy_concat method in the BaseSpeakerTTS class. The silence duration is controlled by the following line:

audio_segments += [0] * int((sr * 0.05)/speed)

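This line appends a short run of zero samples after each audio segment: 0.05 seconds (50 milliseconds) at speed 1, with the sample count divided by speed. You can adjust the pause by changing the value 0.05 to your desired duration in seconds [1]. Note that this only sets the pause inserted between concatenated segments; it does not shorten silence that the model generates inside a segment.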
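A minimal sketch of how the pause could be made configurable, plus a simple amplitude-threshold trim for leading and trailing silence. This is not OpenVoice's code: concat_with_pause and trim_edges are hypothetical helpers, the threshold value is an assumption, and plain Python lists are used only to keep the example dependency-free.

```python
from typing import List, Sequence


def concat_with_pause(segments: Sequence[Sequence[float]], sr: int,
                      pause_sec: float = 0.05, speed: float = 1.0) -> List[float]:
    """Join audio segments, appending pause_sec / speed seconds of zeros after each.

    Hypothetical helper; mirrors the quoted line with 0.05 made a parameter.
    """
    pause = [0.0] * int((sr * pause_sec) / speed)
    out: List[float] = []
    for seg in segments:
        out.extend(seg)
        out.extend(pause)
    return out


def trim_edges(audio: Sequence[float], threshold: float = 1e-3) -> List[float]:
    """Drop leading and trailing samples whose absolute amplitude is below threshold.

    Hypothetical helper; the threshold is an assumed value to tune per model.
    """
    loud = [i for i, x in enumerate(audio) if abs(x) >= threshold]
    if not loud:
        return []
    return list(audio[loud[0]: loud[-1] + 1])


# Tiny made-up example: sr=8 and pause_sec=0.25 give a 2-sample pause.
sr = 8
segments = [[0.0, 0.0, 0.5, 0.4], [0.3, 0.2, 0.0]]
joined = concat_with_pause(segments, sr, pause_sec=0.25)
trimmed = trim_edges(joined)
```

The trim only removes quiet samples at the two ends, so pauses between segments are preserved. Real synthesized audio is rarely exactly zero during silence, which is why a small non-zero threshold is used rather than an equality check.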
