Audio clips off during the final word #487

phayke · 2025-01-10T08:42:51Z

🔴 If you have installed AllTalk in a custom Python environment, I will only be able to provide limited assistance/support. AllTalk draws on a variety of scripts and libraries that are not written or managed by myself, and they may fail, error or give strange results in custom built python environments.

🔴 Please generate a diagnostics report and upload the "diagnostics.log" as this helps me understand your configuration.

https://github.com/erew123/alltalk_tts/tree/main?#-how-to-make-a-diagnostics-report-file

Describe the bug
A clear and concise description of what the bug is.
The generated audio consistently cuts off halfway through the final word.
To Reproduce
Steps to reproduce the behaviour:
Generate speech
Screenshots
If applicable, add screenshots to help explain your problem.

Text/logs
If applicable, copy/paste in your logs here from the console.

Desktop (please complete the following information):
AllTalk was updated: [approx. date]
Custom Python environment: [yes/no give details if yes]
Text-generation-webUI was updated: [approx. date]

Additional context
Add any other context about the problem here.

EPCarter · 2025-01-10T20:23:36Z

Describe the bug:
In XTTS Models Finetuning Dataset .wav files, the last word is clipped most of the time.

Steps to reproduce the behaviour:
Generate Dataset using XTTS Models Finetuning

Screenshots:

Text/logs:
finetune.log
diagnostics.log

Desktop (please complete the following information):
AllTalk was updated: Jan 5, 2025
Custom Python environment: No
Text-generation-webUI was updated: Jan 5, 2025

Additional context:
#477 (comment)
Audio Samples.zip
Source Audio Sample.zip

erew123 · 2025-01-13T18:04:26Z

Hi @EPCarter, so this is dataset generation for finetuning? @Yohrog has been looking at this and submitting a PR, but I've not heard back from them for a while now. #419

But listening to your audio, you are saying it is clipping early at the end of a word, correct?

EPCarter · 2025-01-13T20:24:09Z

That is correct. The Source Sample is what I start with and input to the fine-tuning, and the Audio Samples are what come out when fine-tuning chops things into .wav files for the dataset generation. Nearly all the .wav files are clipped on the last word, regardless of the length.
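
To put a number on "nearly all", one option is to flag clips whose last few milliseconds are still loud relative to the rest of the file, since a cleanly ended sentence normally trails off into near-silence. The following is only a rough diagnostic sketch, not part of AllTalk: the function names, the 50 ms tail window, and the 0.25 ratio threshold are all assumptions chosen for illustration, and it only handles 16-bit PCM .wav files.

```python
import math
import struct
import wave


def read_mono_16bit(path):
    """Return (samples, sample_rate) from a 16-bit PCM WAV file, using channel 0 only."""
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM")
        channels = wav.getnchannels()
        rate = wav.getframerate()
        raw = wav.readframes(wav.getnframes())
    values = struct.unpack("<%dh" % (len(raw) // 2), raw)
    # Frames are interleaved, so every `channels`-th value belongs to channel 0.
    return list(values[::channels]), rate


def rms(samples):
    """Root-mean-square amplitude; 0.0 for an empty window."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def ends_abruptly(samples, rate, tail_ms=50, ratio=0.25):
    """Heuristic: True if the final tail_ms is still loud compared to the whole clip.

    tail_ms and ratio are illustrative guesses and will need tuning per dataset.
    """
    tail_len = max(1, int(rate * tail_ms / 1000))
    overall = rms(samples)
    if overall == 0.0:
        return False  # an entirely silent clip cannot be "cut off"
    return rms(samples[-tail_len:]) / overall >= ratio
```

Running `ends_abruptly(*read_mono_16bit(path))` over every file in the generated dataset folder and counting the `True` results gives a rough clipped-file rate. It is a heuristic only: a clip that legitimately ends on a loud sound will also be flagged, so spot-check a few by ear.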

unifirer · 2025-01-24T02:13:05Z

@erew123 I can confirm the same thing is happening to me. It is not consistent, but about 20% of the time the first or final word of a sentence is only partially spoken. It happens on both xttsv2_2.0.2 and xttsv2_2.0.3, using zero shot with 10s cleaned audio clips in the TTS generator.

This has been a persistent problem since AllTalk v1. I thought it was a problem with XTTS itself that you can't fix, since it's made by different people, right? But I'll post this anyway.

Thank you very much for your hard work, I'm enjoying AllTalk very much in our chaotic world.

Don't overwork yourself, and I hope your family gets better. You're helping the strangers posting here so much and your documentation is great. I'm impressed by your dedication to a free project.

You're a saint. I don't believe in god, but I believe in good people like you. Better than saving lives, you're improving mine so much. I would never use a command line for this haha, the world has moved on to GUIs.
