Title

Basic Information

Title: "XTTS: a Massively Multilingual Zero-Shot Text-to-Speech Model"
Authors:
- 01 Edresson Casanova
- 02 Kelly Davis
- 03 Eren Gölge
- 04 Görkem Göknar
- 05 Iulian Gulea
- 06 Logan Hart
- 07 Aya Aljafari
- 08 Joshua Meyer
- 09 Reuben Morais
- 10 Samuel Olayemi
- 11 Julian Weber
Links:
- ArXiv
- Publication
- Github
- Demo
Files:
- ArXiv
- [Publication] #TODO

Abstract

Most Zero-shot Multi-speaker TTS (ZS-TTS) systems support only a single language. Although models like YourTTS, VALL-E X, Mega-TTS 2, and Voicebox explored multilingual ZS-TTS, they are limited to just a few high/medium-resource languages, which restricts their application in most low/medium-resource languages. In this paper, we aim to alleviate this issue by proposing and making publicly available the XTTS system. Our method builds upon the Tortoise model and adds several novel modifications to enable multilingual training, improve voice cloning, and enable faster training and inference. XTTS was trained in 16 languages and achieved state-of-the-art (SOTA) results in most of them.

1. Introduction

2. Related Works

3. Methodology

4. Experiments

5. Results

6. Conclusions