YorùbáVoice

Landing page for data, code and publications for this project sponsored by an Imminent Research Grant.

In 2022, we launched the curation and recording of 40 hours of high-fidelity speech data for the Yorùbá language, the third most widely spoken language in Africa with over 40 million L1 speakers. We partner with the YorubaName organization in Nigeria to encourage volunteers both online and offline to record their voices.

Official project blog → www.yorubavoice.com
The dataset is published in the ELRA catalogue →
- ELRA Resource description page
- 012-405-700-001-6 → the corresponding unique ISLRN, for use in citations and publications
The LREC-COLING 2024 paper → arXiv
The Speech Recorder App we developed → yoruba-voice-speech-recorder
Source code and the various tools we used can be found in this repo

BibTeX entry and citation info

If you make use of our dataset, please cite our paper.

@misc{ogunremi2023iroyinspeech,
      title={\`{I}r\`{o}y\`{i}nSpeech: A multi-purpose Yor\`{u}b\'{a} Speech Corpus}, 
      author={Tolulope Ogunremi and Kola Tubosun and Anuoluwapo Aremu and Iroro Orife and David Ifeoluwa Adelani},
      year={2023},
      eprint={2307.16071},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}
