Replies: 1 comment
-
@neouyghur you are totally right! The publicly available open Uyghur dataset from CommonVoice https://commonvoice.mozilla.org/ug/datasets (dated 14.09.2023, with a lot more added since then) should be enough for Uyghur to be listed and added to the OpenAI Whisper tokenizer. This Uyghur dataset is already being used by Speechmatics https://speechmatics.com/ and there is also a demo video by a Uyghur computer YouTube channel https://www.youtube.com/watch?v=JnxOuaJINwM (speech-to-text is demonstrated at 2:07 in the video).
-
Hello, I'm interested in knowing whether there are plans to include Uyghur in the list of supported languages. Additionally, I am curious about the reason behind the absence of a tokenizer for Uyghur. Thank you!