Zero-filled WAV give hallucination and wrong duration #1881
Comments
This seems to depend on the language: I see a similar effect with -l fi and several others. Multilingual is a bit tricky anyway, because once you set the language you can't change it (as discussed in #1800).
I reached the same conclusion about Urdu: the model is limited, is not very good for low-resource languages, and can't handle silence for Urdu. I also could not find any VAD model that did well with Urdu non-speech, so I'm stuck with a high WER.
I'm also getting some weird sentences coming out of nowhere, in Russian. I found this list of hallucinations as well: https://gist.github.com/waveletdeboshir/8bf52f04bf78018194f25b2390c08309
I tried to process a WAV file whose data section contains only zeroes. The file is 1.2 seconds long (attached).
Whisper.cpp produces a hallucination (and reports the wrong duration).
zeroes.zip
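For anyone who wants to reproduce this without the attachment, here is a minimal sketch that writes a comparable zero-filled WAV. The exact format of the attached file is not stated in this report, so 16 kHz, 16-bit, mono PCM is an assumption (it is the input format the whisper.cpp example programs expect), and the function name `write_zero_wav` and the output file name are hypothetical.

```python
import wave


def write_zero_wav(path, seconds=1.2, sample_rate=16000):
    """Write a mono 16-bit PCM WAV whose samples are all zero.

    Assumption: 16 kHz / 16-bit / mono, not confirmed by the report.
    Returns the number of frames written.
    """
    n_frames = round(seconds * sample_rate)
    with wave.open(path, "wb") as wf:
        wf.setnchannels(1)           # mono
        wf.setsampwidth(2)           # 2 bytes per sample = 16-bit PCM
        wf.setframerate(sample_rate)
        wf.writeframes(b"\x00\x00" * n_frames)  # every sample is 0
    return n_frames


if __name__ == "__main__":
    # Hypothetical output name; pass the resulting file to whisper.cpp.
    frames = write_zero_wav("zeroes_repro.wav")
    print(f"wrote {frames} zero samples")
```

The resulting file may not be byte-identical to the attached one, so treat it as an approximation of the test case rather than the exact input.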
I checked it on the latest master branch:
I think this is a bug.
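Until this is addressed, one possible stopgap is to detect digitally silent input before handing it to the transcriber at all. The sketch below is not part of whisper.cpp and is not a real VAD: it only checks whether every sample of a 16-bit PCM WAV is at or below a small absolute threshold. The function name `is_digital_silence` and the `threshold` parameter are hypothetical.

```python
import array
import sys
import wave


def is_digital_silence(path, threshold=0):
    """Return True if every sample in a 16-bit PCM WAV has |value| <= threshold.

    This is a plain amplitude check, not voice activity detection.
    """
    with wave.open(path, "rb") as wf:
        if wf.getsampwidth() != 2:
            raise ValueError("expected 16-bit PCM samples")
        raw = wf.readframes(wf.getnframes())
    samples = array.array("h", raw)   # signed 16-bit, native byte order
    if sys.byteorder == "big":
        samples.byteswap()            # WAV data is little-endian
    return all(abs(s) <= threshold for s in samples)


if __name__ == "__main__":
    # Usage: python check_silence.py input.wav
    path = sys.argv[1]
    if is_digital_silence(path):
        print("input is all zeroes; skipping transcription")
    else:
        print("input contains non-zero samples")
```

A check like this only catches exact or near-exact zero-filled input such as the file in this report. It does nothing for ordinary background noise, which is where the VAD problems mentioned in the comments come in.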