VAD and whisper-timestamped #30
Comments
Hi, thanks for the feedback.
whisper_streaming/whisper_online.py
Line 136 in 23c2d56
SILERO vs AUDITOK is a topic for another issue. I don't have feedback on it.
However, I realized that VAD is currently used inefficiently: on every update it is run over the whole buffer. It could instead be used to cut silence out of the buffer, so that the next update is faster. This could be improved.
@Jeronymous, please open an issue about this if you have test results to share.
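To illustrate the idea of cutting silence out of the buffer so that the next update has less audio to process, here is a minimal sketch. It is not the code from whisper_online.py: the function names, the 30 ms frame size, and the RMS energy threshold are all assumptions made for illustration, and the energy check only stands in for a real VAD such as Silero. It also only trims leading silence, and it advances a time offset so that timestamps can still be mapped back to the original stream.

```python
import math


def frame_rms(samples):
    """Root-mean-square energy of a list of float samples."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def trim_leading_silence(buffer, offset_s, sample_rate=16000,
                         frame_ms=30, threshold=0.01):
    """Drop leading frames whose energy is below `threshold`.

    Hypothetical helper: an energy threshold stands in for a real VAD.
    Returns (trimmed_buffer, new_offset_s), where new_offset_s is the
    stream time in seconds of the first sample that was kept.
    """
    frame_len = int(sample_rate * frame_ms / 1000)
    start = 0
    # Advance one full frame at a time while the frame looks silent.
    while start + frame_len <= len(buffer):
        if frame_rms(buffer[start:start + frame_len]) >= threshold:
            break
        start += frame_len
    return buffer[start:], offset_s + start / sample_rate


# One second of silence followed by half a second of "speech".
audio = [0.0] * 16000 + [0.5] * 8000
trimmed, offset = trim_leading_silence(audio, offset_s=0.0)
print(len(audio), len(trimmed), offset)
```

With something like this, each update would run on the shortened buffer instead of re-processing silence that has already been seen. Removing silence from the middle of the buffer would need extra bookkeeping to keep word timestamps consistent, which this sketch does not attempt.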
First, thank you. I am super happy to see whisper-timestamped used in such a good project.
Having Whisper streamed in real time is a super feature!
I see here that VAD is not available when using the whisper-timestamped backend:
whisper_streaming/whisper_online.py
Lines 79 to 80 in 23c2d56
But VAD IS implemented in whisper-timestamped (it was there even before faster-whisper integrated it). It's currently based on SILERO (the same approach as in faster-whisper).
Am I missing a sticking point? (Maybe the fact that the dependencies required for VAD are not in the default requirements?)
I can contribute if help is needed on this.
(VAD is important to prevent some hallucinations of Whisper models and to make timestamps more accurate.)
Also, I want to mention:
After being disappointed with weird results on some files, I opened a branch to replace SILERO with AUDITOK: linto-ai/whisper-timestamped#78 (see the linked issue for an illustration of possible "hallucinations" of Silero).
I have had a good experience with Auditok. I was hoping for some user feedback to confirm this before merging into master. But as none is coming, maybe we just need to establish a benchmark to confirm the improvement.