Streaming Output Repetition #1702
Comments
+1, experiencing the same thing. I wrote a silly push-to-talk thing for the cpp stream prototype and was seeing this:
[00:00:00.000 --> 00:00:02.000] Testing 1, 2, 3
whisper_full_with_state: input is too short - 690 ms < 1000 ms. consider padding the input audio with silence
whisper_full_with_state: input is too short - 690 ms < 1000 ms. consider padding the input audio with silence
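The warning in that log suggests padding short clips with silence. A minimal sketch of the idea, assuming the audio is a list of float samples at 16 kHz mono; `pad_with_silence` is a hypothetical helper for illustration, not part of whisper.cpp, and the 1000 ms minimum is taken from the log message rather than from any documented constant:

```python
SAMPLE_RATE = 16000  # assumption: 16 kHz mono input
MIN_MS = 1000        # minimum length quoted in the log message


def pad_with_silence(samples, min_ms=MIN_MS, sample_rate=SAMPLE_RATE):
    """Return a copy of `samples` extended with zeros up to `min_ms` of audio."""
    min_len = sample_rate * min_ms // 1000
    missing = max(0, min_len - len(samples))
    return list(samples) + [0.0] * missing
```

A 690 ms clip (11040 samples) would be extended to 16000 samples; clips that are already long enough are returned unchanged.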
I would like to use whisper.cpp to transcribe real-time audio and relay the transcript back to a user. I am using an M1 Mac.
Steps to reproduce:
git clone https://github.com/ggerganov/whisper.cpp.git
cd whisper.cpp
bash ./models/download-ggml-model.sh base.en
make base.en
brew install sdl2
make stream
To start the live transcription: ./stream -m ./models/ggml-base.en.bin --file output.txt
Here is the output when I play this video in the background:
[Start speaking]
Understand. It's difficult to overstate how important be
being mission driven is. So I want to emphasize it one last time. Derivative companies, companies that copy and existing ideas.
idea with very few new insights. Don't excite people and they don't compel the teams to work hard enough to be successful.
Paul Graham is going to talk about how to get startup ideas next week. It's something that a lot of founders struggle with.
with, but it's something I believe you can get better with it. Better out of practice. And it's definitely worth trying to get better at.
The hardest part about coming up with great ideas is that the best ideas often look terrible at the beginning.
The 13th search engine and without all the features of a web portal, most people thought that was pointless.
was done and anyway it didn't matter that much. Portals were where the value was at. The tenth social network and limited
only to college students with no money, also terrible. My space had won and who wants college students as customers. Or a way to stand is true.
strangers and couches. That just sounds terrible all around. These all sounded really bad, but they turned out to be good.
If they had sounded really good, there would have been too many people working on them.
Here is what is in output.txt:
Understand.
Understand. It's difficult to overstate how important be
being mission driven is, so I want to emphasize it one last time.
being mission driven is. So I want to emphasize it one last time. Derivative companies, companies that copy and existing ideas.
idea with very few new insights.
idea with very few new insights. Don't excite people and they don't compel the teams to work hard enough to be successful.
Paul Graham is going to talk about how to get started.
Paul Graham is going to talk about how to get startup ideas next week. It's something that a lot of founders struggle with.
with, but it's something I believe you can get better with it. Better out.
with, but it's something I believe you can get better with it. Better out of practice. And it's definitely worth trying to get better at.
The hardest part about coming up with great ideas.
The hardest part about coming up with great ideas is that the best ideas often look terrible at the beginning.
The 13th search engine and without all the features of a webinar.
The 13th search engine and without all the features of a web portal, most people thought that was pointless.
was done and anyway it didn't matter that much.
was done and anyway it didn't matter that much. Portals were where the value was at. The tenth social network and limited
only to college students with no money. Also terrible. MySpace@w
only to college students with no money, also terrible. My space had won and who wants college students as customers. Or a way to stand is true.
and strangers, couches. That just sounds terrible all around.
strangers and couches. That just sounds terrible all around. These all sounded really bad, but they turned out to be good.
If they had sounded really good, there would have been too many people.
If they had sounded really good, there would have been too many people working on them.
As you can see, there is a lot of repetition in the output file, which I think is ok for displaying live feedback, but I'm not sure how to go about deciding what should be the 'final' transcript. Any ideas?
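One possible way to post-process the file, sketched under assumptions: in the output.txt shown here, each partial line is immediately followed by a longer revision of the same text, though punctuation and a few words can differ between the two. The helper names below (`normalize`, `is_partial_of`, `final_lines`) and the 0.8 similarity threshold are hypothetical choices for illustration, not anything provided by whisper.cpp:

```python
import difflib
import re


def normalize(line):
    """Lowercase and drop punctuation so 'is, so' and 'is. So' compare equal."""
    return re.sub(r"[^a-z0-9 ]", "", line.lower()).strip()


def is_partial_of(candidate, following, threshold=0.8):
    """True if `candidate` looks like an earlier, shorter draft of `following`."""
    a, b = normalize(candidate), normalize(following)
    if not a or len(a) > len(b):
        return False
    # Compare the draft against the same-length prefix of the next line.
    matcher = difflib.SequenceMatcher(None, a, b[: len(a)], autojunk=False)
    return matcher.ratio() >= threshold


def final_lines(lines):
    """Keep only lines that are not superseded by the line right after them."""
    lines = [l.rstrip("\n") for l in lines if l.strip()]
    kept = []
    for i, line in enumerate(lines):
        nxt = lines[i + 1] if i + 1 < len(lines) else None
        if nxt is not None and is_partial_of(line, nxt):
            continue  # a longer revision follows, so drop this draft
        kept.append(line)
    return kept
```

It could be applied with something like `final_lines(open("output.txt"))`. The threshold is a guess and may need tuning; a fuzzy comparison is used because the partial and final lines are not exact prefixes of each other.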