Whisper stt overhaul js #6194

Merged
merged 61 commits into from
Jul 2, 2024


TimStrauven
Contributor

A while ago, in #5563, I created an overhaul of whisper that makes it easy to load the model on either GPU or CPU.
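To illustrate the GPU/CPU choice, here is a minimal sketch, not the code from #5563. It assumes the `openai-whisper` package, whose `whisper.load_model` accepts a `device` argument; the helper names `resolve_device` and `load_whisper` and the `"auto"` preference are hypothetical and only for illustration.

```python
from typing import Optional


def resolve_device(preference: str = "auto", cuda_available: Optional[bool] = None) -> str:
    """Map a user preference ("auto", "cuda", "gpu", "cpu") to a device string.

    Hypothetical helper for illustration; not taken from the PR.
    """
    preference = preference.strip().lower()
    if preference not in {"auto", "cuda", "gpu", "cpu"}:
        raise ValueError(f"unknown device preference: {preference!r}")

    # An explicit CPU request always wins.
    if preference == "cpu":
        return "cpu"

    # Probe for CUDA only when the caller did not tell us.
    if cuda_available is None:
        try:
            import torch
            cuda_available = torch.cuda.is_available()
        except ImportError:
            cuda_available = False

    # "auto", "cuda" and "gpu" all fall back to CPU when CUDA is missing.
    return "cuda" if cuda_available else "cpu"


def load_whisper(model_name: str = "base", preference: str = "auto"):
    """Load a Whisper model on the resolved device (assumes openai-whisper is installed)."""
    import whisper  # imported lazily so resolve_device works without the package

    return whisper.load_model(model_name, device=resolve_device(preference))
```

Falling back to CPU when CUDA is unavailable is a design choice made for this sketch; an extension could just as well raise an error instead.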

Following the discussion and issues raised in #5929, this PR is meant as a merged solution: it addresses the issue @mamei16 initially tried to solve (including an extra record button next to the default generate button) and also builds on the solution @RandomInternetPreson presented in #6189.

The Gradio Audio component was abandoned because of the issues above and replaced by a JS implementation.


oobabooga and others added 27 commits February 25, 2024 14:29
@oobabooga oobabooga merged commit 8074fba into oobabooga:dev Jul 2, 2024
@oobabooga
Owner

oobabooga commented Jul 2, 2024

Amazing work, I have tested it and it works perfectly. The code is clean, the UI works well, just perfect.

Combined with the coqui_tts extension or with @erew123's AllTalk TTS, this should be a fun experience now. A tip is to leave the button active and start/stop recording with Enter.


I have added @RandomInternetPreson as a PR coauthor for acknowledgement.

PoetOnTheRun pushed a commit to PoetOnTheRun/text-generation-webui that referenced this pull request Oct 22, 2024
---------

Co-authored-by: RandoInternetPreson <[email protected]>
2 participants