Live Transcription speed greatly affected when adding new Client via web browser #271

Open
sonclark opened this issue Aug 30, 2024 · 4 comments

@sonclark

I am creating a live transcription webpage that connects directly to the WhisperLive server via WebSocket. For a single client, the performance is great (latency under 1 second). When I add another client (by opening the webpage in another browser), transcription slows down dramatically (latency of up to 30 seconds).

I have tested some setups to narrow down the possible issue.
Setup 1:

  • 2 WhisperLive clients from the demo connected to the same WhisperLive server. Minimal effect on performance; both clients get near-instant transcription.

Setup 2:

  • 1 WhisperLive client from the demo connected to the WhisperLive server. Performance is great, with near-instant processing time.
  • Add a new client from the webpage, which makes the server initialize a new client. The webpage client has not sent any data yet. The WhisperLive client's performance is greatly affected: transcriptions now come back after around 30 seconds.
  • Disconnect the webpage client. The WhisperLive client's performance slowly recovers to its original speed.

After adding more logging to the server, I noticed that the step that slows down is the transcriber's feature extractor.
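For anyone who wants to take the same kind of measurement, here is a minimal sketch of per-stage timing. It is not the logging I actually added: `timed`, `stage_timings`, and `fake_feature_extractor` are hypothetical names, and the fake extractor only stands in for the real feature extraction step.

```python
import time
from collections import defaultdict
from contextlib import contextmanager

# Hypothetical helper: wall-clock durations collected per named stage.
stage_timings = defaultdict(list)

@contextmanager
def timed(stage):
    """Record how long the wrapped block takes under the given stage name."""
    start = time.perf_counter()
    try:
        yield
    finally:
        stage_timings[stage].append(time.perf_counter() - start)

def fake_feature_extractor(samples):
    # Stand-in for the real feature extraction step (assumption, not WhisperLive code).
    time.sleep(0.01)
    return [s * 0.5 for s in samples]

with timed("feature_extractor"):
    features = fake_feature_extractor([0.0, 1.0, 2.0])

for stage, durations in stage_timings.items():
    print(f"{stage}: slowest of {len(durations)} call(s) took {max(durations):.3f}s")
```

Wrapping each stage of the transcription path this way and comparing the numbers with one client versus two should show which stage absorbs the extra latency.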

I want to understand how adding a client in this manner could affect performance so much.

@makaveli10
Collaborator

makaveli10 commented Aug 30, 2024

Yes, that is expected because we initialize a new model for every new client, so batching would certainly help. But 30 seconds is something that I haven't seen even with 4 clients connected simultaneously. It could be the GPU; which GPU are you running the server on?

@sonclark
Author

@makaveli10 I am running on an RTX 4060. When I run 3-4 clients locally using the TranscriptionClient class, there does not seem to be that much latency. It is especially bad when I connect via the browser (the latency appears right after the server responds with the ready status). I am still testing different setups.

@sonclark
Author

sonclark commented Sep 11, 2024

@makaveli10 after checking a few combinations, it does not seem to be caused by connecting via the browser. It seems that if I initialize a new client (using the TranscriptionClient class) without running it, it affects the existing running clients. When I connect directly to the WhisperLiveServer (not through TranscriptionClient) and let the connection sit without sending any data, I observe the same behavior.

I have looked into the source code but still cannot understand how that could be the case. I hope you can give me some insight into this.

@BasselAshi

Have you tried setting single_model to True?
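To illustrate the idea behind that suggestion, here is a rough sketch of the difference between loading one model per client and sharing a single model across clients. It is not WhisperLive's actual implementation: `DummyModel` and `ModelPool` are hypothetical, and only the `single_model` name comes from this thread.

```python
import threading

class DummyModel:
    """Hypothetical stand-in for a speech model that is expensive to load."""
    instances = 0  # counts how many times a model was "loaded"

    def __init__(self):
        DummyModel.instances += 1

    def transcribe(self, audio):
        return f"{len(audio)} samples"

class ModelPool:
    """Hands out a model to each new client (illustrative only)."""

    def __init__(self, single_model):
        self.single_model = single_model
        self._shared = None
        self._lock = threading.Lock()

    def model_for_new_client(self):
        if not self.single_model:
            # One fresh model per client: every connection pays the load cost.
            return DummyModel()
        with self._lock:
            # Load once, then reuse the same instance for every client.
            if self._shared is None:
                self._shared = DummyModel()
            return self._shared

per_client = ModelPool(single_model=False)
models_a = [per_client.model_for_new_client() for _ in range(3)]
count_per_client = DummyModel.instances

DummyModel.instances = 0
shared = ModelPool(single_model=True)
models_b = [shared.model_for_new_client() for _ in range(3)]
count_shared = DummyModel.instances

print(count_per_client, count_shared)  # prints "3 1"
```

With a shared instance, a newly connected client that has not sent any audio yet would not trigger another model load. In a real server, calls into the shared model would likely also need to be serialized, which this sketch leaves out.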


3 participants