Getting inconsistent outputs on passing multiple audio chunks parallelly #358

Answered by snakers4
thakurudit asked this question in Q&A
Hi,

Looks like the explanation is simple in this case.
Each parallel stream (thread, socket, or worker) should have its own instance of the VAD, since the VAD is not stateless.
That is why it has a reset_states() method.
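
To illustrate why a shared instance gives inconsistent outputs, here is a minimal sketch. It does not use the real VAD model: ToyStatefulVAD is a hypothetical stand-in whose output depends on previous chunks (a simple exponential average of chunk energy), and only the reset_states() name mirrors the method mentioned above. The stream data and the run_stream helper are made up for the example.

```python
from concurrent.futures import ThreadPoolExecutor


class ToyStatefulVAD:
    """Hypothetical stand-in for a stateful VAD; NOT the real model."""

    def __init__(self, alpha=0.5):
        self.alpha = alpha
        self.reset_states()

    def reset_states(self):
        # Forget everything seen so far, e.g. before starting an unrelated stream.
        self._smoothed = 0.0

    def __call__(self, chunk):
        # The output depends on the current chunk AND on all previous chunks.
        energy = sum(x * x for x in chunk) / len(chunk)
        self._smoothed = self.alpha * energy + (1 - self.alpha) * self._smoothed
        return self._smoothed


def run_stream(chunks):
    # One private instance per stream, so streams never touch each other's state.
    vad = ToyStatefulVAD()
    return [vad(chunk) for chunk in chunks]


# Made-up example data: two independent streams of tiny "audio" chunks.
stream_a = [[1.0, 1.0], [0.0, 0.0], [1.0, 1.0]]
stream_b = [[0.0, 0.0], [2.0, 2.0], [0.0, 0.0]]

# Reference: each stream processed on its own.
expected_a = run_stream(stream_a)
expected_b = run_stream(stream_b)

# Parallel processing with one instance per stream reproduces the reference.
with ThreadPoolExecutor(max_workers=2) as pool:
    parallel_a, parallel_b = pool.map(run_stream, [stream_a, stream_b])

# Counter-example: a single shared instance fed chunks from both streams.
shared = ToyStatefulVAD()
interleaved_a = []
for chunk_a, chunk_b in zip(stream_a, stream_b):
    interleaved_a.append(shared(chunk_a))
    shared(chunk_b)  # stream B's chunk leaks into the state used for stream A

print(parallel_a == expected_a)     # per-stream instances agree with the reference
print(interleaved_a == expected_a)  # the shared instance does not
```

With the real model the same pattern would apply: create one instance per stream, and call reset_states() before reusing an instance for a new, unrelated stream.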

The VAD, when detecting speech in a streaming fashion, keeps its internal state.
There are ways to invoke the VAD in a batched fashion while keeping the state, but judging by user feedback, we discourage this use case because of its sheer complexity and the errors it causes.

A workaround for threads / sockets / workers may be as follows. You can store the state for each stream and pass it back to a unified worker, as we basically do in the ONNX example.
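
A rough sketch of that workaround, under stated assumptions: the state is kept outside the model, keyed by stream, and handed back on every call. Nothing here is the real ONNX example. The vad_step function, the UnifiedWorker class, the stream ids, and the toy "energy average" state are all hypothetical and only show the store-and-pass-back pattern.

```python
def vad_step(chunk, state, alpha=0.5):
    """Hypothetical stateless step: takes the previous state explicitly
    and returns (output, new_state) instead of mutating anything."""
    energy = sum(x * x for x in chunk) / len(chunk)
    new_state = alpha * energy + (1 - alpha) * state
    return new_state, new_state


class UnifiedWorker:
    """Single worker serving many streams; state is stored per stream id."""

    def __init__(self):
        self._states = {}

    def process(self, stream_id, chunk):
        # Look up this stream's stored state (fresh streams start from 0.0).
        state = self._states.get(stream_id, 0.0)
        output, new_state = vad_step(chunk, state)
        # Store the updated state so the next chunk of this stream continues from it.
        self._states[stream_id] = new_state
        return output

    def reset(self, stream_id):
        # Drop the stored state when a stream ends or restarts.
        self._states.pop(stream_id, None)


# Made-up example data: chunks from two streams arrive interleaved.
stream_a = [[1.0, 1.0], [0.0, 0.0], [1.0, 1.0]]
stream_b = [[0.0, 0.0], [2.0, 2.0], [0.0, 0.0]]

worker = UnifiedWorker()
outputs_a, outputs_b = [], []
for chunk_a, chunk_b in zip(stream_a, stream_b):
    outputs_a.append(worker.process("a", chunk_a))
    outputs_b.append(worker.process("b", chunk_b))

print(outputs_a)
print(outputs_b)
```

Because each call receives only its own stream's state, interleaving chunks from different streams does not change the per-stream results. If the worker is called from several threads, access to the state dictionary would additionally need a lock.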

Answer selected by snakers4
Category: Q&A
Labels: help wanted
3 participants

This discussion was converted from issue #357 on July 24, 2023 10:00.