-
❓ Questions and Help
Hey! I've been really loving how accurate the VADIterator is for 8 kHz telephony audio. Currently we use a static threshold of 0.85. This works well in most cases, but not when (1) there is a large amount of background noise, or (2) the environment is quiet but the speaker is also very quiet. Do you have any suggestions for how to implement relative thresholds to deal with these cases? Thanks again!
Replies: 1 comment 2 replies
-
Hi,
One approach may be to fiddle with VAD activation and deactivation thresholds separately.
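A minimal sketch of what separate thresholds could look like, assuming you already have one speech probability per audio chunk from the model. The function name `hysteresis_segments` and the `off_threshold` value of 0.6 are illustrative assumptions, not part of the library; only the 0.85 activation value comes from the question above.

```python
from typing import Iterable, List, Tuple


def hysteresis_segments(
    probs: Iterable[float],
    on_threshold: float = 0.85,   # activation threshold (value from the question)
    off_threshold: float = 0.6,   # deactivation threshold (assumed value, tune it)
) -> List[Tuple[int, int]]:
    """Return (start, end) chunk indices of speech segments; end is exclusive."""
    probs = list(probs)
    segments: List[Tuple[int, int]] = []
    start = None  # index where the current segment began, or None if inactive
    for i, p in enumerate(probs):
        if start is None and p >= on_threshold:
            # Speech begins only when the probability clears the higher bar.
            start = i
        elif start is not None and p < off_threshold:
            # Speech ends only when the probability drops below the lower bar.
            segments.append((start, i))
            start = None
    if start is not None:
        # Close a segment that is still open at the end of the stream.
        segments.append((start, len(probs)))
    return segments


example = [0.1, 0.9, 0.7, 0.65, 0.3, 0.2, 0.88, 0.9]
print(hysteresis_segments(example))
```

With a single threshold of 0.85 the first segment would end right after the 0.9 chunk; with the lower deactivation threshold it keeps going through the 0.7 and 0.65 chunks.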
Another approach may be to keep an accumulation buffer that postpones VAD activation until the speech is confirmed; once activated, the segment starts from an earlier point in the buffer rather than from the moment of confirmation.
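One way the buffering idea might be sketched, again working on per-chunk speech probabilities rather than on the library itself. The names `delayed_activation_segments`, `confirm_chunks`, and `lookback_chunks` are hypothetical, and the "N consecutive chunks above threshold" rule is just one possible confirmation criterion.

```python
from typing import Iterable, List, Tuple


def delayed_activation_segments(
    probs: Iterable[float],
    threshold: float = 0.85,
    confirm_chunks: int = 3,    # assumed: chunks in a row needed to confirm speech
    lookback_chunks: int = 1,   # assumed: extra chunks to include before the run
) -> List[Tuple[int, int]]:
    """Return (start, end) chunk indices of speech segments; end is exclusive."""
    probs = list(probs)
    segments: List[Tuple[int, int]] = []
    run_start = None     # where the current above-threshold run began
    active_start = None  # backdated start of the confirmed segment
    for i, p in enumerate(probs):
        if active_start is None:
            if p >= threshold:
                if run_start is None:
                    run_start = i
                if i - run_start + 1 >= confirm_chunks:
                    # Confirmed: backdate the start, but never overlap
                    # the previous segment.
                    prev_end = segments[-1][1] if segments else 0
                    active_start = max(prev_end, run_start - lookback_chunks)
            else:
                # Run broken before confirmation: treat it as a noise spike.
                run_start = None
        elif p < threshold:
            segments.append((active_start, i))
            active_start = None
            run_start = None
    if active_start is not None:
        segments.append((active_start, len(probs)))
    return segments


example = [0.1, 0.2, 0.9, 0.2, 0.5, 0.9, 0.9, 0.9, 0.9, 0.1]
print(delayed_activation_segments(example))
```

The isolated 0.9 at index 2 never triggers, while the sustained run starting at index 5 is confirmed three chunks later and then reported from one chunk before it began.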
The VAD normalizes audio internally, so in a sense both of these cases are very similar.
As for the quiet audio: if all of the audio is quiet, you can try re-normalizing it and/or using some naive median rejection algorithm, where there is some median energy threshold where every…