Is there any function to detect speech and silence in Silero-Vad? #201

Answered by snakers4
harshraj3223 asked this question in Q&A
Hi,

You can rewrite this function to return a single bool value indicating whether speech was detected:

silero-vad/utils_vad.py

Lines 119 to 130 in 7c671a7

def get_speech_timestamps(audio: torch.Tensor,
                          model,
                          threshold: float = 0.5,
                          sampling_rate: int = 16000,
                          min_speech_duration_ms: int = 250,
                          min_silence_duration_ms: int = 100,
                          window_size_samples: int = 1536,
                          speech_pad_ms: int = 30,
                          return_seconds: bool = False,
                          visualize_probs: bool = False):
    """

If you would rather not modify this function, you can write a wrapper around it that returns True when the amount of detected speech exceeds a threshold you specify.
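A rough sketch of such a wrapper is below. It is only an illustration, not code from the repo: the name `contains_speech` and the `min_total_speech_ms` parameter are made up for this example, and `get_speech_timestamps` is passed in as an argument so the snippet does not depend on how you load the utils. It assumes the function returns a list of dicts with `start` and `end` given in samples, which is what it does when `return_seconds` is left at False.

```python
from typing import Any, Callable, Dict, List


def contains_speech(audio: Any,
                    model: Any,
                    get_speech_timestamps: Callable[..., List[Dict[str, int]]],
                    sampling_rate: int = 16000,
                    min_total_speech_ms: float = 250.0,
                    **vad_kwargs: Any) -> bool:
    """Hypothetical helper: True if the total detected speech is long enough.

    `get_speech_timestamps` is the function from utils_vad.py; extra keyword
    arguments (threshold, min_silence_duration_ms, ...) are forwarded to it.
    """
    # The duration math below assumes sample offsets, not seconds.
    if vad_kwargs.get("return_seconds"):
        raise ValueError("return_seconds must stay False for this wrapper")

    timestamps = get_speech_timestamps(audio, model,
                                       sampling_rate=sampling_rate,
                                       **vad_kwargs)

    # Sum the length of every detected speech chunk, in samples.
    total_samples = sum(ts["end"] - ts["start"] for ts in timestamps)

    # Convert samples to milliseconds and compare with the chosen minimum.
    total_ms = 1000.0 * total_samples / sampling_rate
    return total_ms >= min_total_speech_ms
```

You would call it as `contains_speech(wav, model, get_speech_timestamps)` with the model and utils loaded the usual way. An empty list of timestamps gives a total of 0 ms, so the result is False for pure silence. Setting `min_total_speech_ms` to 0 would always return True, so keep it above zero.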

Also note that th…

Answer selected by snakers4