❓ vad bad performance on child speech? #52

garymmi · 2021-04-01T08:18:59Z

I have a 16 kHz audio file of child speech.
After running the VAD, I got
start=0.875, dur=1.375
as shown in the picture.

I used model='silero_vad' and called the API:
speech_timestamps = get_speech_ts(wav_data, model, num_steps=4)

Here is the audio file:
https://drive.google.com/file/d/1yivG8OE77TyfJE_KL2IYl-Ltm5cFujnX/view?usp=sharing

Could you help me see what's wrong, or teach me how to fine-tune the parameters?
Thanks

snakers4 · 2021-04-01T08:49:50Z

Maintainer

Several remarks here:

We do not have much children's speech per se in the training data.
This sample just looks too short to be cut by a VAD.
We were planning to release an adaptive post-processing tweak that would make parameter tuning obsolete.
You can use this snippet to tune the parameters, although this should be done on longer audio files, because this one is too short to be meaningful, imo.

speech_timestamps = get_speech_ts(wav, model,
                                  num_samples_per_window=2000,
                                  num_steps=8,
                                  visualize_probs=True)

This snippet produces this image
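
For a feel of what these two values mean in time units, here is a minimal sketch. It assumes 16 kHz audio (as in the question) and assumes the window is advanced in hops of num_samples_per_window / num_steps; the helper name window_timing is hypothetical and is not part of silero-vad.

```python
def window_timing(num_samples_per_window, num_steps, sample_rate=16000):
    """Return (window_ms, hop_ms) for a sliding-window VAD.

    Assumption: the window advances by num_samples_per_window / num_steps
    samples per step. Illustrative helper only, not a silero-vad function.
    """
    if num_samples_per_window % num_steps != 0:
        raise ValueError("num_samples_per_window should be divisible by num_steps")
    hop = num_samples_per_window // num_steps  # samples between window starts
    window_ms = 1000.0 * num_samples_per_window / sample_rate
    hop_ms = 1000.0 * hop / sample_rate
    return window_ms, hop_ms


# Values from the snippet above: 2000-sample window, 8 steps.
print(window_timing(2000, 8))
```

Under these assumptions the call prints (125.0, 15.625), i.e. a 125 ms window evaluated roughly every 16 ms. A clip of a second or two yields only a handful of such windows, which is probably why tuning on longer files is more informative.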

2 replies

snakers4 Apr 1, 2021
Maintainer

Also, tuning these may be a good idea:

min_speech_samples - minimum speech chunk duration, in samples
min_silence_samples - minimum silence duration, in samples, between two separate speech chunks
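
To make the effect of these two thresholds concrete, the sketch below applies them to a list of raw (start, end) speech segments given in samples. It is a generic post-processing illustration based on the descriptions above, not silero-vad's actual implementation; the function postprocess_segments and the example values are hypothetical.

```python
def postprocess_segments(segments, min_speech_samples, min_silence_samples):
    """Merge segments split by short silences, then drop short segments.

    segments: list of (start, end) sample indices, sorted by start.
    Hypothetical helper for illustration; not part of silero-vad.
    """
    merged = []
    for start, end in segments:
        if merged and start - merged[-1][1] < min_silence_samples:
            # Gap is shorter than the minimum silence: treat as one chunk.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    # Keep only chunks long enough to count as speech.
    return [(s, e) for s, e in merged if e - s >= min_speech_samples]


# Made-up raw detections, in samples.
raw = [(0, 6000), (6300, 16000), (30000, 32000)]
print(postprocess_segments(raw, min_speech_samples=10000, min_silence_samples=500))
```

With these made-up values the first two segments are merged across the 300-sample gap and the 2000-sample segment is discarded, so the call prints [(0, 16000)]. Lowering both thresholds keeps short utterances as separate chunks, which may matter for brief child speech samples.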

garymmi Apr 1, 2021
Author

I got it, thank you very much
