-
Hello. Any chance of running inference on GPU? After several tries I got an error saying that the model is quantized. Maybe you could share a non-quantized version?
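A minimal sketch of the kind of attempt that fails this way, assuming the standard torch.hub entry point from the repo README (the exact failure point and error text may vary by PyTorch version):

```python
import torch

# Load the model via the standard torch.hub entry point
model, utils = torch.hub.load(repo_or_dir='snakers4/silero-vad',
                              model='silero_vad')

# The published model is quantized, and quantized ops are CPU-only
# in PyTorch, so moving it to GPU raises an error
try:
    model.to('cuda')
except RuntimeError as e:
    print(f'Cannot move the quantized model to GPU: {e}')
```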
Replies: 7 comments
-
Ah, sorry, forgot to mention: "silero-vad".
-
Well, these models are very small and they were designed to run on CPU.
Running them on GPU will not really provide any tangible speed / throughput benefits.
We could of course publish the non-quantized versions of these models, but this would make the repo larger and we would have to maintain two versions of each model in parallel (so far we have tried to keep this repo as minimal as possible).
So the main question is: why?
-
If these models show no significant difference in performance between GPU and quantized CPU inference, it makes no sense to publish a non-quantized version of the model. Just curious, what kind of GPU did you use for the performance measurements?
-
https://github.com/snakers4/silero-vad#performance-metrics
We tested on 1 CPU thread.
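For anyone wanting to reproduce that setup, a rough sketch, assuming the torch.hub entry point and the get_speech_timestamps / read_audio helpers exposed via utils in the README (the audio path is a placeholder):

```python
import time
import torch

torch.set_num_threads(1)  # match the single-thread benchmark setup

model, utils = torch.hub.load(repo_or_dir='snakers4/silero-vad',
                              model='silero_vad')
(get_speech_timestamps, _, read_audio, *_) = utils

wav = read_audio('example.wav', sampling_rate=16000)  # placeholder path

start = time.perf_counter()
speech_timestamps = get_speech_timestamps(wav, model, sampling_rate=16000)
elapsed = time.perf_counter() - start

print(f'VAD over {len(wav) / 16000:.1f}s of audio took {elapsed:.3f}s on 1 CPU thread')
```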
-
Yes, I have read about the CPU experiments.
-
I am not sure we measured this quality difference directly.
In any case, speed is not a bottleneck here, and quality most likely is not one either.
-
Will close this for now.