-
Hello. Any chance of running inference on GPU? After several tries I got an error saying that the model is quantized. Maybe you could share a non-quantized version?
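A minimal sketch of the kind of attempt that fails this way, assuming the standard torch.hub entry point from the repo README (the exact failure point and error text may vary by PyTorch version):

```python
import torch

# Load the model via the standard torch.hub entry point
model, utils = torch.hub.load(repo_or_dir='snakers4/silero-vad',
                              model='silero_vad')

# The published model is quantized, and quantized ops are CPU-only
# in PyTorch, so moving it to GPU raises an error
try:
    model.to('cuda')
except RuntimeError as e:
    print(f'Cannot move the quantized model to GPU: {e}')
```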
Replies: 7 comments
-
Ah, sorry, forgot to mention: "silero-vad".
-
Well, these models are very small and they were designed to run on CPU.
Running them on GPU will not really provide any tangible speed / throughput benefits.
We could of course publish the non-quantized versions of these models, but this would make the repo larger and we would have to maintain two versions of each model in parallel (so far we have tried to keep this repo as minimal as possible).
So the main question is: why?
-
If these models show no significant difference in performance between GPU and quantized CPU inference, it makes no sense to publish a non-quantized version of the model. Just curious, what kind of GPU did you use for the performance measurements?
-
https://github.com/snakers4/silero-vad#performance-metrics
We tested on 1 CPU thread.
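For anyone wanting to reproduce that setup, a rough sketch, assuming the torch.hub entry point and the get_speech_timestamps / read_audio helpers exposed via utils in the README (the audio path is a placeholder):

```python
import time
import torch

torch.set_num_threads(1)  # match the single-thread benchmark setup

model, utils = torch.hub.load(repo_or_dir='snakers4/silero-vad',
                              model='silero_vad')
(get_speech_timestamps, _, read_audio, *_) = utils

wav = read_audio('example.wav', sampling_rate=16000)  # placeholder path

start = time.perf_counter()
speech_timestamps = get_speech_timestamps(wav, model, sampling_rate=16000)
elapsed = time.perf_counter() - start

print(f'VAD over {len(wav) / 16000:.1f}s of audio took {elapsed:.3f}s on 1 CPU thread')
```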
-
Yes, I have read about the CPU experiments.
-
I am not sure we measured this quality difference directly.
In any case, speed is not a bottleneck here, and quality most likely is not one either.
-
Will close this for now.