
❓ Calling Silero VAD model from Huggingface datasets.map #385

Answered by snakers4
wjassim asked this question in Q&A

and make it run with multiprocessing

The correct low-level way to run VAD with multiple Python processes is as follows:

  • Have a separate init function that is invoked at the start of EACH process. This function should load the VAD and, if necessary, set the number of CPU threads for PyTorch. The same applies to ONNX. Please note that both PyTorch and ONNX models are not Python objects, but merely pointers to the underlying objects;

  • Python has many APIs for multiprocessing (Process, ProcessPool, ProcessPoolExecutor, etc.). Many of them accept a custom init function parameter. The key here is NOT to reuse the same pointer, but to create truly separate model instances for each process;
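The two points above could be sketched roughly as follows, using the `initializer` argument of `concurrent.futures.ProcessPoolExecutor`. This is only an illustration under assumptions, not the exact setup from the answer: `DummyVAD`, `load_vad`, `init_worker`, `detect` and `run` are hypothetical names, and the dummy model is a stand-in so the sketch runs without PyTorch. In real use, `load_vad` would load Silero VAD instead (see the comment inside it).

```python
import os
from concurrent.futures import ProcessPoolExecutor

# Per-process global. Each worker process fills in its own copy in init_worker().
_model = None


class DummyVAD:
    """Hypothetical stand-in for a real VAD model (not the Silero API)."""

    def __init__(self):
        # Remember which process created this instance.
        self.owner_pid = os.getpid()

    def __call__(self, chunk):
        # Toy rule: "speech" if the mean absolute amplitude exceeds 0.1.
        return sum(abs(x) for x in chunk) / max(len(chunk), 1) > 0.1


def load_vad():
    # Assumption: with PyTorch installed, this is where one would do something like
    #   torch.set_num_threads(1)
    #   model, utils = torch.hub.load("snakers4/silero-vad", "silero_vad")
    # and return the model. The dummy keeps the sketch dependency-free.
    return DummyVAD()


def init_worker():
    """Runs once at the start of EACH worker process."""
    global _model
    _model = load_vad()


def detect(chunk):
    """Runs inside a worker and uses that worker's own model instance."""
    return os.getpid(), _model.owner_pid, _model(chunk)


def run(chunks, workers=2):
    # The initializer creates a separate model per process, so no model
    # pointer is ever shared or pickled across processes.
    with ProcessPoolExecutor(max_workers=workers, initializer=init_worker) as pool:
        return list(pool.map(detect, chunks))
```

Calling `run(chunks)` from under an `if __name__ == "__main__":` guard returns one `(worker_pid, model_owner_pid, is_speech)` tuple per chunk. The first two values match in every tuple, which shows that each worker used a model it created itself.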

I’…

Answer selected by snakers4

This discussion was converted from issue #384 on October 14, 2023 05:04.