Question: configure voice detection delay/debounce for transcribe-stream? #61

Open
hiinaspace opened this issue Jul 28, 2021 · 0 comments


Hi, I'm using voice2json transcribe-stream with short one-word commands to control a multirotor drone (e.g. "left", "right", "up"). Ideally the detector would respond as soon as possible after a word, but currently voice2json seems to wait a minimum of 2 seconds after it detects a voice before passing the audio to the transcriber, judging by the 'end time' of the tokens object. Furthermore, if there's significant background noise (say, a buzzing quadcopter), voice2json keeps recording for up to 15 seconds before handing the audio off for transcription and emitting the JSON line.

Is there any way to configure the min/max delay for commands? I tried the --timeout option, but even with --timeout 0 the latency from utterance to JSON line seems unchanged.
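
To put a number on the delay described above, here is a minimal sketch that computes how much audio was recorded after the last recognized word ended. It assumes each output line is a JSON object with a `tokens` list whose entries carry an `end_time` in seconds, plus a top-level `wav_seconds` field giving the length of the recorded audio. Those field names are assumptions and are not confirmed anywhere in this issue; `trailing_audio_seconds` is a hypothetical helper name.

```python
import json
from typing import Optional


def trailing_audio_seconds(json_line: str) -> Optional[float]:
    """Return the seconds of audio recorded after the last token ended.

    Assumed (unverified) line shape:
      {"wav_seconds": 2.5, "tokens": [{"token": "left", "end_time": 0.5}]}
    Returns None when the line has no tokens or no wav_seconds field.
    """
    event = json.loads(json_line)
    tokens = event.get("tokens") or []
    wav_seconds = event.get("wav_seconds")
    if not tokens or wav_seconds is None:
        return None
    # The latest end_time marks where recognized speech stopped.
    last_end = max(token["end_time"] for token in tokens)
    return wav_seconds - last_end
```

Feeding each line printed by transcribe-stream through this function would show whether the extra recording time stays near the 2 second floor or grows toward 15 seconds when the background is noisy.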
