How to turn the audio received through the AudioReceiveHandler
into mono and little-endian?
#1857
Unanswered
moeux asked this question in Questions and Help
Replies: 1 comment 4 replies
-
You should post your VOSK code too. I'd also suggest looking at Sphinx as a possible alternative. You need to track down the exact format expected by VOSK:
Honestly, the lack of documentation for this library kinda sucks.
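As a starting point on the format question: Vosk's recognizer generally expects 16-bit signed little-endian mono PCM, at whatever sample rate is passed to its Recognizer constructor. A minimal sketch (plain JDK, no Vosk code; the class and method names here are my own) of converting JDA's 48 kHz, 16-bit, stereo, big-endian output into that layout:

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class AudioConvert {
    /**
     * Converts 16-bit signed big-endian stereo PCM (JDA's receive format)
     * into 16-bit signed little-endian mono PCM by averaging both channels.
     */
    public static byte[] toMonoLittleEndian(byte[] stereoBigEndian) {
        ByteBuffer in = ByteBuffer.wrap(stereoBigEndian).order(ByteOrder.BIG_ENDIAN);
        ByteBuffer out = ByteBuffer.allocate(stereoBigEndian.length / 2)
                                   .order(ByteOrder.LITTLE_ENDIAN);
        while (in.remaining() >= 4) {   // one stereo frame = left + right sample
            short left = in.getShort();
            short right = in.getShort();
            // average in int arithmetic to avoid short overflow, then narrow
            out.putShort((short) ((left + right) / 2));
        }
        return out.array();
    }
}
```

The returned array could then be fed to the recognizer, e.g. something like recognizer.acceptWaveForm(mono, mono.length) with a Recognizer built for 48000 Hz; if the binding's signature differs, the byte layout above is still the part that matters.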
-
Question
Hi, I've been struggling with this for a few days, with no answers from the Discord server nor from StackOverflow:
How can I turn the audio received through the AudioReceiveHandler into little-endian and mono? I'm using the Vosk API for speech recognition, and its method for feeding in data only accepts byte, float and short arrays. I've printed the audio format in the AudioReceiveHandler, and it reports: PCM_SIGNED 48000.0 Hz, 16 bit, stereo, 2 bytes/frame, big-endian. I was told that Vosk only accepts mono and little-endian in order to work.
Example Code
I've tried a few methods.
Converting it using AudioSystem threw an exception.
I've also tried messing with the ByteBuffer and tried to turn the bytes into shorts after I had seen how OpusPacket#getAudioData() works, attempting to reverse that step, so to speak. After digging around even more, I tried using encoded audio and OpusPacket#decode() directly. Neither of those attempts threw an exception, but Vosk was still returning empty results, a sign that the data is not in the right format.
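On the AudioSystem attempt specifically: the stock JDK converters can usually flip PCM endianness, but a stereo-to-mono channel conversion is often not available from the default providers, which could explain the exception. A sketch of the endianness-only conversion (class name and constants are mine; mixing down to mono would still have to be done by hand afterwards):

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioInputStream;
import javax.sound.sampled.AudioSystem;

public class EndianConvert {
    // JDA's documented receive format: 48 kHz, 16-bit, stereo, signed, big-endian
    static final AudioFormat SOURCE = new AudioFormat(48000f, 16, 2, true, true);
    // Same format, little-endian. Note the channel count stays at 2:
    // requesting a stereo-to-mono conversion here is what typically fails
    // with the default providers.
    static final AudioFormat TARGET = new AudioFormat(48000f, 16, 2, true, false);

    public static byte[] swapEndianness(byte[] bigEndianPcm) throws IOException {
        AudioInputStream source = new AudioInputStream(
                new ByteArrayInputStream(bigEndianPcm), SOURCE,
                bigEndianPcm.length / SOURCE.getFrameSize());
        try (AudioInputStream converted =
                     AudioSystem.getAudioInputStream(TARGET, source)) {
            return converted.readAllBytes();
        }
    }
}
```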
I really hope you guys can help me out. Thank you!