Is there an example, like in Kaldi, of implementing online decoding, or of loading audio from memory or an IO stream instead of from disk? If not, do you have any advice for loading audio from memory or an IO stream with the K2 framework?
The pretrained.py example uses torchaudio.load to read audio files from disk. From my own digging, I don't think torchaudio supports reading audio from memory or an IO stream.
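As a workaround sketch (not part of icefall, and assuming plain 16-bit PCM mono WAV), the standard-library `wave` module can parse WAV data already held in memory via `io.BytesIO`, so no temporary file is needed:

```python
import io
import struct
import wave

def wav_bytes_to_samples(wav_bytes: bytes):
    """Decode 16-bit PCM WAV bytes (e.g. received over a socket)
    into a list of floats in [-1, 1], plus the sample rate."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as wf:
        assert wf.getsampwidth() == 2, "this sketch expects 16-bit PCM"
        rate = wf.getframerate()
        raw = wf.readframes(wf.getnframes())
    n = len(raw) // 2
    ints = struct.unpack("<%dh" % n, raw)  # little-endian signed 16-bit
    return [s / 32768.0 for s in ints], rate

# Build a tiny 4-sample WAV entirely in memory to demonstrate the round trip.
buf = io.BytesIO()
with wave.open(buf, "wb") as wf:
    wf.setnchannels(1)
    wf.setsampwidth(2)
    wf.setframerate(16000)
    wf.writeframes(struct.pack("<4h", 0, 16384, -16384, 32767))

samples, rate = wav_bytes_to_samples(buf.getvalue())
```

The resulting float list could then be wrapped as `torch.tensor(samples).unsqueeze(0)` to roughly match the `(channels, time)` shape that torchaudio.load returns, though that equivalence is an assumption to verify against your model's expected input.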
Background:
I have built a working K2-based backend that, upon a trigger signal, reads in Mandarin audio files and replies with transcripts, using the model and example from the AISHELL conformer-ctc recipe. Currently, my frontend still has to save the audio to a .wav file. I am exploring having my frontend pass the raw WAV audio directly to my K2-based backend over a TCP socket connection.
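Since TCP is a byte stream with no message boundaries, the frontend needs some framing so the backend knows where one utterance's WAV bytes end. A minimal sketch (the function names here are hypothetical, not from any existing API) is a 4-byte length prefix before each payload:

```python
import socket
import struct

def send_wav(sock: socket.socket, wav_bytes: bytes) -> None:
    """Frontend side: prefix the WAV payload with its 4-byte
    little-endian length so the backend knows where it ends."""
    sock.sendall(struct.pack("<I", len(wav_bytes)) + wav_bytes)

def _recv_exact(sock: socket.socket, n: int) -> bytes:
    """Read exactly n bytes, looping because recv may return less."""
    chunks = []
    while n > 0:
        chunk = sock.recv(n)
        if not chunk:
            raise ConnectionError("socket closed mid-message")
        chunks.append(chunk)
        n -= len(chunk)
    return b"".join(chunks)

def recv_wav(sock: socket.socket) -> bytes:
    """Backend side: read the length header, then that many bytes."""
    (length,) = struct.unpack("<I", _recv_exact(sock, 4))
    return _recv_exact(sock, length)

# Demonstrate with an in-process socket pair standing in for the
# real frontend/backend TCP connection.
a, b = socket.socketpair()
payload = b"RIFF...fake wav payload..."
send_wav(a, payload)
received = recv_wav(b)
a.close(); b.close()
```

The received bytes could then be handed to an in-memory WAV decoder instead of being written to disk first.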
We are not going to work on that for a few months, I think. Right now we are focusing on the core of the online-decoding problem, particularly RNN-T, which is inherently easier to adapt to online decoding than a transformer decoder. After that is done, we will consider productization aspects. But I'm not sure when we will open-source the part responsible for ingesting the wav file. (For now that's not an issue, since we haven't written it.)
I see. I remember it being mentioned somewhere that online decoding would not be icefall's priority for a few months, but I couldn't find the exact statement to check whether those "few months" had passed.
Thank you also for being frank about the uncertainty around open-source support for online decoding. I'll keep watching this GitHub repo for further news and decisions.