Support for Online Decoding #173

Closed
teowenshen opened this issue Jan 8, 2022 · 2 comments

Comments

@teowenshen
Contributor

Is there an example, like Kaldi's, of implementing online decoding, or of loading audio from memory or an IO stream instead of from disk? If not, do you have any advice for loading audio from memory or an IO stream with the K2 framework?

The pretrained.py example uses the torchaudio.load function to read audio files from disk. From my own digging, I think torchaudio does not support reading audio from memory or from an IO stream.
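For reference, here is a minimal sketch of decoding WAV data that is already in memory, using only the Python standard library rather than torchaudio. It assumes 16-bit PCM input; the names `wav_bytes_to_samples` and `make_test_wav` are hypothetical helpers written for this illustration and are not part of K2, Icefall, or torchaudio.

```python
import io
import struct
import sys
import wave
from array import array


def wav_bytes_to_samples(wav_bytes):
    """Decode 16-bit PCM WAV bytes held in memory.

    Returns (samples, sample_rate, num_channels), where samples is a list
    of floats in [-1.0, 1.0), interleaved if there are several channels.
    """
    # io.BytesIO makes the in-memory bytes look like a file to the wave module.
    with wave.open(io.BytesIO(wav_bytes), "rb") as reader:
        if reader.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM")
        sample_rate = reader.getframerate()
        num_channels = reader.getnchannels()
        frames = reader.readframes(reader.getnframes())

    pcm = array("h")  # signed 16-bit integers
    pcm.frombytes(frames)
    if sys.byteorder == "big":
        pcm.byteswap()  # WAV sample data is little-endian
    samples = [value / 32768.0 for value in pcm]
    return samples, sample_rate, num_channels


def make_test_wav(pcm_values, sample_rate=16000):
    """Build a mono 16-bit WAV file in memory (hypothetical test helper)."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as writer:
        writer.setnchannels(1)
        writer.setsampwidth(2)
        writer.setframerate(sample_rate)
        writer.writeframes(struct.pack("<%dh" % len(pcm_values), *pcm_values))
    return buffer.getvalue()


samples, sample_rate, num_channels = wav_bytes_to_samples(
    make_test_wav([0, 16384, -32768])
)
```

The resulting list could then be wrapped with `torch.tensor(samples)` before feature extraction. It may also be worth checking the torchaudio documentation for the installed version, since whether `torchaudio.load` accepts file-like objects depends on the version and backend.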

Background:
I have created a working K2-based backend that reads Mandarin audio files when it receives a trigger signal and returns the transcripts, using the model and example from the AISHELL conformer-ctc recipe. Currently, my front-end still needs to save each recording as a .wav file. I am exploring whether the front-end could instead pass the raw WAV audio directly to the K2-based backend over a TCP socket connection.
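One way to pass audio bytes over a TCP socket without touching the disk is sketched below. It assumes a simple framing convention, a 4-byte big-endian length followed by the payload, which is an assumption made for this illustration and not something K2 or Icefall defines. The function names are hypothetical.

```python
import socket
import struct

# Assumed framing: 4-byte unsigned big-endian payload length, then the payload.
HEADER = struct.Struct("!I")


def send_audio(sock, payload):
    """Send one length-prefixed message containing the raw audio bytes."""
    sock.sendall(HEADER.pack(len(payload)) + payload)


def recv_exactly(sock, num_bytes):
    """Read exactly num_bytes from the socket, looping over partial reads."""
    chunks = []
    remaining = num_bytes
    while remaining > 0:
        chunk = sock.recv(min(remaining, 65536))
        if not chunk:
            raise ConnectionError("peer closed before the full message arrived")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def recv_audio(sock):
    """Receive one length-prefixed message and return its payload."""
    (length,) = HEADER.unpack(recv_exactly(sock, HEADER.size))
    return recv_exactly(sock, length)


# Local demonstration with a connected socket pair standing in for the
# front-end and backend ends of a TCP connection.
frontend, backend = socket.socketpair()
payload = b"RIFF" + bytes(range(256)) * 4
send_audio(frontend, payload)
received = recv_audio(backend)
frontend.close()
backend.close()
```

The length prefix matters because TCP is a byte stream: a single `recv` call may return only part of a recording, so the receiver needs to know how many bytes make up one message. The received bytes can be decoded in memory on the backend side instead of being written to a .wav file first.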

@danpovey
Collaborator

danpovey commented Jan 8, 2022

We are not going to work on that for a few months, I think. Right now we are focusing on the core of the online-decoding problem, particularly relating to RNN-T which is inherently easier to adapt to online-decoding than a transformer decoder. After that is done we will consider productization aspects. But I'm not sure when we will open-source the part responsible for ingesting the wav file. (For now it's not an issue as we haven't done it.)

@teowenshen
Contributor Author

I see. I remember it was mentioned somewhere that online decoding will not be Icefall's priority for a few months, but couldn't find the exact statement to check if "a few months" have passed.

Thank you also for being frank about the uncertainty regarding open-source support for online decoding. I guess I will watch this GitHub repository for further news and decisions.
