support batching for kaldi compliant feature extraction functions #675

Open
saurabh-kataria opened this issue Jun 1, 2020 · 2 comments

@saurabh-kataria

🚀 Feature

A batch dimension should be supported for the Kaldi-compliant functions, for example torchaudio.compliance.kaldi.fbank.

Motivation

Computing on GPU and processing inputs in batches is essential.

@mthrok
Collaborator

mthrok commented Jun 1, 2020

Hi @saurabh-kataria

Thanks for submitting the feature request. This sounds like a reasonable request, but it requires careful design.
As a starting point, can you describe how you would feed a batched tensor?
For example, the shape of the input tensor (what each dimension represents) and how the function signature would change (if a change is required).

@echocatzh

echocatzh commented Sep 14, 2020

I also ran into this problem with compliance.kaldi.fbank. I hope torchaudio can add batch processing, for example by restricting the input to 3 dimensions, [batch, channel, samples], or by adding a batch_first option. When working on ASR or KWS, the batch size is usually very large; if only one audio clip can be processed at a time, won't efficiency suffer?
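
To make the [batch, channel, samples] proposal concrete, here is a minimal sketch of a wrapper that accepts such a tensor. It is a workaround only, not true batching: it calls a per-utterance function once per item and stacks the results. The name `batched_features` and the `feature_fn` parameter are hypothetical and not part of torchaudio, and the sketch assumes every item in the batch has the same number of samples, so that the per-item outputs can be stacked.

```python
import torch


def batched_features(waveforms, feature_fn, **kwargs):
    """Apply a per-utterance feature function to a [batch, channel, samples] tensor.

    `feature_fn` is assumed to take a (channel, samples) tensor and return a
    (frames, feature_dim) tensor, which is the convention that
    torchaudio.compliance.kaldi.fbank follows.
    Returns a tensor of shape [batch, frames, feature_dim].
    """
    if waveforms.dim() != 3:
        raise ValueError(
            "expected input of shape [batch, channel, samples], "
            f"got {waveforms.dim()} dimension(s)"
        )
    # Loop over the batch dimension; each item has shape (channel, samples).
    feats = [feature_fn(w, **kwargs) for w in waveforms]
    # Stacking requires every item to yield the same number of frames,
    # which holds when all waveforms have the same length.
    return torch.stack(feats, dim=0)
```

With torchaudio installed, a call might look like `batched_features(batch, torchaudio.compliance.kaldi.fbank, num_mel_bins=40)`. Because the loop still processes one utterance at a time, this gives the desired interface but not the speedup a natively batched implementation would offer.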
