
Add modified beam search decoding for streaming inference with emformer model #327

Merged
235 commits merged into k2-fsa:streaming on Apr 22, 2022

Conversation

yaozengwei
Collaborator

This PR continues from the closed one, #321. I merged the k2-fsa/master branch into my current branch.
It adds modified beam search decoding for streaming inference with the Emformer model, in the transducer_emformer recipe.
The class FeatureExtractionStream in transducer_emformer/streaming_feature_extractor.py uses different attributes depending on the decoding method.
The function decoding_result() in class FeatureExtractionStream returns the current decoding result for that stream.
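The idea of keeping method-specific decoding state on each stream can be sketched roughly as follows. This is an illustrative outline only, not the real icefall API: the attribute names, the `context_size` parameter, and the hypothesis representation are assumptions, and the actual class also holds the feature extractor and Emformer memory states.

```python
from typing import List, Tuple


class FeatureExtractionStream:
    """Hypothetical sketch of per-stream decoding state.

    Greedy search keeps a single running hypothesis; modified beam
    search keeps a list of scored hypotheses per stream.
    """

    def __init__(self, decoding_method: str, context_size: int = 2):
        self.decoding_method = decoding_method
        self.context_size = context_size
        if decoding_method == "greedy_search":
            # One hypothesis: token IDs emitted so far, blank-padded
            # at the front to seed the decoder context.
            self.hyp: List[int] = [0] * context_size
        elif decoding_method == "modified_beam_search":
            # A list of (token_ids, log_prob) hypotheses.
            self.hyps: List[Tuple[List[int], float]] = [
                ([0] * context_size, 0.0)
            ]
        else:
            raise ValueError(f"Unsupported method: {decoding_method}")

    def decoding_result(self) -> List[int]:
        """Return the current best token sequence, with the
        blank/context padding stripped from the front."""
        if self.decoding_method == "greedy_search":
            return self.hyp[self.context_size:]
        # For modified beam search, pick the highest-scoring hypothesis.
        best = max(self.hyps, key=lambda h: h[1])
        return best[0][self.context_size:]
```

With this shape, the streaming loop only ever calls `decoding_result()`; the method-specific state stays encapsulated in the stream object.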

pkufool and others added 30 commits February 6, 2022 18:22
danpovey and others added 24 commits April 11, 2022 21:00
Update results; will further update this before merge
Add results for mixed precision with max-duration 300
* initial commit

* support download, data prep, and fbank

* on-the-fly feature extraction by default

* support BPE based lang

* support HLG for BPE

* small fix

* small fix

* chunked feature extraction by default

* Compute features for GigaSpeech by splitting the manifest.

* Fixes after review.

* Split manifests into 2000 pieces.

* set audio duration mismatch tolerance to 0.01

* small fix

* add conformer training recipe

* Add conformer.py without pre-commit checking

* lazy loading and use SingleCutSampler

* DynamicBucketingSampler

* use KaldifeatFbank to compute fbank for musan

* use pretrained language model and lexicon

* use 3gram to decode, 4gram to rescore

* Add decode.py

* Update .flake8

* Delete compute_fbank_gigaspeech.py

* Use BucketingSampler for valid and test dataloader

* Update params in train.py

* Use bpe_500

* update params in decode.py

* Decrease num_paths while CUDA OOM

* Added README

* Update RESULTS

* black

* Decrease num_paths while CUDA OOM

* Decode with post-processing

* Update results

* Remove lazy_load option

* Use default `storage_type`

* Keep the original tolerance

* Use split-lazy

* black

* Update pretrained model

Co-authored-by: Fangjun Kuang <[email protected]>
* Add LG decoding

* Add log weight pushing

* Minor fixes
@yaozengwei yaozengwei closed this Apr 21, 2022
@yaozengwei yaozengwei reopened this Apr 22, 2022
@csukuangfj csukuangfj merged commit b3e6bf6 into k2-fsa:streaming Apr 22, 2022