Add streaming modified beam search #142

ezerhouni · 2022-09-29T12:37:02Z

This PR adds the following:

Modified beam search for the streaming transducer

ezerhouni · 2022-09-29T12:37:44Z

@csukuangfj I need to test the branch (and add stuff to the CI job)

csukuangfj · 2022-09-29T12:48:12Z

@ezerhouni
Thanks!

Would you mind also picking up the following PR?

#119

The core part is almost finished.

csukuangfj · 2022-09-29T12:49:47Z

Could you also add modified_beam_search to
https://github.com/k2-fsa/sherpa/blob/master/sherpa/bin/conv_emformer_transducer_stateless2/beam_search.py

It can be added in a separate PR.

ezerhouni · 2022-09-29T13:06:03Z

@csukuangfj Added streaming modified beam search to conv_emformer. I will have a look at the other PR tomorrow. If I haven't pushed anything, feel free to ping me.

csukuangfj · 2022-09-29T13:10:00Z

> @csukuangfj Added streaming modified beam search to conv_emformer. I will have a look at the other PR tomorrow. If I haven't pushed anything, feel free to ping me.

Thanks a lot! Could you also update the CI by changing the following lines?

You only need to add "modified_beam_search" to the list.

sherpa/.github/workflows/run-streaming-conformer-test.yaml

Line 55 in c235538

    decoding: ["greedy_search", "fast_beam_search", "fast_beam_search_nbest", "fast_beam_search_nbest_LG"]

sherpa/.github/workflows/run-streaming-conv-emformer-test.yaml

Line 55 in c235538

    decoding: ["greedy_search", "fast_beam_search", "fast_beam_search_nbest", "fast_beam_search_nbest_LG"]

sherpa/.github/workflows/run-wenetspeech-streaming-conformer-rnnt-test.yaml

Line 55 in c235538

    decoding: ["greedy_search", "fast_beam_search", "fast_beam_search_nbest", "fast_beam_search_nbest_LG"]

ezerhouni · 2022-09-29T13:13:21Z

@csukuangfj Yup, first I will debug the code locally and then update the CI :)
I will let you know!

ezerhouni · 2022-09-29T13:39:31Z

@csukuangfj When testing on CPU I am getting:
OSError: libtorch_hip.so: cannot open shared object file: No such file or directory

Do you know if modified beam search needs to be on GPU?

csukuangfj · 2022-09-29T13:42:26Z

> @csukuangfj When testing on CPU I am getting: OSError: libtorch_hip.so: cannot open shared object file: No such file or directory
>
> Do you know if modified beam search needs to be on GPU?

modified_beam_search is able to run on both CPU and GPU.

How did you install your PyTorch?

ezerhouni · 2022-09-29T13:43:57Z

@csukuangfj Never mind, wrong move on my side!

ezerhouni · 2022-09-29T14:38:32Z

@csukuangfj I tested the code for streaming_transducer but not yet for conv_emformer. I get the following error when using the models from: https://huggingface.co/Zengwei/icefall-asr-librispeech-conv-emformer-transducer-stateless2-2022-07-05

  File "./sherpa/bin/conv_emformer_transducer_stateless2/streaming_server.py", line 239, in __init__
    self.model = RnntConvEmformerModel(nn_model_filename, device=device)
RuntimeError: Unrecognized data format
Exception raised from load at ../torch/csrc/jit/serialization/import.cpp:449 (most recent call first):
frame #0: c10::Error::Error(c10::SourceLocation, std::string) + 0x3e (0x7fa3e2f3abbe in /usr/local/lib/python3.7/site-packages/torch/lib/libc10.so)
frame #1: c10::detail::torchCheckFail(char const*, char const*, unsigned int, char const*) + 0x60 (0x7fa3e2f15ef9 in /usr/local/lib/python3.7/site-packages/torch/lib/libc10.so)
frame #2: torch::jit::load(std::string const&, c10::optional<c10::Device>, std::unordered_map<std::string, std::string, std::hash<std::string>, std::equal_to<std::string>, std::allocator<std::pair<std::string const, std::string> > >&) + 0x27a (0x7fa3326e071a in /usr/local/lib/python3.7/site-packages/torch/lib/libtorch_cpu.so)

csukuangfj · 2022-09-29T14:48:26Z

What is the command you are using for testing?

ezerhouni · 2022-09-29T15:25:01Z

  ./sherpa/bin/conv_emformer_transducer_stateless2/streaming_server.py \
  --port 6006 \
  --max-batch-size 50 \
  --max-wait-ms 5 \
  --max-active-connections 500 \
  --nn-pool-size 1 \
  --decoding-method "fast_beam_search" \
  --nn-model-filename /path/to/cpu-jit-epoch-30-avg-10-torch-1.10.0.pt \
  --bpe-model-filename /path/to/bpe.model

csukuangfj · 2022-09-29T15:37:17Z

What is the output of

ls -lh /path/to/cpu-jit-epoch-30-avg-10-torch-1.10.0.pt

Just want to make sure that you have downloaded the pretrained model using git lfs.

Also, are you using PyTorch >= 1.10.0?

ezerhouni · 2022-09-29T15:54:28Z

-rw-r--r-- 1 root root 134 Sep 29 14:27

Yes, I have torch 1.12.
I will dig more tomorrow, most likely a bug on my side tbh.

csukuangfj · 2022-09-29T15:58:55Z

> -rw-r--r-- 1 root root 134 Sep 29 14:27

The filesize of the model is only 134 bytes, which is too small.

I think you didn't use

git lfs install
git clone xxxx

You need to run git lfs install before cloning, because the pretrained model is managed by Git LFS. Without it, the clone contains only a small pointer file instead of the real model.
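
A quick way to catch this before starting the server is sketched below. This is a minimal sketch and not part of sherpa: the helper name is_git_lfs_pointer is hypothetical, and the 1024-byte cutoff is a heuristic based on Git LFS pointer files being tiny text files. It relies only on the fact that a pointer file starts with the line "version https://git-lfs.github.com/spec/v1".

```python
from pathlib import Path

# First line of a Git LFS pointer file.
LFS_POINTER_PREFIX = b"version https://git-lfs.github.com/spec/v1"

# Heuristic cutoff (an assumption of this sketch): pointer files are tiny,
# while a real TorchScript model is many megabytes.
MAX_POINTER_SIZE = 1024


def is_git_lfs_pointer(path: str) -> bool:
    """Return True if `path` looks like a Git LFS pointer, not real data."""
    file = Path(path)
    if file.stat().st_size > MAX_POINTER_SIZE:
        return False
    with file.open("rb") as f:
        # Compare only the leading bytes; the rest of the pointer holds
        # the object id and the size of the real file.
        return f.read(len(LFS_POINTER_PREFIX)) == LFS_POINTER_PREFIX
```

If the 134-byte file above is an LFS pointer, as its size suggests, this helper returns True for it. In that case torch.jit.load is being handed a short text file, which would explain the "Unrecognized data format" error.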

ezerhouni · 2022-09-29T16:43:47Z

@csukuangfj Yup, it is working now. I tested and fixed the issue, so everything should be good to go.

csukuangfj · 2022-09-30T03:46:12Z

@ezerhouni
Thanks! Merging.

Add streaming modified beam search

27ec4b7

ezerhouni requested a review from csukuangfj September 29, 2022 12:37

ezerhouni changed the title ~~Add streaming modified beam search~~ [WIP] Add streaming modified beam search Sep 29, 2022

Add streaming modified beam search for conv_emformer

2b42407

Fix issue streaming modified beam search

1096195

ezerhouni changed the title ~~[WIP] Add streaming modified beam search~~ Add streaming modified beam search Sep 29, 2022

Fix issue modified beam search conv emformer

fa47323

Add get_texts in conv_emformer

d7e1b09

csukuangfj added the ready label Sep 29, 2022

csukuangfj approved these changes Sep 30, 2022

csukuangfj merged commit b163428 into k2-fsa:master Sep 30, 2022
