Add C++ support for streaming NeMo CTC models. #857

csukuangfj · 2024-05-10T08:18:22Z

Following #843

tempops · 2024-05-15T01:41:25Z

Hello, thank you for the speedy response and the model export support! I tried online-nemo-ctc-decode-files.py with the 480ms model, but the output isn't generated in real time. Since the model is online, I assumed the text would be printed as it is decoded.

I also noticed a few errors. Is it due to the model itself? And is the 1040ms model better, or the 80ms model?

I also wanted to know whether a streaming transducer can be used with the streaming_server.py file, since it has a separate decoder and joiner. I tried using it but got an error:

/Users/runner/work/sherpa-onnx/sherpa-onnx/sherpa-onnx/csrc/online-transducer-model.cc:GetModelType:75 Unsupported model_type: EncDecHybridRNNTCTCBPEModel
/Users/runner/work/sherpa-onnx/sherpa-onnx/sherpa-onnx/csrc/online-transducer-model.cc:Create:116 Unknown model type in online transducer!
zsh: segmentation fault python3 speech-recognition-from-microphone.py

csukuangfj · 2024-05-15T01:44:48Z

I tried the online-nemo-ctc-decode-files.py with the 480ms model but the response isn't generated real-time

It is decoding files. What do you mean by real-time?

I also noticed a few errors, is it due to the model itself?

Could you tell us what errors you have noticed?

and is the 1040ms model better or 80ms model better

What do you mean by better?

I also wanted to know if streaming transducer can be used with the streaming_server.py file,

It has not been implemented yet. Will support it this week.

tempops · 2024-05-15T03:27:33Z

Hello,
Thank you for the quick response.

It is decoding files. What do you mean by real-time?

By real-time I mean streaming text output. When I ran online-nemo-ctc-decode-files.py, it only printed the output at the end of transcription.

Could you tell us what errors you have noticed?

By errors I mean incorrectly transcribed words, misspellings, and repeated letters, which can lead to a high WER.

What do you mean by better?

When I looked at the NeMo documentation about FastConformer (here: link), I saw that 80, 480, and 1040ms are the cache-aware windows the model uses to decode (here: link). Given this, I think a larger cache might lead to better transcriptions, but I am not sure, so I wanted to ask you.

It has not been implemented yet. Will support it this week.

Thank you! Looking forward to using it!

csukuangfj · 2024-05-15T03:33:45Z

When I ran the online-nemo-ctc-decode-files.py it only printed the output at end of transcription

That is expected. We are decoding a file and it gives you the result once the file is decoded.

Please refer to our microphone examples. You can change them to support NeMo streaming CTC models, and then you will see real-time output as you speak.
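
To illustrate the difference between decoding a whole file and seeing text as audio arrives, here is a minimal sketch of the general pattern: feed the audio in small chunks and report the partial result after each chunk. Note that `ToyRecognizer`, `iter_chunks`, and `decode_streaming` are hypothetical names made up for this illustration and are not part of sherpa-onnx. The toy recognizer simply emits one placeholder word per second of audio so that the loop can run without a model.

```python
from typing import Callable, Iterator, Sequence


def iter_chunks(
    samples: Sequence[float], sample_rate: int, chunk_ms: int
) -> Iterator[Sequence[float]]:
    """Yield consecutive slices of `samples`, each covering `chunk_ms` of audio."""
    chunk_size = max(1, sample_rate * chunk_ms // 1000)
    for start in range(0, len(samples), chunk_size):
        yield samples[start : start + chunk_size]


class ToyRecognizer:
    """Hypothetical stand-in for a streaming recognizer (NOT the sherpa-onnx API).

    It "recognizes" one placeholder word per full second of audio received,
    which is enough to show when partial results become available.
    """

    def __init__(self, sample_rate: int) -> None:
        self.sample_rate = sample_rate
        self.num_samples = 0

    def accept_waveform(self, chunk: Sequence[float]) -> None:
        # A real recognizer would run the model here; we only count samples.
        self.num_samples += len(chunk)

    def partial_result(self) -> str:
        seconds = self.num_samples // self.sample_rate
        return " ".join(f"word{i}" for i in range(1, seconds + 1))


def decode_streaming(
    samples: Sequence[float],
    sample_rate: int,
    chunk_ms: int,
    on_partial: Callable[[str], None],
) -> str:
    """Feed audio chunk by chunk and call `on_partial` whenever the text changes."""
    recognizer = ToyRecognizer(sample_rate)
    last = ""
    for chunk in iter_chunks(samples, sample_rate, chunk_ms):
        recognizer.accept_waveform(chunk)
        text = recognizer.partial_result()
        if text != last:
            on_partial(text)
            last = text
    return last


if __name__ == "__main__":
    sample_rate = 16000
    audio = [0.0] * (3 * sample_rate)  # three seconds of silence as dummy input
    decode_streaming(audio, sample_rate, chunk_ms=100, on_partial=print)
```

A file-decoding script can hand over all samples at once and print only the final text, which matches the behavior described above. With a microphone, the chunks arrive over time, so the callback fires while you are still speaking.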

I also noticed a few errors, is it due to the model itself?

Yes, I think so.

and is the 1040ms model better or 80ms model better

In terms of accuracy, I think 1040ms is better.

In terms of latency, I think 80ms is better.

By the way, you can try the Android APK for NeMo streaming CTC models at

https://k2-fsa.github.io/sherpa/onnx/android/apk.html

APKs for the non-streaming NeMo CTC models can be found at
https://k2-fsa.github.io/sherpa/onnx/vad/apk-asr.html

csukuangfj added 5 commits May 10, 2024 13:29

Begin to add C++ support for streaming NeMo CTC models.

8eb2735

Finish the C++ code

de2a2e2

Merge remote-tracking branch 'dan/master' into nemo-streaming-ctc

ab2643c

fix style issues

c1af6ef

add missing files

a40e2c2

csukuangfj merged commit 46e4e5b into k2-fsa:master May 10, 2024
179 of 199 checks passed

csukuangfj deleted the nemo-streaming-ctc branch May 10, 2024 08:26

csukuangfj mentioned this pull request May 10, 2024

Export NeMo FastConformer Hybrid Transducer Large Streaming to ONNX #844

Merged
