Support multilingual whisper models #274

Merged
merged 8 commits into from
Aug 15, 2023


csukuangfj
Collaborator

Note you can use

--whisper-language

to specify the spoken language in the input audio file or leave it empty to let the code infer the language from the input file

and

--whisper-task=transcribe or --whisper-task=translate

to choose between transcription and translation.
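
To make the behaviour of the two flags concrete, here is a minimal sketch in Python. It only mirrors the flags described above with a hypothetical argparse parser; it is not the project's own option parser, and the default of transcribe for --whisper-task is an assumption made for this illustration.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical parser that only mirrors the two flags described above;
    # it is not the parser used by the project itself.
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--whisper-language",
        default="",
        help="Spoken language of the input audio; empty means infer it.",
    )
    parser.add_argument(
        "--whisper-task",
        default="transcribe",  # assumed default, for illustration only
        choices=["transcribe", "translate"],
    )
    return parser


def describe(args: argparse.Namespace) -> str:
    # An empty language string stands for "let the code infer the language".
    language = args.whisper_language or "auto-detect"
    return f"task={args.whisper_task}, language={language}"


explicit = build_parser().parse_args(
    ["--whisper-language=de", "--whisper-task=translate"]
)
print(describe(explicit))
print(describe(build_parser().parse_args([])))
```

Leaving out --whisper-language keeps the empty string, which is the case where the language is inferred from the input file.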

@csukuangfj csukuangfj merged commit f709c95 into k2-fsa:master Aug 15, 2023
132 of 142 checks passed
@csukuangfj csukuangfj deleted the whisper-multilingual branch August 15, 2023 16:29
@csukuangfj
Collaborator Author

FYI:
There is a Hugging Face space that demonstrates Next-gen Kaldi + Whisper models for speech recognition. Please see
https://huggingface.co/spaces/k2-fsa/automatic-speech-recognition-with-whisper

@pukatana

Can --whisper-language be used on Android ASR?

@csukuangfj
Collaborator Author

You can use it in the code. It is not exposed to users via the UI.

@pukatana

Hi @csukuangfj,
Where can I use that? I couldn't find the code in the JNI or Android sources.
Could you give me some tips?

@csukuangfj
Collaborator Author

You can add the language option in the same way that the encoder and decoder options are added.

@pukatana

@csukuangfj, thanks for your kind reply.
I found the options, but I ran into some problems building the JNI.
If you have time, please update the JNI libs with the language options.

@csukuangfj
Collaborator Author

Please show the error logs for your problem.

@pukatana

pukatana commented Dec 23, 2023

I'm trying to build the project behind my company's proxy, so I get a simple error when building the project in Android Studio:
error: downloading 'https://github.com/kkm000/openfst/archive/refs/tags/win/1.6.5.1.tar.gz' failed
I'd be grateful if you could provide the updated libs.

@pukatana

When I tried using the downloaded files, I got this error:

FAILED: openfst-populate-prefix/src/openfst-populate-stamp/openfst-populate-patch D:/shepra/sherpa-onnx-master/android/SherpaOnnxVadAsr/app/.cxx/Debug/2g631h5d/x86/_deps/openfst-subbuild/openfst-populate-prefix/src/openfst-populate-stamp/openfst-populate-patch
cmd.exe /C "cd /D D:\shepra\sherpa-onnx-master\android\SherpaOnnxVadAsr\app\.cxx\Debug\2g631h5d\x86\_deps\openfst-src && sed -i.bak s/enable_testing()//g src/CMakeLists.txt && sed -i.bak s/add_subdirectory(test)//g src/CMakeLists.txt && sed -i.bak /message/d src/script/CMakeLists.txt && D:\android\sdk\cmake\3.22.1\bin\cmake.exe -E touch D:/shepra/sherpa-onnx-master/android/SherpaOnnxVadAsr/app/.cxx/Debug/2g631h5d/x86/_deps/openfst-subbuild/openfst-populate-prefix/src/openfst-populate-stamp/openfst-populate-patch"
'sed' is not recognized as an internal or external command,
operable program or batch file.
ninja: build stopped: subcommand failed.

@csukuangfj
Collaborator Author

Our doc is for Linux and macOS.

If you really want to use Windows, please refer to the following colab notebook
https://github.com/k2-fsa/colab/blob/master/sherpa-onnx/build_sherpa_onnx_for_android.ipynb
to generate the required libraries and then use them in Android Studio.

@pukatana

Should I build the .so files on Linux, without using Android Studio?

@csukuangfj
Collaborator Author

csukuangfj commented Dec 24, 2023

Yes, you are right.

But Android Studio is needed if you need to build APKs.

@pukatana

pukatana commented Dec 24, 2023

OK, I see.
I only found the language option when initializing the model config.
Is it possible to use the --language option when decoding with the multilingual whisper model?

@csukuangfj
Collaborator Author

Yes, it is possible. If you pass --whisper-language="", or omit --whisper-language entirely so that its default value is used, the language in the audio is detected automatically.
If you want to specify the language at decoding time, you have to change the code. Fortunately, it is just a tiny change.

The related code is at

Instead of reading the language from the config, you can pass it as a function argument.
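
The idea can be sketched as follows. This is a minimal illustration in Python, not the project's actual code: WhisperConfig, resolve_language, and fake_detector are hypothetical names invented here, and the real change would be made in the project's own source. The sketch assumes that a per-call argument should override the config value and that an empty string still means "detect the language automatically".

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class WhisperConfig:
    # Hypothetical stand-in for the model config; only the language matters here.
    language: str = ""


def resolve_language(
    config: WhisperConfig,
    detect: Callable[[], str],
    language: Optional[str] = None,
) -> str:
    """Pick the language to use for one decoding call."""
    # A per-call argument, when given, wins over the value stored in the config.
    chosen = config.language if language is None else language
    # An empty string means "not specified": fall back to detection.
    return chosen if chosen else detect()


def fake_detector() -> str:
    # Placeholder for the model's language detection on the input audio.
    return "en"


config = WhisperConfig(language="")
print(resolve_language(config, fake_detector))
print(resolve_language(config, fake_detector, language="ja"))
```

With this shape, existing callers that pass no language keep the config-driven behaviour, while a caller that knows the language of a particular audio file can supply it for that call only.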
