Releases: k2-fsa/sherpa-onnx
Release v1.9.25
What's Changed
- Add node-addon-api for VAD by @csukuangfj in #864
- Fix node addon tests by @csukuangfj in #865
- Add Android APKs for NeMo CTC models. by @csukuangfj in #866
- Add streaming CTC ASR APIs for node-addon-api by @csukuangfj in #867
- Add non-streaming ASR APIs for node-addon-api by @csukuangfj in #868
- Compiler Error and Minor Bug fix by @manickavela29 in #870
- Add TTS for node-addon-api by @csukuangfj in #871
- Add spoken language identification for node-addon-api by @csukuangfj in #872
- Refactor node-addon-api to remove duplicate. by @csukuangfj in #873
- Add speaker identification APIs for node-addon-api by @csukuangfj in #874
- Add audio tagging APIs for node-addon-api by @csukuangfj in #875
- Support adding punctuations to text for node-addon-api by @csukuangfj in #876
- Add keyword spotting API for node-addon-api by @csukuangfj in #877
- Fix sherpa-onnx-node-version in node examples by @csukuangfj in #879
- Update CMakeLists.txt by @linziguan in #881
- Fix Java API examples by @csukuangfj in #883
- Fix a typo in jni by @csukuangfj in #885
- Add tail_paddings to Whisper C API. by @csukuangfj in #886
New Contributors
- @linziguan made their first contribution in #881
Full Changelog: v1.9.24...v1.9.25
Release v1.9.24
What's Changed
- Add CTC HLG decoding for JNI by @csukuangfj in #810
- Add function 'tolowerUnicode' in sherpa-onnx-microphone (fix #791) by @daniel-dona in #812
- Add Java API for text-to-speech by @csukuangfj in #811
- Adding temperature scaling on Joiner logits by @KarelVesely84 in #789
- Fix building wheels for macOS by @csukuangfj in #814
- Fix C# to support Chinese tts models using jieba by @csukuangfj in #815
- Fix a bug for offline paraformer by @csukuangfj in #816
- Add Java API for spoken language identification with whisper multilingual models by @csukuangfj in #817
- Add Java and Kotlin API for punctuation models by @csukuangfj in #818
- Add Java API for audio tagging by @csukuangfj in #820
- Add Java API for speaker identification by @csukuangfj in #822
- Fix typos in JNI TTS by @csukuangfj in #824
- Begin to add node-addon-api for sherpa-onnx by @csukuangfj in #826
- Publish node-addon-api wrapper for sherpa-onnx as npm packages by @csukuangfj in #829
- Update 3dspeaker/export-onnx.py by @chiiyeh in #836
- Upload two more 3d-speaker models by @csukuangfj in #837
- Publish npm package with node-addon-api for Windows by @csukuangfj in #838
- Add links to pre-built APKs and pre-trained models to README. by @csukuangfj in #840
- Publish node-addon-api npm package for linux arm64 by @csukuangfj in #841
- Export NeMo FastConformer Hybrid Transducer-CTC Large Streaming to ONNX. by @csukuangfj in #843
- Export NeMo FastConformer Hybrid Transducer Large Streaming to ONNX by @csukuangfj in #844
- Export non-streaming NeMo FastConformer hybrid transducer and CTC to sherpa-onnx by @csukuangfj in #847
- Add C++ support for non-streaming NeMo FastConformer hybrid transducer CTC (the CTC branch) by @csukuangfj in #848
- Add C++ runtime for non-streaming FastConformer transducer from NeMo. by @csukuangfj in #854
- Solve the issue of missing the last sentence with punctuation by @yh646492956 in #856
- Add C++ support for streaming NeMo CTC models. by @csukuangfj in #857
- Add more streaming ASR methods for node-addon-api by @csukuangfj in #860
- Fix Python TTS examples for models using jieba. by @csukuangfj in #861
- Add Speaker ID demo for C# by @csukuangfj in #862
New Contributors
- @daniel-dona made their first contribution in #812
- @yh646492956 made their first contribution in #856
Full Changelog: v1.9.23...v1.9.24
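Note on #789 (temperature scaling on Joiner logits): the general technique divides the logits by a temperature T before the softmax, so T > 1 flattens the distribution and T < 1 sharpens it. The sketch below illustrates only that generic idea in plain Python. It is not taken from the sherpa-onnx source, and the function name `softmax_with_temperature` and the sample logits are hypothetical.

```python
import math


def softmax_with_temperature(logits, temperature=1.0):
    """Return softmax(logits / temperature) as a list of probabilities.

    Hypothetical helper for illustration; not a sherpa-onnx API.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Made-up logits standing in for one joiner output frame.
logits = [2.0, 1.0, 0.1]
sharp = softmax_with_temperature(logits, temperature=1.0)
flat = softmax_with_temperature(logits, temperature=2.0)
```

With a higher temperature the top probability shrinks while the ranking of the tokens stays the same, which is why this kind of scaling is commonly used to soften over-confident scores.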
Release v1.9.23
What's Changed
- Fix a typo in building language ID APK by @csukuangfj in #795
- Add jieba for Chinese TTS models by @csukuangfj in #797
- Increase CED's max frame length to 3000 by @csukuangfj in #798
- Fix the last character not being recognized for streaming paraformer … by @csukuangfj in #799
- Refactor TTS Android code to support jieba for Chinese TTS models by @csukuangfj in #800
- Resume interrupted wget downloads by @bubao in #801
- Refactor the JNI interface to make it more modular and maintainable by @csukuangfj in #802
- Fix CI tests by @csukuangfj in #804
- Refactor Java APIs by @csukuangfj in #806
- Add Java API for non-streaming ASR by @csukuangfj in #807
- Add dict_dir arg to c api to support Chinese TTS models using jieba by @csukuangfj in #809
Full Changelog: v1.9.22...v1.9.23
v1.9.22
What's Changed
- Replace torchaudio with soundfile in python-api-examples by @gtf35 in #765
- Add C API for punctuation by @csukuangfj in #768
- Add Kotlin API for audio tagging by @csukuangfj in #770
- Adding warm up for Zipformer2 by @manickavela29 in #766
- Fix display for sherpa-onnx-microphone by @csukuangfj in #773
- Fix code style issues by @csukuangfj in #774
- Add score function to speaker identification by @chiiyeh in #775
- Add Android demo for audio tagging by @csukuangfj in #776
- Add WearOS demo for audio tagging by @csukuangfj in #777
- Add JNI support for spoken language identification by @csukuangfj in #782
- Add Android demo for spoken language identification using Whisper multilingual models by @csukuangfj in #783
- Support CED models by @csukuangfj in #792
- Add Python API example for CED audio tagging. by @csukuangfj in #793
- Release v1.9.22 by @csukuangfj in #794
Full Changelog: v1.9.19...v1.9.22
v1.9.19
v1.9.18
What's Changed
- Fix building OpenFst on Windows. by @csukuangfj in #744
- Fix go API examples with portaudio on Windows. by @csukuangfj in #746
- Support audio tagging using zipformer by @csukuangfj in #747
- Add C++ microphone examples for audio tagging by @csukuangfj in #749
- Add SHERPA_ONNX_GITHUB by @bubao in #750
- Fix a bug in mean calculation of 'ys_probs' by @aask1357 in #748
- Add Python API and Python examples for audio tagging by @csukuangfj in #753
- Add C API for audio tagging by @csukuangfj in #754
- [feature] Configurable padding length by @manickavela29 in #755
- Use batch size 1 in generating subtitles. by @csukuangfj in #756
- Fix WebAssembly for kws by @csukuangfj in #758
- Support adding punctuations to the speech recognition result by @csukuangfj in #761
- Add Python API for punctuation models. by @csukuangfj in #762
- Release v1.9.18 by @csukuangfj in #763
New Contributors
- @bubao made their first contribution in #750
- @aask1357 made their first contribution in #748
- @manickavela29 made their first contribution in #755
Full Changelog: v1.9.17...v1.9.18
punctuation-models
Use batch size 1 in generating subtitles. (#756)
audio-tagging-models
v1.9.17
What's Changed
- Support heteronyms in Chinese TTS by @csukuangfj in #738
- Add VAD examples using ALSA for recording by @csukuangfj in #739
- Fix releasing GIL by @csukuangfj in #741
- Support Chinese heteronyms on Android for TTS. by @csukuangfj in #742
Full Changelog: v1.9.16...v1.9.17
v1.9.16
What's Changed
- Fix building wasm in CI by @csukuangfj in #720
- Add more piper models for text-to-speech by @csukuangfj in #725
- Fix microphone privacy config by @yujinqiu in #727
- Add language identification swiftui demo by @yujinqiu in #729
- Add HLG decoding for streaming CTC models by @csukuangfj in #731
- Add C API for streaming HLG decoding by @csukuangfj in #734
- return timestamps for WebAssembly by @csukuangfj in #737
Full Changelog: v1.9.15...v1.9.16