Releases · k2-fsa/sherpa-onnx

22 May 06:47

901b3f0

source-separation-models

audio_example.wav was converted from
https://github.com/deezer/spleeter/blob/v1.4.0/audio_example.mp3
with the following command:

sox audio_example.mp3 audio_example.wav
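After a conversion like the one above, it can be useful to confirm the channel count and sample rate of the resulting WAV file before feeding it to a model. The following is a minimal sketch using only Python's standard `wave` module. The helper names `describe_wav` and `make_silent_wav` are hypothetical and not part of sherpa-onnx or sox, and the synthetic file only stands in for audio_example.wav so the snippet runs anywhere.

```python
import io
import wave


def describe_wav(source):
    """Return (channels, sample_rate_hz, bits_per_sample, duration_seconds).

    `source` may be a file path or a binary file-like object.
    """
    with wave.open(source, "rb") as f:
        channels = f.getnchannels()
        sample_rate = f.getframerate()
        bits = 8 * f.getsampwidth()
        duration = f.getnframes() / float(sample_rate)
    return channels, sample_rate, bits, duration


def make_silent_wav(sample_rate=44100, channels=2, seconds=1):
    """Build an in-memory 16-bit PCM WAV containing silence (demo input only)."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as f:
        f.setnchannels(channels)
        f.setsampwidth(2)  # 2 bytes per sample = 16-bit PCM
        f.setframerate(sample_rate)
        f.writeframes(b"\x00\x00" * channels * sample_rate * seconds)
    buf.seek(0)
    return buf


if __name__ == "__main__":
    # Replace make_silent_wav() with "audio_example.wav" to inspect the real file.
    print(describe_wav(make_silent_wav()))
```

Note that the `wave` module reads uncompressed PCM WAV files only; it is used here purely for inspection and does not replace the sox conversion step.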

Assets 8

15 May 08:09

github-actions

v1.12.0

02c902a

v1.12.0 Latest

What's Changed

Fix building wheels for macOS by @csukuangfj in #2192
Show verbose logs in homophone replacer by @csukuangfj in #2194
Fix displaying streaming speech recognition results for Python. by @csukuangfj in #2196
Add real-time speech recognition example for SenseVoice. by @csukuangfj in #2197
docs: add Open-XiaoAI KWS project by @idootop in #2198
Add C++ example for streaming ASR with SenseVoice. by @csukuangfj in #2199
Add C++ example for real-time ASR with nvidia/parakeet-tdt-0.6b-v2. by @csukuangfj in #2201
Add a link to YouTube video including sherpa-onnx. by @csukuangfj in #2202
Support sending is_eof for online websocket server. by @csukuangfj in #2204
Add alsa-based streaming ASR example for sense voice. by @csukuangfj in #2207
Support homophone replacer in Android asr demo. by @csukuangfj in #2210
Add a Go implementation of the TTS generation callback by @xiaokuang95 in #2213
Add Android demo for real-time ASR with non-streaming ASR models. by @csukuangfj in #2214
Expose dither for JNI by @esavin in #2215
Add nodejs example for parakeet-tdt-0.6b-v2. by @csukuangfj in #2219
Add script to build APK for simulated-streaming-asr. by @csukuangfj in #2220
Release v1.12.0 by @csukuangfj in #2221

New Contributors

@idootop made their first contribution in #2198
@esavin made their first contribution in #2215

Full Changelog: v1.11.5...v1.12.0

Contributors

esavin, csukuangfj, and 2 other contributors

Assets 85

checksum.txt: 9.09 KB, 2025-05-28T01:30:48Z
sherpa-onnx-1.12.0-rknn.aar: 19.2 MB, 2025-05-15T08:40:19Z
sherpa-onnx-1.12.0.aar: 34.9 MB, 2025-05-15T08:29:49Z
sherpa-onnx-non-streaming-asr-x64-v1.12.0.exe: 17.4 MB, 2025-05-15T08:16:20Z
sherpa-onnx-non-streaming-asr-x86-v1.12.0.exe: 14.8 MB, 2025-05-15T08:42:42Z
sherpa-onnx-non-streaming-tts-x64-v1.12.0.exe: 17.2 MB, 2025-05-15T08:16:22Z
sherpa-onnx-non-streaming-tts-x86-v1.12.0.exe: 14.6 MB, 2025-05-15T08:42:44Z
sherpa-onnx-static-link-onnxruntime-1.12.0.aar: 29.3 MB, 2025-05-15T08:50:09Z
sherpa-onnx-streaming-asr-x64-v1.12.0.exe: 17.5 MB, 2025-05-15T08:16:18Z
sherpa-onnx-streaming-asr-x86-v1.12.0.exe: 14.9 MB, 2025-05-15T08:42:40Z
Source code (zip): 2025-05-15T08:03:17Z
Source code (tar.gz): 2025-05-15T08:03:17Z

08 May 03:40

csukuangfj

v1.11.5

baec2da

v1.11.5

What's Changed

export parakeet-tdt-0.6b-v2 to sherpa-onnx by @csukuangfj in #2180
Add C++ runtime for parakeet-tdt-0.6b-v2. by @csukuangfj in #2181
Avoid NaN in feature normalization. by @csukuangfj in #2186
Release v1.11.5 by @csukuangfj in #2187

Full Changelog: v1.11.4...v1.11.5

Contributors

csukuangfj

Assets 85

01 May 03:39

csukuangfj

v1.11.4

abc4daa

v1.11.4

What's Changed

Disable strict hotword matching mode for offline transducer by @vsd-vector in #1837
Comment refinement: Add note about vocoder file for matcha TTS config by @HaoWang0101 in #2106
Fix a typo in the JNI for Android. by @csukuangfj in #2108
Generate subtitles with FireRedAsr models by @csukuangfj in #2112
Use manylinux_2_28_x86_64 to build linux gpu for sherpa-onnx by @csukuangfj in #2123
Support running sherpa-onnx with RK NPU on Android by @csukuangfj in #2124
Fix building for HarmonyOS by @csukuangfj in #2125
cmake build, configurable from env by @KarelVesely84 in #2115
Expose dither in python API by @nshmyrev in #2127
Add support for GigaAM-CTC-v2 by @rominf in #2135
Support Giga AM transducer V2 by @csukuangfj in #2136
Export kokoro 1.0 int8 models by @csukuangfj in #2137
Upload more onnx ASR models by @csukuangfj in #2141
Fix building for open harmonyOS by @csukuangfj in #2142
online-transducer: reset the encoder together with 2 previous output symbols (non-blank) by @KarelVesely84 in #2129
Fix punctuations for kokoro tts 1.1-zh. by @csukuangfj in #2146
Fix setting OnlineModelConfig in Java API by @csukuangfj in #2147
Support decoding multiple streams in Java API. by @csukuangfj in #2149
Support replacing homophonic phrases by @csukuangfj in #2153
Add C and CXX API for homophone replacer by @csukuangfj in #2156
Add JavaScript API (WASM) for homophone replacer by @csukuangfj in #2157
Add JavaScript API (node-addon) for homophone replacer by @csukuangfj in #2158
Fix building without TTS by @csukuangfj in #2159
Add homophone replacer example for Python API. by @csukuangfj in #2161
More fix for building without tts by @csukuangfj in #2162
Add Swift API for homophone replacer. by @csukuangfj in #2164
Add C# API for homophone replacer by @csukuangfj in #2165
Add Kotlin and Java API for homophone replacer by @csukuangfj in #2166
Add Dart API for homophone replacer by @csukuangfj in #2167
Add Go API for homophone replacer by @csukuangfj in #2168
Release v1.11.4 by @csukuangfj in #2169

New Contributors

@HaoWang0101 made their first contribution in #2106

Full Changelog: v1.11.3...v1.11.4

Contributors

nshmyrev, rominf, and 4 other contributors

Assets 85

27 Apr 06:43

csukuangfj

hr-files

e328002

hr-files

replace.fst is generated from
https://colab.research.google.com/drive/1jEaS3s8FbRJIcVQJv2EQx19EM_mnuARi?usp=sharing

If you don't have access to the colab notebook, here is the code for generating replace.fst:

import pynini
from pynini.lib import utf8
from pynini import cdrewrite

# Closure over all valid UTF-8 characters; the alphabet for the rewrite rule.
sigma = utf8.VALID_UTF8_CHAR.star

# Each rule maps a pinyin string (with tone digits) to its replacement phrase.
# Rule names are unique so that no rule silently overwrites an earlier one.
rule1 = pynini.cross("dan1ni2er3bo1wei2", "丹尼尔·波维")
rule1b = pynini.cross("dan1ni2er3bo1wei4", "丹尼尔·波维")
rule2 = pynini.cross('dou4dou4', '豆豆')
rule3 = pynini.cross('cheng2cheng2', '橙橙')
rule3b = pynini.cross('chen2chen2', '橙橙')
rule4 = pynini.cross('qiao2qiao2', '峤峤')
rule5 = pynini.cross('qiu2qiu2', '球球')
rule6 = pynini.cross('lin2mei3li4', '林美丽')
rule7 = pynini.cross('guo3guo3', '果果')
rule8 = pynini.cross('miao2miao2', '苗苗')
rule9 = pynini.cross('xuan2jie4', '玄戒')
rule10 = pynini.cross('xuan2jie4xin1pian1', '玄戒芯片')
rule11 = pynini.cross('xuan2jie4xing1pian1', '玄戒芯片')

# Union of all rules, then compile into a context-free rewrite over sigma.
rule = (rule1 | rule1b | rule2 | rule3 | rule3b | rule4 | rule5 | rule6 | rule7 | rule8 | rule9 | rule10 | rule11).optimize()
rule = cdrewrite(rule, "", "", sigma)

rule.write('replace.fst')

Note that you need to use

pip install --only-binary :all: pynini

to install pynini
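To make the intent of the rules in the script above easier to see without installing pynini, here is a rough pure-Python sketch of the same idea: scanning a pinyin string and substituting known phrases. It does not load replace.fst and is not how sherpa-onnx applies the FST. The names `REPLACEMENTS` and `replace_homophones` are hypothetical, only a few rules are copied into the table, and preferring the longest matching rule is an assumption of this sketch; the compiled FST may resolve overlapping rules differently.

```python
# Hypothetical table holding a subset of the rules from the pynini script.
REPLACEMENTS = {
    "dan1ni2er3bo1wei2": "丹尼尔·波维",
    "dan1ni2er3bo1wei4": "丹尼尔·波维",
    "dou4dou4": "豆豆",
    "cheng2cheng2": "橙橙",
    "chen2chen2": "橙橙",
    "xuan2jie4": "玄戒",
    "xuan2jie4xin1pian1": "玄戒芯片",
    "xuan2jie4xing1pian1": "玄戒芯片",
}


def replace_homophones(pinyin, table=REPLACEMENTS):
    """Scan left to right, replacing the longest rule that matches at each position."""
    # Try longer keys first so 'xuan2jie4xin1pian1' wins over 'xuan2jie4'.
    keys = sorted(table, key=len, reverse=True)
    out = []
    i = 0
    while i < len(pinyin):
        for key in keys:
            if pinyin.startswith(key, i):
                out.append(table[key])
                i += len(key)
                break
        else:
            # No rule matches here: copy the character through unchanged.
            out.append(pinyin[i])
            i += 1
    return "".join(out)


if __name__ == "__main__":
    print(replace_homophones("xuan2jie4xin1pian1"))
    print(replace_homophones("ni3hao3dou4dou4"))
```

Text that matches no rule passes through unchanged, which mirrors the role of `sigma` in the `cdrewrite` call.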

Assets 6

03 Apr 08:20

csukuangfj

v1.11.3

31ced58

v1.11.3

What's Changed

fix vits dict dir config by @amutu in #2036
fix case by @amutu in #2037
Fix building wheels for RKNN by @csukuangfj in #2041
Should the scaling factor be 32767? by @yourengod in #2056
Fix length scale for kokoro tts by @csukuangfj in #2060
Allow building repository as CMake subdirectory by @niansa in #2059
Export silero_vad v4 to RKNN by @csukuangfj in #2067
Fix DirectML support by @endink in #2066
Fix building aar to include speech denoiser by @csukuangfj in #2069
Add CXX API for VAD by @csukuangfj in #2077
Add C++ runtime for silero_vad with RKNN by @csukuangfj in #2078
Refactor rknn code by @csukuangfj in #2079
Fix building for android by @csukuangfj in #2081
Add C++ and Python API for Dolphin CTC models by @csukuangfj in #2085
Add Kotlin and Java API for Dolphin CTC models by @csukuangfj in #2086
Add C and CXX API for Dolphin CTC models by @csukuangfj in #2088
Preserve more context after endpointing in transducer by @vsd-vector in #2061
Add C# API for Dolphin CTC models by @csukuangfj in #2089
Add Go API for Dolphin CTC models by @csukuangfj in #2090
Add Swift API for Dolphin CTC models by @csukuangfj in #2091
Add Javascript (WebAssembly) API for Dolphin CTC models by @csukuangfj in #2093
Add Javascript (node-addon) API for Dolphin CTC models by @csukuangfj in #2094
Add Dart API for Dolphin CTC models by @csukuangfj in #2095
Add Pascal API for Dolphin CTC models by @csukuangfj in #2096
Release v1.11.3 by @csukuangfj in #2097

New Contributors

@amutu made their first contribution in #2036
@yourengod made their first contribution in #2056
@niansa made their first contribution in #2059

Full Changelog: v1.11.2...v1.11.3

Contributors

endink, amutu, and 4 other contributors

Assets 84

21 Mar 06:07

csukuangfj

v1.11.2

419f7fe

v1.11.2

What's Changed

Fix CI tests. by @csukuangfj in #2016
Publish jar for more java versions by @csukuangfj in #2017
add alsa example for vad+offline asr by @csukuangfj in #2020
Support cuda12 and cudnn8 for Linux aarch64. by @csukuangfj in #2021
Update README to include more projects using sherpa-onnx by @csukuangfj in #2022
Fix a bug in vad.reset() by @csukuangfj in #2023
Fix Matcha + vocos for Android by @csukuangfj in #2024
Fix crash in Android tts engine demo. by @csukuangfj in #2029
Fix build script by @sienaiwun in #2033
fix static linking by @sangeet2020 in #2032
Release v1.11.2 by @csukuangfj in #2035

New Contributors

@sienaiwun made their first contribution in #2033

Full Changelog: v1.11.1...v1.11.2

Contributors

sienaiwun, csukuangfj, and sangeet2020

Assets 82

17 Mar 09:33

csukuangfj

v1.11.1

bdf84a7

v1.11.1

What's Changed

Export vocos to sherpa-onnx by @csukuangfj in #2012
Add C++ runtime for vocos by @csukuangfj in #2014
Release v1.11.1 by @csukuangfj in #2015

Full Changelog: v1.11.0...v1.11.1

Contributors

csukuangfj

Assets 82

16 Mar 07:29

csukuangfj

v1.11.0

f110c77

v1.11.0

What's Changed

Fix building wheels for Python 3.7 by @csukuangfj in #1933
Add Kotlin and Java API for online punctuation models by @csukuangfj in #1936
Add Kokoro v1.1-zh by @csukuangfj in #1942
Support RKNN for Zipformer CTC models. by @csukuangfj in #1948
Add transducer modified_beam_search for RKNN. by @csukuangfj in #1949
Update README to include projects that are using sherpa-onnx by @csukuangfj in #1956
Limit number of tokens per second for whisper. by @csukuangfj in #1958
Ebranchformer by @KarelVesely84 in #1951
Test using sherpa-onnx as a cmake subproject by @csukuangfj in #1961
Add C++ demo for VAD+non-streaming ASR by @csukuangfj in #1964
Export gtcrn models to sherpa-onnx by @csukuangfj in #1975
c-api add wave write to buffer. by @cjsdurj in #1962
add SherpaOnnxOfflineRecognizerSetConfig binding for go by @franck-li in #1976
Add C++ runtime for speech enhancement GTCRN models by @csukuangfj in #1977
Add Python API for speech enhancement GTCRN models by @csukuangfj in #1978
Add C API for speech enhancement GTCRN models by @csukuangfj in #1984
Add CXX API for speech enhancement GTCRN models by @csukuangfj in #1986
Add Swift API for speech enhancement GTCRN models by @csukuangfj in #1989
Add C# API for speech enhancement GTCRN models by @csukuangfj in #1990
Add Go API for speech enhancement GTCRN models by @csukuangfj in #1991
Add Pascal API for speech enhancement GTCRN models by @csukuangfj in #1992
Add Dart API for speech enhancement GTCRN models by @csukuangfj in #1993
Add JavaScript (node-addon) API for speech enhancement GTCRN models by @csukuangfj in #1996
Add WebAssembly (WASM) for speech enhancement GTCRN models by @csukuangfj in #2002
Add JavaScript API (wasm) for speech enhancement GTCRN models by @csukuangfj in #2007
Add Kotlin API for speech enhancement GTCRN models by @csukuangfj in #2008
Add Java API for speech enhancement GTCRN models by @csukuangfj in #2009
Release v1.11.0 by @csukuangfj in #2010

New Contributors

@cjsdurj made their first contribution in #1962

Full Changelog: v1.10.46...v1.11.0

Contributors

csukuangfj, KarelVesely84, and 2 other contributors

Assets 74

10 Mar 03:23

csukuangfj

speech-enhancement-models

362ddf2

speech-enhancement-models

gtcrn_simple.onnx is from https://github.com/Xiaobin-Rong/gtcrn

speech_with_noise.wav is from https://modelscope.cn/models/iic/speech_zipenhancer_ans_multiloss_16k_base/file/view/master?fileName=examples%252Fspeech_with_noise.wav&status=0

Assets 6
