Real-time streaming speech-to-text transcription that runs entirely in your browser using Rust and WebAssembly (WASM). After downloading a ~950MB speech recognition model, the demo processes all audio offline on your CPU.
Demo video: demo-video.mp4
Try it live: https://huggingface.co/spaces/efficient-nlp/wasm-streaming-speech
Built with:

- Kyutai STT Model - a 1B-parameter streaming speech recognition model for English and French; this demo uses a 4-bit quantized version of the model.
- Candle - Hugging Face's ML framework for Rust
- Rayon - CPU parallelization for Rust
- wasm-bindgen-rayon - WASM bindings for Rayon, so the thread pool can run in the browser (see the sketch after this list)
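As a rough illustration of how these pieces fit together, here is a minimal, hypothetical sketch of exposing a streaming transcriber to JavaScript with wasm-bindgen. The `StreamingTranscriber` type and `push_audio` method are illustrative only (the repository's actual API may differ); the `init_thread_pool` re-export follows wasm-bindgen-rayon's documented usage pattern.

```rust
use wasm_bindgen::prelude::*;

// wasm-bindgen-rayon pattern: re-export init_thread_pool so the JavaScript
// side can start the Rayon worker pool before transcription begins.
pub use wasm_bindgen_rayon::init_thread_pool;

/// Hypothetical handle for a streaming transcriber (illustrative only).
#[wasm_bindgen]
pub struct StreamingTranscriber {
    buffer: Vec<f32>, // accumulated PCM samples awaiting decoding
}

#[wasm_bindgen]
impl StreamingTranscriber {
    #[wasm_bindgen(constructor)]
    pub fn new() -> StreamingTranscriber {
        StreamingTranscriber { buffer: Vec::new() }
    }

    /// Accept a chunk of PCM samples from the browser (a JS Float32Array
    /// maps to `&[f32]`) and return any newly decoded text.
    pub fn push_audio(&mut self, samples: &[f32]) -> String {
        self.buffer.extend_from_slice(samples);
        // The real demo would run the 4-bit quantized Kyutai STT model here
        // with Candle, parallelized across CPU cores by Rayon.
        String::new()
    }
}
```

On the JavaScript side, the exported `initThreadPool` is awaited once (typically with `navigator.hardwareConcurrency` threads) before audio chunks are pushed.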
This is a research/tech demo. For more accurate cloud transcription and real-time LLM grammar correction, check out Voice Writer.
Performance varies by device.
- On Apple Silicon or other recent CPUs, it typically runs in real time.
- On older devices, it may not keep up (real-time factor below 1; see the note after this list).
- Mobile devices are not supported.
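For clarity, the real-time factor is used here as the ratio of audio duration to processing time, so a value of at least 1 means transcription keeps pace with the incoming stream. A minimal sketch of that arithmetic:

```rust
/// Real-time factor as used above: seconds of audio transcribed per second
/// of wall-clock compute. At or above 1.0, transcription keeps up with the
/// live stream; below 1.0, it gradually falls behind.
fn real_time_factor(audio_seconds: f64, processing_seconds: f64) -> f64 {
    audio_seconds / processing_seconds
}

fn main() {
    // Example: 10 s of audio processed in 8 s of compute -> RTF = 1.25.
    println!("{:.2}", real_time_factor(10.0, 8.0));
}
```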
To build and run the demo locally, you will need:

- Rust, Cargo, and the `wasm32-unknown-unknown` target:

  ```sh
  curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
  rustup target add wasm32-unknown-unknown
  ```

- wasm-bindgen-cli:

  ```sh
  cargo install wasm-bindgen-cli
  ```

- wasm-opt (Binaryen), optional but recommended:
  - macOS: `brew install binaryen`
  - Ubuntu/Debian: `sudo apt install binaryen`

- Python 3
- curl
To run the demo:

- Clone the repository:

  ```sh
  git clone https://github.com/lucky-bai/wasm-speech-streaming
  cd wasm-speech-streaming
  ```

- Build the Rust/WASM library:

  ```sh
  ./build-lib.sh
  ```

- Serve the project directory with a local HTTP server on port 8000, then open your browser and go to:

  http://localhost:8000
MIT License