lucky-bai/wasm-speech-streaming

WASM Streaming Speech Recognition

Real-time streaming speech-to-text transcription that runs entirely in your browser, built with Rust and WebAssembly (WASM). After downloading a ~950 MB speech recognition model, the demo processes all audio offline on your CPU.

Demo

Try it live: https://huggingface.co/spaces/efficient-nlp/wasm-streaming-speech

Technologies

Related Projects

This is a research/tech demo. For more accurate cloud transcription and real-time LLM grammar correction, check out Voice Writer.

Performance

Performance varies by device.

  • On Apple Silicon or other recent CPUs, it typically runs in real time.
  • On older devices, it may fall behind the incoming audio (real-time factor < 1, that is, less than one second of audio transcribed per second of processing).
  • Mobile devices are not supported.
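
As a rough illustration of the real-time factor mentioned above, the sketch below uses the convention this section implies: seconds of audio transcribed per second of processing time, so values below 1 mean transcription falls behind. The function name `rtf` and the sample numbers are hypothetical and are not measurements from this demo.

```shell
#!/bin/sh
# rtf AUDIO_SECONDS PROCESSING_SECONDS
# Prints audio duration divided by processing time, to two decimals.
# Values >= 1 mean processing keeps pace with the incoming audio.
rtf() {
    awk -v audio="$1" -v proc="$2" 'BEGIN { printf "%.2f\n", audio / proc }'
}

# Hypothetical case: 10 s of audio took 8 s to process (keeps up).
rtf 10 8

# Hypothetical case: 10 s of audio took 20 s to process (falls behind).
rtf 10 20
```

Note that some tools define the real-time factor the other way around (processing time divided by audio duration), in which case lower is better.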

Prerequisites

  • Rust, Cargo, and the wasm32-unknown-unknown target

    curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
    rustup target add wasm32-unknown-unknown
  • wasm-bindgen-cli

    cargo install wasm-bindgen-cli
  • wasm-opt (Binaryen) – optional but recommended

    • macOS: brew install binaryen
    • Ubuntu/Debian: sudo apt install binaryen
  • Python 3

  • curl
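
Before building, it can help to confirm that the tools listed above are on your PATH. The script below is a minimal sketch and not part of the repository. The helper name `check_tool` is hypothetical, and the script assumes `cargo install wasm-bindgen-cli` provides a binary named `wasm-bindgen`.

```shell
#!/bin/sh
# Report whether a single tool is available on PATH.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "ok: $1"
    else
        echo "missing: $1"
        return 1
    fi
}

missing=0
for tool in rustc cargo rustup wasm-bindgen python3 curl; do
    check_tool "$tool" || missing=1
done

# wasm-opt is optional, so a miss is only reported as a note.
check_tool wasm-opt || echo "note: wasm-opt is optional but recommended"

# Check for the wasm32 target only when rustup itself is available.
if command -v rustup >/dev/null 2>&1; then
    if rustup target list --installed | grep -q '^wasm32-unknown-unknown$'; then
        echo "ok: wasm32-unknown-unknown target"
    else
        echo "missing: wasm32-unknown-unknown target"
        missing=1
    fi
fi

echo "required tools missing: $missing"
```

A final line of `required tools missing: 0` means every required prerequisite was found.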

Running Locally

  1. Clone the repository:

    git clone https://github.com/lucky-bai/wasm-speech-streaming
    cd wasm-speech-streaming
  2. Build the Rust/WASM library:

    ./build-lib.sh
  3. Open your browser and go to:

    http://localhost:8000
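
The steps above do not show what serves the page on port 8000. The build script or another file in the repository may handle that. If nothing is listening, one generic option is Python's built-in static file server. The sketch below assumes the built page and its assets are in the current directory, which this README does not confirm. It starts the server, checks that it responds, and stops it again.

```shell
#!/bin/sh
# Assumption: the built page and its assets are in the current directory.
PORT=8000

# Start Python's built-in static file server in the background.
python3 -m http.server "$PORT" >/dev/null 2>&1 &
SERVER_PID=$!

# Give the server a moment to bind to the port.
sleep 1

# Request the root URL and keep only the HTTP status code.
STATUS=$(curl -s -o /dev/null -w '%{http_code}' "http://localhost:$PORT/" || true)
echo "HTTP status: $STATUS"

# Stop the background server again.
kill "$SERVER_PID" 2>/dev/null || true
```

For actual use, run `python3 -m http.server 8000` in the foreground and leave it running while the page is open. This simple server sends no custom response headers, so it will not be enough if the demo depends on any.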

License

MIT License