WhisperPlay

Real-time “Silent Disco” with Live Language Toggle for Any Stream


Inspiration

Late-night streamers often mute their mics to avoid waking family, making chat the only way to follow the action. Non-English viewers, meanwhile, rely on slow, post-stream captions. We wanted to give every viewer instant, private audio—in any language—without forcing creators to change their setup.


Architecture

(Architecture diagram)

What it does

WhisperPlay is a browser extension + local server combo that:

  1. Captures a creator’s mic audio (or any tab's audio) locally.
  2. Transmits it over an encrypted WebRTC data channel directly from an audio source to the viewer's browser.
  3. Lets each viewer toggle between live AI-translated dubs in different languages.

All processing completes with under 1.5 s of end-to-end latency, and because audio is captured locally and carried over an encrypted WebRTC channel, the design preserves privacy without sacrificing performance.


Getting Started

Prerequisites

  • Google Chrome
  • Python 3.9+
  • An environment that can run PyTorch (for the Whisper model)

API Keys

You will need API keys from the following services:

  • DeepL (for translation)
  • ElevenLabs (for text-to-speech)

Backend Setup

  1. Clone the repository:

    git clone <your-repo-url>
    cd WhisperPlay/server
  2. Install Python dependencies:

    pip install -r requirements.txt
  3. Set Environment Variables: Export your API keys in the terminal session before running the server (a startup-check sketch follows these steps). On Windows (PowerShell):

    $env:DEEPL_API_KEY="your_deepl_key_here"
    $env:ELEVEN_API_KEY="your_elevenlabs_key_here"

    On macOS/Linux:

    export DEEPL_API_KEY="your_deepl_key_here"
    export ELEVEN_API_KEY="your_elevenlabs_key_here"
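
A fail-fast startup check along these lines catches a missing key early (an illustrative sketch; the shipped server.py may read its configuration differently):

    import os
    import sys

    REQUIRED_KEYS = ("DEEPL_API_KEY", "ELEVEN_API_KEY")

    # Refuse to start if either key is absent, rather than failing mid-stream.
    missing = [key for key in REQUIRED_KEYS if not os.environ.get(key)]
    if missing:
        sys.exit(f"Missing environment variables: {', '.join(missing)}")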

Frontend Setup

  1. Open Google Chrome and navigate to chrome://extensions.
  2. Enable “Developer mode” using the toggle in the top-right corner.
  3. Click “Load unpacked”.
  4. Select the WhisperPlay/extension folder from the project directory.

Usage

  1. Start the Signaling Server: This server brokers the initial WebRTC connection (a minimal relay sketch follows these steps).

    # In the /server directory
    python signaling_server.py
  2. Start the AI Server: This server handles transcription, translation, and speech synthesis.

    # In a new terminal, in the /server directory
    python server.py
  3. Use the Extension:

    • Open a browser tab with audio you want to translate (e.g., a YouTube video).
    • Click the WhisperPlay icon in your browser's toolbar.
    • Click "Connect". The status should change to "Connected."
    • Select your desired language.
    • The translated audio will begin playing automatically.
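
For orientation, a minimal websockets-based relay of the kind step 1 starts might look like this (the room names and message shapes are illustrative assumptions, not the exact protocol in signaling_server.py):

    import asyncio
    import json

    import websockets

    rooms = {}  # room id -> set of connected sockets

    async def relay(ws, path=None):
        """Forward SDP offers/answers and ICE candidates to peers in the same room."""
        room = None
        try:
            async for raw in ws:
                msg = json.loads(raw)
                if msg.get("type") == "join":
                    room = msg["room"]
                    rooms.setdefault(room, set()).add(ws)
                elif room is not None:
                    # Relay everything else verbatim to the other peers.
                    for peer in rooms[room]:
                        if peer is not ws:
                            await peer.send(raw)
        finally:
            if room is not None:
                rooms[room].discard(ws)

    async def main():
        async with websockets.serve(relay, "localhost", 8765):
            await asyncio.Future()  # run until cancelled

    asyncio.run(main())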

How We Built It

| Component         | Tech Stack                         |
| ----------------- | ---------------------------------- |
| Signaling         | Python websockets                  |
| Backend           | Python, Flask, WebRTC              |
| Browser Extension | Manifest V3, Web Audio API, WebRTC |
| Speech-to-Text    | OpenAI Whisper (tiny model)        |
| Translation       | DeepL API                          |
| Text-to-Speech    | ElevenLabs Streaming API           |
| UI                | HTML, CSS, JavaScript              |

Architecture Flow:

Audio Source Tab → WebRTC → Extension → AI Server (Transcribe/Translate/Synthesize) → Translated Audio → Headphones
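
As a sketch of that per-chunk path, assuming the DeepL and ElevenLabs APIs named above (the function name, voice ID, and model ID are illustrative placeholders, not WhisperPlay's exact code):

    import os

    import deepl
    import requests
    import whisper  # the openai-whisper package

    model = whisper.load_model("tiny")  # the "tiny" model from the stack table
    translator = deepl.Translator(os.environ["DEEPL_API_KEY"])

    VOICE_ID = "21m00Tcm4TlvDq8ikWAM"  # placeholder: any voice from your ElevenLabs account

    def translate_chunk(wav_path, target_lang="ES"):
        """Transcribe one audio chunk, translate it, and synthesize dubbed speech."""
        text = model.transcribe(wav_path)["text"]
        translated = translator.translate_text(text, target_lang=target_lang).text

        # ElevenLabs streaming endpoint: audio bytes arrive as they are generated,
        # which is what keeps the round trip inside the latency budget.
        resp = requests.post(
            f"https://api.elevenlabs.io/v1/text-to-speech/{VOICE_ID}/stream",
            headers={"xi-api-key": os.environ["ELEVEN_API_KEY"]},
            json={"text": translated, "model_id": "eleven_multilingual_v2"},
            stream=True,
            timeout=30,
        )
        resp.raise_for_status()
        return b"".join(resp.iter_content(chunk_size=4096))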


Challenges We Ran Into

  • Sub-second TTS: ElevenLabs streaming was key to reducing round-trip latency.
  • WebRTC in Extensions: Service workers don’t natively support WebRTC, requiring a signaling server and careful state management between the popup, background, and content scripts.
  • API Rate-Limits: The current implementation is demo-grade; a production version would need caching and graceful fallbacks (see the sketch below).
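
One cheap mitigation is memoizing repeated lines so identical text never spends DeepL quota twice (a hedged sketch; cached_translate is a hypothetical helper, not part of the current codebase):

    import os
    from functools import lru_cache

    import deepl

    translator = deepl.Translator(os.environ["DEEPL_API_KEY"])

    @lru_cache(maxsize=4096)
    def cached_translate(text, target_lang):
        # Recurring stream phrases resolve from the cache instead of the API.
        return translator.translate_text(text, target_lang=target_lang).text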

What's Next for WhisperPlay

  • Client-Side Transcription: Move Whisper transcription into the browser using onnxruntime-web to reduce server load.
  • Multi-speaker Separation: Identify and separate game audio vs. creator voice.
  • Voice-Clone Consent Flow: Let streamers upload an audio sample for personalized TTS.
  • Offline Mode: Bundle a lightweight WASM-based TTS engine for when APIs are unreachable.

Beyond the demo above, the repository also sketches a native pipeline: a real-time audio streaming and translation system built around an OBS plugin and a browser extension.

Components

  1. OBS Plugin (/obs-plugin)

    • Audio capture and encryption
    • WebRTC streaming
    • Written in C++
  2. Browser Extension (/extension)

    • Audio reception and decryption
    • Real-time translation
    • Language toggle
    • Written in JavaScript/TypeScript
  3. Signaling Server (/server)

    • WebRTC signaling
    • Written in Node.js

Requirements

OBS Plugin

  • CMake 3.x
  • Visual Studio 2019+ (Windows)
  • OBS Studio SDK
  • libwebrtc
  • OpenSSL

Browser Extension

  • Node.js 18.x+
  • npm/yarn

Server

  • Node.js 18.x+
  • npm/yarn

Development Setup

Detailed setup instructions for each component coming soon.

Features

  • Real-time audio streaming via WebRTC
  • End-to-end encryption
  • Multi-language support with real-time translation
  • Low latency (<200ms for audio, <1s for translation)
  • Cross-platform compatibility
