Whispering Tiger (Live Translate/Transcribe)

Whispering Tiger is a free and open-source tool that can listen to any audio stream, or watch any in-game image, on your machine and output the transcription or translation to a web browser using WebSockets or over OSC (for example, for streaming overlays or VRChat).
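
To give a rough idea of what the OSC output path involves, here is a minimal sketch that builds an OSC message by hand and sends it to VRChat's chatbox. It is not Whispering Tiger's own implementation. The function names are hypothetical, and the sketch assumes VRChat's `/chatbox/input` address and its default OSC input port 9000.

```python
import socket


def osc_pad(data: bytes) -> bytes:
    """Null-terminate data and pad it to a multiple of 4 bytes (OSC 1.0 string rule)."""
    padded_len = (len(data) // 4 + 1) * 4
    return data.ljust(padded_len, b"\x00")


def build_chatbox_message(text: str, send_immediately: bool = True) -> bytes:
    """Encode an OSC message for /chatbox/input with one string and one boolean."""
    address = osc_pad(b"/chatbox/input")
    # Type tags: "s" = string; "T"/"F" = boolean, which carries no payload bytes.
    type_tags = osc_pad(b",sT" if send_immediately else b",sF")
    return address + type_tags + osc_pad(text.encode("utf-8"))


def send_to_vrchat(text: str, host: str = "127.0.0.1", port: int = 9000) -> None:
    """Send one chatbox message over UDP (port 9000 is an assumed default)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(build_chatbox_message(text), (host, port))
```

With OSC enabled in VRChat, calling `send_to_vrchat("Hello from a transcription")` should show the text in the chatbox. Whispering Tiger does this for you, so the sketch is only useful for understanding or debugging the protocol.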

Features

Runs 100% locally on your machine. (Once the AI models are downloaded, no further internet connection is required.)
Speech recognition, translation and transcription
- OpenAI's Whisper, Supports ~98 languages
- Meta's Seamless M4T, multimodal, Supports ~101 languages
- Microsoft's Speech T5, English only
- Microsoft's Phi-4 Multimodal LLM, Supports ~23 languages
- NVIDIA's NeMo Canary, English, Spanish, German, and French
- Wav2Vec Bert 2.0, English and German
Text translation
- LID [Language Identification] (Supports 200 languages)
- NLLB-200 (single model, Supports 200 languages, high accuracy)
- M2M-100 (single model, Supports 100 languages, high accuracy)
- Seamless M4T (single model, multimodal, Supports ~101 languages)
- Microsoft's Phi-4 Multimodal LLM (single model, Supports ~23 languages)
OCR [Optical Character Recognition] (to capture game images and translate in-game text)
- EasyOCR (Supports 80+ languages)
- Microsoft's Phi-4 Multimodal LLM (Supports ~23 languages, supports handwriting)
- GOT-OCR 2.0 (supports handwriting)
TTS [Text-to-Speech] (Read out transcriptions/translations)
- Silero
- F5/E2-TTS (Supports Voice Cloning + Streamed playback)
- Kokoro TTS (Supports streamed playback)
- Zonos TTS (Supports Voice Cloning + Streamed playback)
VAD [Voice Activity Detection]
- Silero-VAD
RVC [Retrieval-based Voice Conversion] (Convert your voice, the voice in audio files or from Text-to-Speech)
- RVC (Using the RVC Plugin)
LLM [Large Language Model] (text continuation, automatic answer generation, etc.). Proof of concept.
- Microsoft's Phi-4 Multimodal LLM (Supports Question answering and extendable Function Calling)
- Via Whispering Tiger Plugins:
  - FLAN-T5, GPT-J, Bloomz etc.
And more using other Plugins...

See all available Plugins in the List of Plugins.

Quickstart (Recommended)

For a quick and easy start, download the latest Whispering Tiger UI from here: https://github.com/Sharrnah/whispering-ui

This is a native UI application that keeps your Whispering Tiger version up to date and makes the settings easier to manage.

Release Downloads

Standalone Releases with all dependencies included.

Go to the GitHub Releases page and download from the link in the release description, or find the latest release here.

(Because of the 2 GB limit, release files are not hosted directly on GitHub.)

Install CUDA for GPU acceleration (recommended).
Extract the files to a drive with enough free space.
- (After downloading the medium Whisper model and the medium NLLB-200 translation model, the installation can take up to 20 GB.)
Run the application only through the *.bat files. To use your own command-line flags, copy an existing start-*.bat file and edit its parameters in any text editor.
- start-transcribe-mic.bat tries to use your default microphone and is a good starting point.
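
Before extracting, it may help to check that the target drive has enough room. The following is a minimal sketch, not part of Whispering Tiger. The function name is hypothetical, and the 20 GB default only mirrors the rough upper bound mentioned above.

```python
import shutil

BYTES_PER_GB = 1024 ** 3


def has_enough_space(path: str, required_gb: float = 20.0) -> bool:
    """Return True if the drive containing `path` has at least `required_gb` free."""
    free_bytes = shutil.disk_usage(path).free
    return free_bytes >= required_gb * BYTES_PER_GB
```

For example, `has_enough_space("D:\\")` reports whether drive D: currently has about 20 GB free. Larger models need more space, so treat the default as a lower bound for the medium models only.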

Sources

Thanks go to

OpenAI https://github.com/openai/whisper
Awexander https://github.com/Awexander/audioWhisper
Blake https://github.com/mallorbc/whisper_mic
Meta (LID, NLLB-200, M2M-100) https://ai.facebook.com/blog/nllb-200-high-quality-machine-translation/
Meta (Seamless M4T) https://github.com/facebookresearch/seamless_communication
faster-whisper https://github.com/guillaumekln/faster-whisper
EasyOCR https://github.com/jaidedai/easyocr
Silero (TTS, VAD) https://github.com/snakers4/silero-models
