ToneSwap is a real-time voice conversion application that transforms a speaker's voice into a target voice while preserving the speech content, tone, and emotion. It supports various audio formats and provides a user-friendly web interface via Gradio.
- 🎤 Converts your voice to another person's voice using voice samples
- 🧠 Zero-shot voice conversion using pretrained models (FreeVC + WavLM + HiFi-GAN)
- 📁 Supports uploading or recording source and target audio in any format
- 🔁 Automatically converts input to mono 16 kHz `.wav` using FFmpeg
- 🌐 Easy-to-use Gradio web interface
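The mono 16 kHz conversion step could look roughly like the sketch below. This is not ToneSwap's actual code: the function names `build_ffmpeg_command` and `to_mono_16k_wav` and the `_16k.wav` output suffix are hypothetical, and the sketch assumes `ffmpeg` is available on the `PATH`. It relies only on the standard FFmpeg options `-ac 1` (one audio channel) and `-ar 16000` (16 kHz sample rate).

```python
import subprocess
from pathlib import Path


def build_ffmpeg_command(src: str, dst: str) -> list[str]:
    """Build the FFmpeg argument list for a mono 16 kHz WAV conversion."""
    return [
        "ffmpeg",
        "-y",            # overwrite dst if it already exists
        "-i", src,       # input file in any format FFmpeg can decode
        "-ac", "1",      # downmix to a single (mono) channel
        "-ar", "16000",  # resample to 16 kHz
        dst,
    ]


def to_mono_16k_wav(src: str) -> str:
    """Convert src to a mono 16 kHz .wav beside it and return the new path.

    The "_16k.wav" suffix is an assumption made for this sketch.
    """
    dst = str(Path(src).with_name(Path(src).stem + "_16k.wav"))
    # check=True raises CalledProcessError if FFmpeg exits with an error.
    subprocess.run(build_ffmpeg_command(src, dst), check=True, capture_output=True)
    return dst
```

Keeping the command construction separate from the `subprocess` call makes the flags easy to inspect without FFmpeg installed.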
Before running the app, download and place the following files and models in the correct directories:
Download FreeVC model checkpoints and place them in:
checkpoints/
Download the WavLM-Large model files from the official Microsoft repository and place them in:
wavlm/
Clone the HiFi-GAN repository and download the `generator_v1` pretrained model. Place it in:
hifigan/
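Before launching the app, it may help to confirm that the three directories above exist and are not empty. The following is a minimal sketch: the helper name `missing_model_dirs` is hypothetical, and it only checks that each directory contains something, not that the files are the correct models.

```python
from pathlib import Path

# Directory names taken from the placement instructions above.
REQUIRED_DIRS = ("checkpoints", "wavlm", "hifigan")


def missing_model_dirs(root: str = ".") -> list[str]:
    """Return the required directories under root that are absent or empty."""
    missing = []
    for name in REQUIRED_DIRS:
        path = Path(root) / name
        # any(path.iterdir()) is False when the directory has no entries.
        if not path.is_dir() or not any(path.iterdir()):
            missing.append(name)
    return missing
```

Calling `missing_model_dirs()` from the repository root returns an empty list once all three directories are populated.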
- Clone this repo:
git clone https://github.com/<your-username>/ToneSwap.git
cd ToneSwap
- Install Python dependencies:
pip install -r requirements.txt
- Install FFmpeg (for audio format conversion). On Windows, get it from the FFmpeg download page.
- Install Gradio:
pip install gradio
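Once the steps above are done, a quick check can confirm that FFmpeg and Gradio are both discoverable. This is a sketch only: the function name `check_environment` is hypothetical, and it tests for presence (FFmpeg on the `PATH`, `gradio` importable), not for specific versions.

```python
import importlib.util
import shutil


def check_environment() -> dict[str, bool]:
    """Report whether FFmpeg and Gradio can be found on this machine."""
    return {
        # shutil.which returns None when the executable is not on PATH.
        "ffmpeg": shutil.which("ffmpeg") is not None,
        # find_spec returns None when the package is not installed.
        "gradio": importlib.util.find_spec("gradio") is not None,
    }
```

If either value is `False`, repeat the corresponding installation step before running the app.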
## ✅ How to Run
Due to GitHub's file size limitations, this repository does not include pretrained models or checkpoints. You must download and place them manually as instructed above.