🗣️ AI Voice Cloner (XTTS v2)

A Streamlit web app for cloning voices using Coqui TTS.
Supports uploading, recording, or selecting reference voices and generating natural-sounding speech in multiple languages.

🚀 Features

🎤 Record voice directly in the browser
📤 Upload reference voice samples (WAV, MP3, M4A, OGG, FLAC)
🎵 Save and reuse voices from a gallery
🌎 Multilingual support (English, Spanish, French, German, Chinese, Japanese, etc.)
🎭 Control speech style, speed, and emotion
📂 Keep track of recent generations with playback & downloads
📥 Export audio as WAV or MP3

🛠 Tech Stack

Frontend / UI

Streamlit: Web interface for interaction and real-time updates
streamlit-option-menu: Sidebar navigation with a clean UI

Backend / Logic

Coqui TTS (XTTS v2): Voice cloning & text-to-speech model
pydub: Audio file conversion (WebM/MP3 ↔ WAV)
librosa & soundfile: Audio signal processing
NumPy: Array and numerical operations
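
As a rough illustration of how the backend pieces fit together, here is a minimal sketch of a cloning call through the Coqui TTS Python API. It is not taken from app.py: the helper names (output_path, clone_voice) and the timestamp-based file naming are assumptions made for this example, chosen to match the xtts_*.wav pattern shown in the project structure.

```python
from pathlib import Path
import time

# Model identifier for XTTS v2 in the Coqui TTS model catalogue.
XTTS_MODEL = "tts_models/multilingual/multi-dataset/xtts_v2"


def output_path(out_dir="outputs", stamp=None):
    """Return a path like outputs/xtts_<stamp>.wav (naming scheme is an assumption)."""
    stamp = int(time.time()) if stamp is None else stamp
    return Path(out_dir) / f"xtts_{stamp}.wav"


def clone_voice(text, reference_wav, language="en", out_dir="outputs"):
    """Synthesize `text` in the voice of `reference_wav` and return the output path."""
    # Imported lazily: Coqui TTS is heavy and downloads the model on first use.
    from TTS.api import TTS

    tts = TTS(model_name=XTTS_MODEL)
    target = output_path(out_dir)
    target.parent.mkdir(parents=True, exist_ok=True)
    tts.tts_to_file(
        text=text,
        speaker_wav=str(reference_wav),
        language=language,
        file_path=str(target),
    )
    return target
```

A call such as clone_voice("Hello there", "voices/ref_sample.wav", language="en") would write a new WAV file under outputs/; the reference file name here is hypothetical.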

Utilities

requests: API calls (if extending functionality)
python-dotenv: Manage secrets/API keys
ffmpeg (system dependency): Required for audio encoding/decoding

Environment

Virtual environment (.venv/) for dependency isolation

📂 Project Structure

voice_clone_app/
│
├── outputs/                  # Generated outputs (auto-created)
│   └── xtts_*.wav            # Generated files
│
├── voices/                   # Reference voices (auto-created)
│   └── ref_*.wav             # Uploaded/recorded samples
│
├── app.py                    # Main Streamlit application
├── run_app.py                # Launcher (cross-platform, auto-opens browser)
├── test.py                   # Experimental/testing script
├── requirements.txt          # Python dependencies
├── LICENSE                   # Open-source license
├── .gitignore                # Ignored files/folders
└── README.md                 # Project documentation
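
The outputs/ and voices/ folders are described as auto-created. One way that could be done is sketched below; the function names (ensure_dirs, list_reference_voices) are hypothetical and are not claimed to match what app.py actually does.

```python
from pathlib import Path
import tempfile


def ensure_dirs(base="."):
    """Create outputs/ and voices/ under `base` if missing; return their paths."""
    base = Path(base)
    dirs = {name: base / name for name in ("outputs", "voices")}
    for path in dirs.values():
        path.mkdir(parents=True, exist_ok=True)
    return dirs


def list_reference_voices(voices_dir):
    """Return saved reference samples (ref_*.wav), sorted by name."""
    return sorted(Path(voices_dir).glob("ref_*.wav"))


# Demo in a throwaway directory so nothing is written to the real project.
with tempfile.TemporaryDirectory() as tmp:
    dirs = ensure_dirs(tmp)
    (dirs["voices"] / "ref_demo.wav").touch()
    print([p.name for p in list_reference_voices(dirs["voices"])])
```

Using exist_ok=True makes the call safe to repeat on every app start.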

⚙️ Installation

1. Clone the Repository

git clone https://github.com/EbrahimAR/AI-Voice-Cloner-XTTS-v2.git
cd AI-Voice-Cloner-XTTS-v2

2. Create virtual environment

python -m venv .venv
source .venv/bin/activate    # On Linux/Mac
.venv\Scripts\activate       # On Windows

3. Install dependencies

pip install -r requirements.txt

4. Install system dependencies

ffmpeg is required for MP3 conversion:
- Windows: choco install ffmpeg
- macOS: brew install ffmpeg
- Linux: sudo apt-get install ffmpeg
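
To confirm that ffmpeg is visible on the PATH after installing it, a small check like the following can help. This is only a sketch; the function name ffmpeg_available is made up for this example and is not part of the project.

```python
import shutil


def ffmpeg_available(which=shutil.which):
    """Return True if an ffmpeg executable can be found on the PATH."""
    # `which` is injectable so the check can be exercised without ffmpeg installed.
    return which("ffmpeg") is not None


if not ffmpeg_available():
    print("ffmpeg not found on PATH; MP3 conversion will likely fail.")
```

If the message appears, reopen the terminal after installing ffmpeg so the updated PATH is picked up.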

▶️ Run the app

streamlit run app.py

Or use the helper script:

python run_app.py

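The app will open at http://localhost:8502. If that address does not load, use the URL that Streamlit prints in the terminal; Streamlit's default port is 8501 unless configured otherwise.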
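
For readers curious how a cross-platform launcher like run_app.py might work, here is a minimal sketch. It is an assumption about the approach, not the actual contents of run_app.py: the names build_command and main, the two-second delay, and the choice to pass the port explicitly are all illustrative.

```python
import subprocess
import sys
import threading
import webbrowser

PORT = 8502  # matches the address given above


def build_command(script="app.py", port=PORT):
    """Build the Streamlit command using the current Python interpreter."""
    return [
        sys.executable, "-m", "streamlit", "run", script,
        "--server.port", str(port),
        "--server.headless", "true",  # stop Streamlit opening a browser itself
    ]


def main():
    """Start Streamlit and open the browser shortly afterwards."""
    url = f"http://localhost:{PORT}"
    # Open the browser after a short delay so the server has time to start.
    threading.Timer(2.0, webbrowser.open, args=(url,)).start()
    subprocess.run(build_command(), check=True)
```

Calling main() starts the app. Invoking Streamlit through sys.executable keeps the launcher inside the active virtual environment on Windows, macOS, and Linux alike.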

👨‍💻 Author

Ebrahim Abdul Raoof

GitHub: https://github.com/EbrahimAR

📜 License

This project is licensed under the MIT License. See LICENSE for details.
