YouTube Voice Cloner

Transform YouTube and TikTok videos into custom AI-generated speech using Dia TTS voice synthesis technology.

What does it do?

This application takes a YouTube or TikTok video URL and creates a new audio file with the same content but spoken in a different AI-generated voice. Here's how it works:

Download: Extracts audio from YouTube/TikTok videos
Analyze: Processes the original audio for voice characteristics
Synthesize: Generates new speech using Dia TTS voice cloning technology

Features

✅ Multi-platform support: YouTube and TikTok videos
✅ Voice cloning: Creates custom voice models from source audio
✅ Web interface: Simple, responsive UI built with HTMX
✅ Real-time processing: See progress as your audio is generated
✅ Audio player: Listen to results directly in the browser
✅ Docker deployment: Ready for cloud deployment (Google Cloud Run)

Requirements

System Dependencies

yt-dlp - Video/audio downloader (converting the download to MP3 also requires ffmpeg)

# macOS
brew install yt-dlp

# Ubuntu/Debian
sudo apt install yt-dlp

# Or via pip
pip install yt-dlp

TailwindCSS - For building styles
```
npm install -g tailwindcss
```

templ - For templating in Go

go install github.com/a-h/templ/cmd/templ@latest
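
Before building, it can help to confirm that the tools above are actually reachable. The following is a minimal sketch, not part of the project, that uses `exec.LookPath` from the Go standard library. The helper name `missingTools` is hypothetical, and the binary names are assumptions inferred from the install commands above.

```go
package main

import (
	"fmt"
	"os/exec"
)

// missingTools returns the names from tools that cannot be found on PATH.
// Hypothetical helper for illustration; it is not part of this repository.
func missingTools(tools []string) []string {
	var missing []string
	for _, name := range tools {
		if _, err := exec.LookPath(name); err != nil {
			missing = append(missing, name)
		}
	}
	return missing
}

func main() {
	// Assumed binary names, taken from the install commands in this section.
	required := []string{"yt-dlp", "tailwindcss", "templ"}
	if missing := missingTools(required); len(missing) > 0 {
		fmt.Println("missing tools:", missing)
		return
	}
	fmt.Println("all required tools found")
}
```

If `templ` is reported missing right after `go install`, the Go bin directory (`$GOBIN`, or `$GOPATH/bin` by default) is probably not on your `PATH`.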

Environment Variables

Create a .env file with:

PORT=8080
FAL_KEY=your_fal_ai_api_key_here
DATABASE_URL=your_postgres_connection_string
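
As an illustration of how these three variables might be consumed, here is a minimal sketch that reads them and fails early when a required one is missing. The `Config` type and `loadConfig` function are hypothetical names, and treating `8080` as the default port is an assumption based on the example value; the project's actual loading code may differ.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// Config mirrors the three variables from the .env example (hypothetical type).
type Config struct {
	Port        string
	FalKey      string
	DatabaseURL string
}

// loadConfig reads settings through getenv, which is normally os.Getenv.
// Passing the lookup function in keeps the logic easy to test.
func loadConfig(getenv func(string) string) (Config, error) {
	cfg := Config{
		Port:        getenv("PORT"),
		FalKey:      getenv("FAL_KEY"),
		DatabaseURL: getenv("DATABASE_URL"),
	}
	if cfg.Port == "" {
		cfg.Port = "8080" // assumed default, taken from the example value
	}
	if cfg.FalKey == "" {
		return Config{}, errors.New("FAL_KEY is not set")
	}
	if cfg.DatabaseURL == "" {
		return Config{}, errors.New("DATABASE_URL is not set")
	}
	return cfg, nil
}

func main() {
	cfg, err := loadConfig(os.Getenv)
	if err != nil {
		fmt.Fprintln(os.Stderr, "config error:", err)
		os.Exit(1)
	}
	fmt.Println("listening on port", cfg.Port)
}
```

Note that `os.Getenv` only sees the process environment. It does not read a `.env` file by itself, so the values must be exported by the shell or loaded by whatever mechanism the project uses.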

Quick Start

Clone the repository

git clone https://github.com/henrik392/youtube-voice-go.git
cd youtube-voice-go

Install dependencies
```
go mod download
```
Build and run
```
make build
make run
```
Open your browser and navigate to http://localhost:8080

Development Commands

# Build the application (generates templates + CSS + binary)
make build

# Run the application
make run

# Start with live reload (installs air if needed)
make watch

# Run tests
make test

# Start PostgreSQL database container
make docker-run

# Stop database container
make docker-down

# Clean build artifacts
make clean

Video Limitations

Length: 30 seconds to 10 minutes
Optimal: 1-5 minutes with clear audio
Format: Supports any format that yt-dlp can process
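
The length limits can be expressed as a simple range check. This is a sketch under the assumption that the bounds are exactly the 30 seconds and 10 minutes listed above; `checkDuration` and the two constants are hypothetical names, not the project's API.

```go
package main

import (
	"fmt"
	"time"
)

// Bounds copied from the limits listed above (assumed to be inclusive).
const (
	minDuration = 30 * time.Second
	maxDuration = 10 * time.Minute
)

// checkDuration returns an error when d falls outside the accepted range.
func checkDuration(d time.Duration) error {
	switch {
	case d < minDuration:
		return fmt.Errorf("video is %s, shorter than the %s minimum", d, minDuration)
	case d > maxDuration:
		return fmt.Errorf("video is %s, longer than the %s maximum", d, maxDuration)
	}
	return nil
}

func main() {
	samples := []time.Duration{10 * time.Second, 2 * time.Minute, 15 * time.Minute}
	for _, d := range samples {
		if err := checkDuration(d); err != nil {
			fmt.Println("rejected:", err)
			continue
		}
		fmt.Println("accepted:", d)
	}
}
```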

Architecture

cmd/
├── api/           # Main application entry point
└── web/           # Web handlers and templates
internal/
├── database/      # PostgreSQL integration
├── elevenlabs/    # Voice synthesis API client
├── server/        # HTTP server setup
└── youtube/       # Video processing logic

Technology Stack

Backend: Go with Chi router
Frontend: HTML templates (templ) + HTMX + TailwindCSS
Database: PostgreSQL
Audio Processing: yt-dlp + ffmpeg
AI Voice: Dia TTS (fal.ai) API

Deployment

Docker

make docker-build
docker run -p 8080:8080 yt-voice

Google Cloud Run

gcloud run deploy --image=europe-north1-docker.pkg.dev/youtube-to-voice/youtube-to-voice-repo/youtube-to-voice-image:tag1

How It Works

URL Validation: Checks if the provided URL is from YouTube or TikTok
Audio Extraction: Downloads the video's audio and converts it to MP3 (max 3 minutes)
Reference Processing: Prepares the original audio as reference for voice cloning
Text Processing: Formats the target text for Dia TTS processing
Voice Synthesis: Uses Dia TTS to generate speech with the cloned voice in one step
Delivery: Serves the final audio file through the web interface
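
The first two steps above could look roughly like the sketch below. It is illustrative only: `validateVideoURL`, `ytDlpArgs`, and the host allow-list are hypothetical and not taken from the repository. The yt-dlp options used (`--extract-audio`, `--audio-format`, `--output`) are real yt-dlp flags, but whether the project invokes yt-dlp this way is an assumption.

```go
package main

import (
	"errors"
	"fmt"
	"net/url"
	"strings"
)

// allowedHosts is an assumed allow-list; the project's real list may differ.
var allowedHosts = []string{"youtube.com", "youtu.be", "tiktok.com"}

// validateVideoURL accepts http(s) URLs whose host is an allowed host
// or a subdomain of one (for example www.youtube.com).
func validateVideoURL(raw string) error {
	u, err := url.Parse(raw)
	if err != nil {
		return fmt.Errorf("parse url: %w", err)
	}
	if u.Scheme != "http" && u.Scheme != "https" {
		return errors.New("url must use http or https")
	}
	host := strings.ToLower(u.Hostname())
	for _, allowed := range allowedHosts {
		if host == allowed || strings.HasSuffix(host, "."+allowed) {
			return nil
		}
	}
	return fmt.Errorf("unsupported host %q", host)
}

// ytDlpArgs builds the arguments for extracting MP3 audio with yt-dlp.
// It only assembles the argument list and does not run the command.
func ytDlpArgs(videoURL, outputTemplate string) []string {
	return []string{
		"--extract-audio",
		"--audio-format", "mp3",
		"--output", outputTemplate,
		videoURL,
	}
}

func main() {
	videoURL := "https://www.youtube.com/watch?v=VIDEO_ID" // placeholder URL
	if err := validateVideoURL(videoURL); err != nil {
		fmt.Println("invalid:", err)
		return
	}
	args := ytDlpArgs(videoURL, "audio.%(ext)s")
	fmt.Println("yt-dlp", strings.Join(args, " "))
}
```

Matching on the parsed hostname, and not on a substring of the raw URL, avoids accepting look-alike hosts such as `notyoutube.com`. To actually run the download, the argument slice can be passed to `exec.Command("yt-dlp", args...)` from `os/exec`.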

Contributing

Fork the repository
Create a feature branch
Make your changes
Run tests: make test
Submit a pull request

License

This project is for educational and personal use. Please respect content creators' rights and fal.ai's terms of service.
