This project fine-tunes the StyleTTS2 model on the LibriTTS dataset to produce high-quality, expressive, and controllable speech synthesis. It supports multispeaker speech generation and style control, and it can be extended for applications such as voice cloning, TTS APIs, and conversational agents.
Key features:

- Fine-tuning of StyleTTS2 on LibriTTS with second-stage training
- Multispeaker support
- Style embeddings via a diffusion model
- Integration of pretrained ASR (text aligner) and F0 (pitch extractor) models
- Mixed-precision (fp16) training via Hugging Face Accelerate
- Dockerized training pipeline
- Checkpoint management with AWS S3
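Checkpoint management here can be as simple as mirroring the model directory to S3 with the AWS CLI. A minimal sketch, assuming a bucket of your own (the bucket name and prefix are placeholders):

```bash
# Mirror local checkpoints to S3; only .pth files are uploaded
# (bucket name and prefix are placeholders -- adjust to your setup)
aws s3 sync Models/LibriTTS s3://<your-bucket>/styletts2/checkpoints \
    --exclude "*" --include "*.pth"
```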
Prerequisites:

- AWS EC2 instance with a GPU (e.g., `g4dn.xlarge`)
- NVIDIA Docker runtime (`--gpus all`)
- Docker image built locally or pulled from ECR
- AWS CLI configured with access to an S3 bucket
- Checkpoints from the base model (e.g., `epochs_2nd_00020.pth`)
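If you pull the image from ECR rather than building it locally, the standard login-then-pull flow looks like the following; the region, account ID, and repository name are placeholders for your own values:

```bash
# Authenticate Docker against ECR (region, account ID, and repo are placeholders)
aws ecr get-login-password --region us-east-1 \
    | docker login --username AWS --password-stdin <account-id>.dkr.ecr.us-east-1.amazonaws.com

# Pull the training image
docker pull <account-id>.dkr.ecr.us-east-1.amazonaws.com/<your-repo>:latest
```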
```bash
# Run the Docker container (replace with your own image name)
docker run --gpus all -d --name styletts2-container <your-docker-image>

# Access the container
docker exec -it styletts2-container bash
```
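Before launching training, it is worth checking from inside the container that the GPU is actually visible; `nvidia-smi` should list the device, and PyTorch should report CUDA as available:

```bash
# Inside the container: confirm the GPU is visible to the driver and to PyTorch
nvidia-smi
python -c "import torch; print(torch.cuda.is_available())"
```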
Customize `Configs/config_ft.yml`:

```yaml
log_dir: "Models/LibriTTS"
epochs: 35
batch_size: 2
pretrained_model: "Models/LibriTTS/epochs_2nd_00020.pth"
load_only_params: true
...
```
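The `pretrained_model` checkpoint must exist inside the container before launching. If you keep the base checkpoint in S3, a copy along these lines (bucket and key are placeholders) puts it in place:

```bash
# Fetch the base second-stage checkpoint from S3 (bucket and key are placeholders)
aws s3 cp s3://<your-bucket>/styletts2/base/epochs_2nd_00020.pth \
    Models/LibriTTS/epochs_2nd_00020.pth
```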
Then start fine-tuning:

```bash
accelerate launch --mixed_precision=fp16 train_finetune_accelerate.py --config_path ./Configs/config_ft.yml
```
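Fine-tuning takes a while, so one common pattern is to run the launch command under `nohup` and follow the log file; this is just one way to keep the run alive after the shell detaches:

```bash
# Keep the run alive after the shell detaches and follow its log
nohup accelerate launch --mixed_precision=fp16 train_finetune_accelerate.py \
    --config_path ./Configs/config_ft.yml > train.log 2>&1 &
tail -f train.log
```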