csaicharan/yutrans


YUTRAS - YouTube Translation System

YUTRAS is an AI-powered speech-to-speech translation system that downloads the audio from a YouTube video and translates the spoken content into another language. It uses Meta's (formerly Facebook's) SeamlessM4T model for speech-to-speech translation.

🚀 Features

  • YouTube Audio Extraction: Download audio tracks from any YouTube video
  • Speech-to-Speech Translation: Translate spoken content between languages using state-of-the-art AI
  • Smart Chunking: Process long audio files in chunks to prevent memory issues
  • GPU Acceleration: CUDA support for faster processing
  • Robust Error Handling: Automatic retries and fallbacks to ensure successful translation
  • User-friendly CLI: Simple command-line interface with helpful options

📋 Requirements

  • Python 3.8 or higher
  • FFmpeg (required for audio processing)
  • NVIDIA GPU with CUDA support (optional, for faster processing)

🔧 Installation

Automatic Installation (Recommended)

# Simply run the installation script
python install.py

This script will:

  • Install the package in development mode
  • Check all required dependencies
  • Verify FFmpeg installation
  • Provide clear instructions for any missing requirements
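
The checks listed above can be sketched in a few lines of Python. This is a minimal illustration, not the contents of install.py: the function name `check_environment` and the message wording are hypothetical, and only the Python version and FFmpeg checks from the Requirements section are covered.

```python
import shutil
import sys

MIN_PYTHON = (3, 8)  # minimum version stated in the Requirements section


def check_environment(version_info=None, which=shutil.which):
    """Return a list of human-readable problems; an empty list means OK.

    `version_info` and `which` are injectable so the check can be tested
    without changing the real interpreter or PATH (hypothetical design).
    """
    version_info = sys.version_info if version_info is None else version_info
    problems = []
    if tuple(version_info[:2]) < MIN_PYTHON:
        problems.append("Python %d.%d or higher is required" % MIN_PYTHON)
    # shutil.which returns None when the executable is not on PATH.
    if which("ffmpeg") is None:
        problems.append("FFmpeg was not found on PATH")
    return problems


if __name__ == "__main__":
    for problem in check_environment():
        print("Missing requirement:", problem)
```

Returning a list of problems rather than raising on the first one lets a script report every missing requirement in a single run.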

Manual Installation Options

Option 1: Install from source

# Clone the repository
git clone https://github.com/csaicharan/yutrans.git
cd yutrans

# Create and activate a virtual environment (recommended)
python -m venv venv
# On Windows:
venv\Scripts\activate
# On Linux/Mac:
source venv/bin/activate

# Install the package
pip install -e .

Option 2: Install dependencies directly

# Create and activate a virtual environment (recommended)
python -m venv venv
# On Windows:
venv\Scripts\activate
# On Linux/Mac:
source venv/bin/activate

# Install dependencies
pip install -r requirements.txt

FFmpeg Installation

YUTRAS requires FFmpeg for audio processing. If you don't have it installed:

Windows:

  1. Download a Windows build of FFmpeg via ffmpeg.org (for example, one of the builds hosted at gyan.dev)
  2. Extract the downloaded archive (e.g., to C:\ffmpeg)
  3. Add the bin directory (e.g., C:\ffmpeg\bin) to your system's PATH environment variable
  4. Restart your terminal/command prompt

Linux:

sudo apt update
sudo apt install ffmpeg

macOS:

brew install ffmpeg

📊 Usage

Basic Usage

# Using the installed package
yutras "https://youtu.be/example"

# Or using the script directly
python translate.py "https://youtu.be/example"

Test Translation Without YouTube

To test just the translation functionality with a local audio file:

# Basic usage
python test_translation.py input_audio.wav output_audio.wav

# With options
python test_translation.py input_audio.wav output_audio.wav --lang fra --cpu --chunk-size 5

This is useful for debugging translation issues or working with local audio files.

Advanced Options

# Translate to a different language
yutras "https://youtu.be/example" --target_lang fra  # French

# Force CPU usage (more reliable but slower)
yutras "https://youtu.be/example" --cpu

# Specify output directory
yutras "https://youtu.be/example" -o my_translations

# Only download audio, skip translation
yutras "https://youtu.be/example" --skip_translation

# Adjust chunk size for processing long videos
yutras "https://youtu.be/example" --chunk_size 10
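
To show what chunked processing means in practice, here is a minimal sketch of splitting a long recording into fixed-length pieces. It assumes that --chunk_size is measured in seconds, which this README does not state, and the helper name `chunk_ranges` is hypothetical rather than part of YUTRAS.

```python
def chunk_ranges(total_samples, sample_rate, chunk_seconds):
    """Yield (start, end) sample indices that cover the audio in order.

    Assumption: the chunk size is a duration in seconds. The last chunk
    may be shorter than the others.
    """
    if sample_rate <= 0 or chunk_seconds <= 0:
        raise ValueError("sample_rate and chunk_seconds must be positive")
    step = int(sample_rate * chunk_seconds)
    if step <= 0:
        raise ValueError("chunk is shorter than one sample")
    for start in range(0, total_samples, step):
        yield start, min(start + step, total_samples)


if __name__ == "__main__":
    # 35 seconds of 16 kHz audio split into 10-second chunks.
    for start, end in chunk_ranges(35 * 16000, 16000, 10):
        print(start, end)
```

Each chunk can then be translated independently, so peak memory depends on the chunk length rather than on the length of the whole video.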

Checking CUDA Availability

To check if your system supports CUDA acceleration:

python check_cuda.py
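
A check of this kind can be written with PyTorch's standard CUDA queries. The sketch below is an assumption about what such a script might do, not the actual contents of check_cuda.py, and the function name `cuda_summary` is hypothetical.

```python
def cuda_summary():
    """Return a one-line description of CUDA availability."""
    try:
        import torch  # imported lazily so the check works without PyTorch
    except ImportError:
        return "PyTorch is not installed; cannot check CUDA"
    if not torch.cuda.is_available():
        return "CUDA not available; processing would run on the CPU"
    # Device 0 is the first visible GPU.
    return "CUDA available: " + torch.cuda.get_device_name(0)


if __name__ == "__main__":
    print(cuda_summary())
```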

🌐 Supported Languages

YUTRAS supports all languages available in the Seamless M4T model. Some common language codes:

  • eng: English
  • deu: German
  • fra: French
  • spa: Spanish
  • rus: Russian
  • cmn: Mandarin Chinese
  • jpn: Japanese

For a complete list, refer to the Seamless M4T documentation.

🛠️ Project Structure

yutras/
├── yutras/
│   ├── __init__.py          # Package initialization
│   ├── cli.py               # Command-line interface
│   ├── config.py            # Configuration management
│   ├── core/                # Core functionality
│   │   ├── __init__.py
│   │   └── pipeline.py      # Translation pipeline
│   ├── models/              # Model management
│   │   ├── __init__.py
│   │   └── seamless_m4t.py  # SeamlessM4T model wrapper
│   └── utils/               # Utility functions
│       ├── __init__.py
│       ├── audio.py         # Audio processing utilities
│       ├── download.py      # YouTube download utilities
│       └── system.py        # System utilities
├── translate.py             # Main script for YouTube to translated audio
├── test_translation.py      # Test script for direct audio translation
├── install.py               # Installation and environment validation
├── check_cuda.py            # CUDA availability checker
├── setup.py                 # Package setup script
├── requirements.txt         # Package dependencies
└── README.md                # This documentation

⚠️ Troubleshooting

Common Issues and Solutions

Translation not working

If translation is failing:

  1. Run the installation script first: python install.py
  2. Test with a local audio file: python test_translation.py input.wav output.wav --cpu
  3. Check logs for specific error messages
  4. Make sure your input audio file actually contains speech

Model loading errors

If you see errors during model loading:

  • Ensure you have a stable internet connection (model is downloaded from Hugging Face)
  • Try using --cpu flag to rule out GPU-related issues
  • Consider clearing Hugging Face cache: rm -rf ~/.cache/huggingface/hub (Linux/Mac) or delete the folder on Windows

"Out of memory" errors

  • Try using smaller chunks with --chunk_size 5
  • Use the --cpu flag to process on CPU (slower but more reliable)
  • Close other GPU-intensive applications
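
The same advice (smaller chunks first, then the CPU) can be automated. The sketch below is illustrative only: `translate_with_fallback` and the `translate(audio, chunk_size=..., device=...)` callable are hypothetical, and `MemoryError` stands in for whatever out-of-memory exception the real backend raises.

```python
def translate_with_fallback(translate, audio, chunk_size=10, min_chunk_size=1):
    """Retry `translate` with progressively safer settings.

    Tries the GPU with the requested chunk size, halves the chunk size
    after each out-of-memory failure, and finally tries the CPU.
    """
    attempts = []
    size = chunk_size
    while size >= min_chunk_size:
        attempts.append((size, "cuda"))
        size //= 2
    attempts.append((min_chunk_size, "cpu"))  # last resort: slow but safe

    last_error = None
    for size, device in attempts:
        try:
            return translate(audio, chunk_size=size, device=device)
        except MemoryError as error:  # placeholder for the backend's OOM error
            last_error = error
    raise RuntimeError("all fallback attempts ran out of memory") from last_error
```

With the default chunk_size of 10, the attempts are 10, 5, 2, and 1 on the GPU, followed by 1 on the CPU.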

Audio download issues

  • Ensure FFmpeg is properly installed and in your PATH
  • Check your internet connection
  • Verify the YouTube URL is valid and accessible
  • Try downloading the audio manually with yt-dlp first
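
For the manual download test, a small wrapper around the yt-dlp command line might look like this. It is a sketch under assumptions: the function names and the output template are hypothetical, and YUTRAS itself may call yt-dlp differently. The flags used (`--extract-audio`, `--audio-format`, `--output`) are standard yt-dlp options.

```python
import shutil
import subprocess


def build_ytdlp_command(url, output_template="audio.%(ext)s"):
    """Build a yt-dlp command that extracts the audio track as WAV."""
    return [
        "yt-dlp",
        "--extract-audio",          # keep only the audio track
        "--audio-format", "wav",    # converted by FFmpeg after download
        "--output", output_template,
        url,
    ]


def download_audio(url):
    """Run yt-dlp; raises if the tool is missing or the download fails."""
    if shutil.which("yt-dlp") is None:
        raise FileNotFoundError("yt-dlp was not found on PATH")
    subprocess.run(build_ytdlp_command(url), check=True)
```

If this command succeeds but YUTRAS still fails to download, the problem is more likely in the YUTRAS configuration than in the network or the video itself.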

Slow processing

  • Enable GPU acceleration by using an NVIDIA GPU with CUDA support (see "Requirements" and "Checking CUDA Availability")
  • If using CPU, be patient as translation can take significant time
  • Reduce chunk size for more reliable (but potentially slower) processing

Audio shape/format errors

  • Ensure your audio input is in a standard format (WAV is safest)
  • For best results, use mono audio at 16kHz
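
To confirm whether a WAV file already matches the recommended format, the standard-library `wave` module is enough. This is a minimal sketch; the helper names are hypothetical and it only handles uncompressed WAV files.

```python
import wave


def wav_format(path):
    """Return (channels, sample_rate_hz) for an uncompressed WAV file."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getnchannels(), wav_file.getframerate()


def is_mono_16k(path):
    """True when the file is mono at 16 kHz, as recommended above."""
    return wav_format(path) == (1, 16000)
```

A file that fails this check can usually be converted with FFmpeg, for example `ffmpeg -i input.wav -ac 1 -ar 16000 output.wav`, where `-ac 1` selects one channel and `-ar 16000` sets the sample rate.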

🀝 Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

📜 License

This project is licensed under the MIT License - see the LICENSE file for details.

🙏 Acknowledgments

About

YouTube video translation using SPST (outputs audio only)
