A Python-based web application that converts YouTube videos into text transcripts and generates concise summaries. Built with Streamlit and powered by OpenAI's Whisper for transcription and Hugging Face's transformers for summarization, this tool helps users quickly understand video content without watching the entire video.

YouTube Video Transcriber and Summarizer

A Streamlit web application that transcribes YouTube videos and summarizes the resulting text, using OpenAI's Whisper model and Hugging Face's transformers. The app supports two transcription methods: direct audio processing with Whisper, and retrieval of existing captions through youtube-transcript-api.

📋 Features

  • Dual Transcription Methods:
    • Primary: OpenAI's Whisper model for audio transcription
    • Fallback: youtube-transcript-api
  • Text Summarization: Generate concise summaries using Hugging Face transformers
  • Clean Interface: User-friendly UI built with Streamlit
  • Error Handling: Automatic fallback to youtube-transcript-api if Whisper transcription fails
  • Progress Tracking: Real-time status updates during processing

🚀 Quick Start

  1. Clone the repository:
git clone https://github.com/rajeshai/Youtube-Transcription.git
cd Youtube-Transcription
  2. Create and activate a virtual environment (recommended):
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
  3. Install required packages:
pip install -r requirements.txt
  4. Run the app:
streamlit run app.py
  5. Open your browser and go to http://localhost:8501

📦 Requirements

The requirements.txt file should list these dependencies (create the file if it is missing):

streamlit
openai-whisper
pytubefix
transformers
torch
youtube_transcript_api

💻 Usage

  1. Launch the application
  2. Enter a YouTube URL in either the Transcription or Summary tab
  3. Click the respective button to get either:
    • Full video transcription
    • Summarized content
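
This README does not describe how the app parses the URL entered in step 2. Since youtube-transcript-api works with a video ID rather than a full URL, a parsing step along these lines is probably needed somewhere. The following is a minimal sketch using only the standard library; `extract_video_id` is a hypothetical helper name, and the set of URL shapes it handles is an assumption, not a complete list.

```python
from typing import Optional
from urllib.parse import parse_qs, urlparse


def extract_video_id(url: str) -> Optional[str]:
    """Return the video ID from a few common YouTube URL shapes, or None.

    Hypothetical helper: the URL shapes handled here are an assumption.
    """
    parsed = urlparse(url.strip())
    host = (parsed.hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]

    if host == "youtu.be":
        # Short links: https://youtu.be/<id>
        return parsed.path.lstrip("/").split("/")[0] or None

    if host in ("youtube.com", "m.youtube.com"):
        if parsed.path == "/watch":
            # Standard links: https://www.youtube.com/watch?v=<id>
            return parse_qs(parsed.query).get("v", [None])[0]
        for prefix in ("/embed/", "/shorts/"):
            if parsed.path.startswith(prefix):
                return parsed.path[len(prefix):].split("/")[0] or None

    # Not a recognized YouTube URL
    return None
```

Returning `None` for unrecognized input lets the UI show a validation message before any download or transcript request is made.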

🔧 How It Works

  1. Transcription Process:

    • First attempts to download audio using pytubefix and transcribe with Whisper
    • If that fails, automatically switches to youtube-transcript-api
    • Shows clear status messages throughout the process
  2. Summarization Process:

    • Processes transcribed text using Hugging Face's summarization pipeline
    • Handles long transcripts by chunking text into manageable segments
    • Combines summaries for a coherent final output
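
The chunk-and-combine approach in the summarization process can be sketched as follows. This is an illustration under stated assumptions, not the app's actual code: `chunk_text` and `summarize_long_text` are hypothetical names, the word-based split and the 400-word default are assumptions, and the summarizer is passed in as a plain callable so the sketch does not depend on a particular model.

```python
from typing import Callable, List


def chunk_text(text: str, max_words: int = 400) -> List[str]:
    """Split text into consecutive chunks of at most max_words words.

    Assumption: splitting on whitespace; the real app may split by
    sentences or tokens instead.
    """
    words = text.split()
    return [
        " ".join(words[i:i + max_words])
        for i in range(0, len(words), max_words)
    ]


def summarize_long_text(
    text: str,
    summarize: Callable[[str], str],
    max_words: int = 400,
) -> str:
    """Summarize each chunk separately and join the partial summaries."""
    partial_summaries = [summarize(chunk) for chunk in chunk_text(text, max_words)]
    return " ".join(partial_summaries)


# One possible way to plug in a Hugging Face pipeline (left commented out
# because it downloads a model on first use):
#
#   from transformers import pipeline
#   summarizer = pipeline("summarization")
#   summary = summarize_long_text(
#       transcript,
#       lambda chunk: summarizer(chunk, truncation=True)[0]["summary_text"],
#   )
```

Chunking is needed because summarization models accept a limited input length. Simply joining the partial summaries is the plainest way to combine them; a second summarization pass over the joined text is another option when the result is still long.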

⚠️ Known Limitations

  • YouTube may occasionally block pytubefix and youtube-transcript-api requests
  • Some videos have no captions available, so the youtube-transcript-api fallback cannot return a transcript for them
  • Processing long videos may take additional time
  • Summarization quality depends on transcript accuracy
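
Because either transcription source can fail, the fallback order described under How It Works might be structured roughly like this. The sketch below assumes each method is a callable that takes a URL and returns text; `transcribe_with_fallback` and `TranscriptionError` are hypothetical names, and the real Whisper and youtube-transcript-api calls are left out so the control flow stays visible.

```python
from typing import Callable, List, Sequence, Tuple


class TranscriptionError(RuntimeError):
    """Raised when every transcription method has failed (hypothetical)."""


def transcribe_with_fallback(
    url: str,
    methods: Sequence[Tuple[str, Callable[[str], str]]],
) -> Tuple[str, str]:
    """Try each (name, method) pair in order.

    Returns (method_name, transcript) for the first method that yields
    non-empty text. Raises TranscriptionError listing every failure if
    none succeeds.
    """
    errors: List[str] = []
    for name, method in methods:
        try:
            text = method(url)
        except Exception as exc:  # any failure moves on to the next method
            errors.append(f"{name}: {exc}")
            continue
        if text and text.strip():
            return name, text
        errors.append(f"{name}: empty transcript")
    raise TranscriptionError("; ".join(errors))
```

Collecting the error from each attempt, rather than only the last one, makes it easier to tell a blocked download apart from a video that simply has no captions.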

🤝 Contributing

  1. Fork the repository
  2. Create your feature branch (git checkout -b feature/AmazingFeature)
  3. Commit your changes (git commit -m 'Add some AmazingFeature')
  4. Push to the branch (git push origin feature/AmazingFeature)
  5. Open a Pull Request

🙏 Acknowledgments


Note: Please ensure you comply with YouTube's terms of service when using this application.
