AudioProcess is a YouTube audio processing tool that downloads audio from YouTube videos, extracts subtitles, transcribes content, and generates summaries.
- YouTube Audio Downloading: Download audio from YouTube videos in WebM format
- Subtitle Extraction: Extract subtitles from YouTube videos when available
- Audio Transcription: Transcribe audio using Alibaba Cloud's speech recognition service
- Text Summarization: Generate summaries of subtitle/transcription content using large language models
- Telegram Bot Integration: Two Telegram bots for convenient audio downloading and content summarization
Requirements:
- Python 3.6+
- Required dependencies (see below)
Main dependencies include:
- yt-dlp (YouTube downloader)
- oss2 (Alibaba Cloud OSS)
- dashscope (Alibaba Cloud AI services)
- python-telegram-bot (Telegram bot integration)
- openai (API for summarization)
- httpx (with SOCKS proxy support)
- Clone the repository
- Install the required dependencies:
pip install -r requirements.txt
Before using the application, you need to set up your configuration:
- Configure API keys for cloud services in audioprocess/config/settings.py
- Set up your Telegram bot tokens (if using the Telegram bot features)
- Configure proxy settings if needed
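As a rough illustration of the configuration steps above, the values could be collected in one place and validated at startup. This is only a sketch: the environment variable names, the `load_settings` helper, and the returned keys are all hypothetical and are not taken from the project's actual settings.py.

```python
import os


def load_settings(env=os.environ):
    """Collect configuration values, failing early if a required key is missing.

    All variable names below are hypothetical placeholders.
    """
    required = ["DASHSCOPE_API_KEY", "OSS_ACCESS_KEY_ID", "OSS_ACCESS_KEY_SECRET"]
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError("Missing configuration: " + ", ".join(missing))

    # Bot tokens are optional: they are only needed for the Telegram bot features.
    raw_tokens = env.get("TELEGRAM_BOT_TOKENS", "")
    tokens = [t.strip() for t in raw_tokens.split(",") if t.strip()]

    return {
        "dashscope_api_key": env["DASHSCOPE_API_KEY"],
        "oss_access_key_id": env["OSS_ACCESS_KEY_ID"],
        "oss_access_key_secret": env["OSS_ACCESS_KEY_SECRET"],
        "telegram_bot_tokens": tokens,
        # An empty or absent proxy setting means "no explicit proxy".
        "proxy_url": env.get("PROXY_URL") or None,
    }
```

Reading secrets from the environment rather than hard-coding them keeps API keys out of version control, though the project itself may simply store them in settings.py.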
Use the start.sh script to start both bots (Audio Download Bot and Text Summary Bot):
./start.sh
This will launch:
- Audio Download Bot: Downloads audio from YouTube videos when given a URL
- Text Summary Bot: Extracts subtitles or transcribes audio from YouTube videos and generates summaries
You can also use the core functionality directly:
from audioprocess.main import process_youtube_video
# Process a YouTube video (extract subtitles/transcribe and summarize)
result = process_youtube_video("https://www.youtube.com/watch?v=VIDEO_ID")
Audio Download Bot:
- Accepts YouTube URLs
- Downloads audio in the best available quality
- Sends the audio file back to the user via Telegram
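The download step described above can be sketched with yt-dlp's Python API. This is a minimal example and not the project's youtube_downloader.py: the helper names `build_ydl_opts` and `download_audio`, the `downloads` directory, and the output filename template are assumptions made for illustration.

```python
import os


def build_ydl_opts(output_dir="downloads"):
    """Return yt-dlp options that prefer a WebM audio stream."""
    return {
        # Best audio-only WebM stream if one exists, otherwise the best audio stream.
        "format": "bestaudio[ext=webm]/bestaudio",
        # Name the file after the video ID, keeping the container's extension.
        "outtmpl": os.path.join(output_dir, "%(id)s.%(ext)s"),
        "noplaylist": True,
        "quiet": True,
    }


def download_audio(url, output_dir="downloads"):
    """Download the audio track for `url` and return the local file path."""
    import yt_dlp  # imported lazily so build_ydl_opts works without yt-dlp installed

    os.makedirs(output_dir, exist_ok=True)
    with yt_dlp.YoutubeDL(build_ydl_opts(output_dir)) as ydl:
        info = ydl.extract_info(url, download=True)
        return ydl.prepare_filename(info)
```

A bot handler could then call `download_audio(url)` and send the returned file back to the user.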
Text Summary Bot:
- Accepts YouTube URLs
- Extracts subtitles if available or downloads and transcribes the audio
- Generates and sends back a summary of the content
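The subtitles-first, transcription-as-fallback flow above might be organized roughly as follows. This is a sketch only: `summarize_video` and its three callable parameters are hypothetical stand-ins for the project's subtitle extraction, transcription, and summarization modules, whose real signatures are not shown here.

```python
def summarize_video(url, fetch_subtitles, transcribe_audio, summarize):
    """Prefer existing subtitles; fall back to transcription, then summarize.

    The three callables are hypothetical:
      fetch_subtitles(url)  -> subtitle text, or None/"" if unavailable
      transcribe_audio(url) -> transcribed text
      summarize(text)       -> summary text
    """
    text = fetch_subtitles(url)
    source = "subtitles"

    # Only pay for download + speech recognition when no subtitles exist.
    if not text or not text.strip():
        text = transcribe_audio(url)
        source = "transcription"

    if not text or not text.strip():
        raise ValueError("No subtitle or transcription text available for " + url)

    return {"source": source, "summary": summarize(text)}
```

Passing the steps in as callables keeps the control flow easy to test without network access or cloud credentials.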
- audioprocess/core/: Core functionality modules
  - youtube_downloader.py: Audio downloading from YouTube
  - subtitle_extractor.py: YouTube subtitle extraction
  - transcription.py: Audio transcription
  - summarization.py: Text summarization
  - oss_uploader.py: File uploading to Alibaba Cloud OSS
- audioprocess/scripts/: Bot and utility scripts
  - start_audio_bot.py: Audio Download Bot script
  - start_summary_bot.py: Text Summary Bot script
- audioprocess/utils/: Utility functions
- audioprocess/config/: Configuration files
- The system can use system-defined proxies or a default proxy if needed
- For SOCKS proxies, ensure httpx[socks] is installed
- Verify your bot tokens are correct
- Ensure your user ID is in the allowed users list if access is restricted
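The proxy behaviour noted above (use a system-defined proxy, otherwise a default) could be resolved along these lines. This is a sketch using only the standard library; `resolve_proxy` and the example default address are hypothetical, and the project's own lookup logic may differ.

```python
import urllib.request


def resolve_proxy(default_proxy=None, system_proxies=None):
    """Return a system-defined proxy URL if one exists, else `default_proxy`.

    `system_proxies` can be injected for testing; by default it is read from
    the environment / OS settings via urllib.request.getproxies().
    """
    if system_proxies is None:
        system_proxies = urllib.request.getproxies()

    # Prefer an HTTPS proxy, then HTTP, then a catch-all (e.g. from ALL_PROXY).
    for scheme in ("https", "http", "all"):
        proxy = system_proxies.get(scheme)
        if proxy:
            return proxy
    return default_proxy


if __name__ == "__main__":
    # Hypothetical default; replace with whatever your setup uses.
    print(resolve_proxy(default_proxy="socks5://127.0.0.1:1080"))
```

If the resolved URL uses a SOCKS scheme, the HTTP client needs SOCKS support, which is why httpx[socks] must be installed.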
This project is for personal use.
Developed by CC.