
Your AI-powered Leadership Development Companion
A comprehensive system for creating an intelligent leadership coach powered by YouTube knowledge and web search capabilities
Leadership Coach AI is a sophisticated AI system that creates a searchable knowledge base from leadership-focused YouTube videos. It uses advanced natural language processing to provide personalized leadership coaching, advice, and training based on curated content.
The system combines audio-based transcription, semantic search, and intelligent response generation to deliver an interactive coaching experience with referenced answers. All components are built for robustness and multilingual support, with a special focus on Turkish language processing.
- Specialized Knowledge Base: Creates an intelligent knowledge base from leadership-focused YouTube videos
- Semantic Search: Uses vector embeddings for accurate semantic retrieval of relevant content
- Web Search Integration: Supplements knowledge base with current web information when needed
- Audio-based Transcription: Downloads and transcribes any YouTube video using Whisper
- Grammar Enhancement: Improves transcript quality using OpenAI language models
- Referenced Responses: All responses include sources and citations for verification
- Context-Aware Answers: Intelligently combines information from multiple sources
- Voice Output: Converts text responses to natural-sounding speech
- Robust Architecture: Comprehensive error handling and fallback mechanisms
- Detailed Logging: Complete logging system for monitoring and debugging
- Customization: Adjustable parameters for search depth, response generation, and more
- Language Optimization: Special focus on Turkish language with proper character support
The knowledge base creation process starts with YouTube videos and ends with searchable vector embeddings:
┌───────────────────┐     ┌───────────────────┐     ┌───────────────────┐
│                   │     │                   │     │                   │
│  YouTube          │     │  yt-dlp           │     │  Audio Files      │
│  Videos/Playlist  ├────►│  Downloader       ├────►│  (.mp3)           │
│                   │     │                   │     │                   │
└───────────────────┘     └───────────────────┘     └─────────┬─────────┘
                                                              │
                                                              ▼
┌───────────────────┐     ┌───────────────────┐     ┌───────────────────┐
│                   │     │                   │     │                   │
│  Raw JSON         │     │  Whisper Model    │     │  FFmpeg           │
│  Transcripts      │◄────┤  Transcription    │◄────┤  Audio            │
│                   │     │                   │     │  Processing       │
└─────────┬─────────┘     └───────────────────┘     └───────────────────┘
          │
          ▼
┌───────────────────┐     ┌───────────────────┐     ┌───────────────────┐
│                   │     │                   │     │                   │
│  Transcript       │     │  OpenAI LLM       │     │  Text Processing  │
│  Chunks           ├────►│  Grammar          ├────►│  & Cleaning       │
│                   │     │  Improvement      │     │                   │
└─────────┬─────────┘     └───────────────────┘     └───────────────────┘
          │
          ▼
┌───────────────────┐     ┌───────────────────┐     ┌───────────────────┐
│                   │     │                   │     │                   │
│  Improved         │     │  OpenAI           │     │  Vector           │
│  Chunks JSON      ├────►│  Embedding        ├────►│  Store            │
│                   │     │  API              │     │  Database         │
└───────────────────┘     └───────────────────┘     └───────────────────┘
Key Components:
- YouTube Extraction: `YouTubeExtractor` class processes YouTube videos/playlists:
  - Uses `yt-dlp` to download audio and extract metadata
  - Processes videos in batches with parallel processing
  - Saves audio files to the `data/audio` directory
- Audio Transcription: Whisper-based transcription:
  - Transcribes audio using OpenAI's Whisper model
  - Creates timestamp-aligned transcript segments
  - Divides content into logical chunks
- Text Enhancement: `ChunkProcessor` improves transcript quality:
  - Corrects grammar and formatting issues using LLMs
  - Optimizes Turkish language content
  - Preserves original source references and timestamps
- Vector Embedding: `VectorStore` creates searchable embeddings:
  - Uses OpenAI's text-embedding-3-small model for embedding
  - Creates efficient vector representations
  - Stores metadata for source attribution
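
As a rough illustration of the "divides content into logical chunks" step, the sketch below groups timestamped transcript segments into chunks of bounded length while keeping each chunk's start and end times. The `Segment` shape (`start`, `end`, `text`) resembles what Whisper-style transcribers commonly return, but the function name `chunk_segments` and the `max_chars` limit are assumptions made for illustration, not the project's actual implementation.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    start: float  # seconds from the start of the audio
    end: float
    text: str


def _merge(segments):
    """Collapse a run of segments into one chunk dictionary."""
    return {
        "start": segments[0].start,
        "end": segments[-1].end,
        "text": " ".join(s.text for s in segments),
    }


def chunk_segments(segments, max_chars=500):
    """Group consecutive segments into chunks of roughly max_chars characters.

    Each chunk keeps the start time of its first segment and the end time of
    its last one, so an answer can later cite a timestamp in the source video.
    A single segment longer than max_chars still becomes its own chunk.
    """
    chunks = []
    current = []
    current_len = 0
    for seg in segments:
        text = seg.text.strip()
        if not text:
            continue  # skip empty segments
        # Start a new chunk when adding this segment would exceed the limit.
        if current and current_len + len(text) + 1 > max_chars:
            chunks.append(_merge(current))
            current, current_len = [], 0
        current.append(Segment(seg.start, seg.end, text))
        current_len += len(text) + 1  # +1 for the joining space
    if current:
        chunks.append(_merge(current))
    return chunks
```

Keeping `start` and `end` on every chunk is what makes timestamped source references possible later in the pipeline.
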
The application flow shows how user queries are processed and answered:
┌───────────────────┐     ┌───────────────────┐     ┌───────────────────┐
│                   │     │                   │     │                   │
│  User Query       ├────►│  Query            ├─┬──►│  Vector Store     │
│  (Streamlit UI)   │     │  Processor        │ │   │  Search           │
│                   │     │                   │ │   │                   │
└───────────────────┘     └───────────────────┘ │   └─────────┬─────────┘
                                                │             │
                                                │             ▼
                                                │   ┌───────────────────┐
                                                │   │                   │
                                                │   │  Knowledge Base   │
                                                │   │  Results          │
                                                │   │                   │
                                                │   └─────────┬─────────┘
                                                │             │
                          ┌───────────────────┐ │             │
                          │                   │ │             │
                          │  Result Quality   │ │             │
                          │  Check            │◄──────────────┤
                          │                   │ │             │
                          └─────────┬─────────┘ │             │
                                    │           │             │
                                    ▼           │             │
┌───────────────────┐     ┌───────────────────┐ │             │
│                   │     │                   │ │             │
│  Web Search       │◄────┤  Need More        │◄┘             │
│  (If Needed)      │     │  Information?     │               │
│                   │     │                   │               │
└─────────┬─────────┘     └─────────┬─────────┘               │
          │                         │                         │
          ▼                         ▼                         │
┌───────────────────┐     ┌───────────────────┐               │
│                   │     │                   │               │
│  Web Results      ├────►│  Context          │◄──────────────┘
│  (Optional)       │     │  Integration      │
│                   │     │                   │
└───────────────────┘     └─────────┬─────────┘
                                    │
                                    ▼
┌───────────────────┐     ┌───────────────────┐     ┌───────────────────┐
│                   │     │                   │     │                   │
│  OpenAI           │◄────┤  Response         ├────►│  Source           │
│  GPT-4o-mini      │     │  Generation       │     │  Attribution      │
│                   │     │                   │     │                   │
└─────────┬─────────┘     └───────────────────┘     └───────────────────┘
          │
          ▼
┌───────────────────┐     ┌───────────────────┐     ┌───────────────────┐
│                   │     │                   │     │                   │
│  Response         ├────►│  Text-to-Speech   ├────►│  Final Response   │
│  Text             │     │  (Optional)       │     │  with Audio       │
│                   │     │                   │     │                   │
└───────────────────┘     └───────────────────┘     └───────────────────┘
Key Components:
- User Interface: Built with Streamlit:
  - Chat-based interface for natural interactions
  - Dynamic settings for customization
  - Real-time knowledge base status monitoring
- Query Processing: `QueryProcessor` coordinates the retrieval and response:
  - Analyzes query intent and context
  - Searches knowledge base with semantic matching
  - Determines if web search is needed based on result quality
- Multi-Source Retrieval: Combines information from multiple sources:
  - Knowledge base vectors provide curated content
  - Web search supplements with up-to-date information
  - Results are ranked by relevance and reliability
- Response Generation: `OpenAIService` creates coherent responses:
  - Uses GPT-4o-mini for natural language generation
  - Integrates context from multiple sources
  - Includes source attribution for transparency
  - Optimized prompting for Turkish language responses
- Voice Output: `TextToSpeech` converts text to audio:
  - Processes responses with proper pronunciation
  - Handles long-form content by chunking
  - Creates embedded audio players in the UI
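
The step that decides whether web search is needed could be as simple as a threshold on similarity scores. In the sketch below, `needs_web_search`, `build_context`, `min_score`, and `min_hits` are hypothetical names and values chosen for illustration; the project's real criteria are not documented here.

```python
def needs_web_search(scores, min_score=0.75, min_hits=2):
    """Return True when too few knowledge-base hits look relevant.

    `scores` holds similarity scores (higher is better) for the retrieved
    chunks. Both thresholds are illustrative assumptions.
    """
    strong_hits = sum(1 for s in scores if s >= min_score)
    return strong_hits < min_hits


def build_context(kb_results, web_results, scores, web_enabled=True):
    """Combine knowledge-base results with web results only when needed."""
    context = list(kb_results)
    if web_enabled and needs_web_search(scores):
        context.extend(web_results)
    return context
```

With these defaults, two or more strong knowledge-base hits keep the answer grounded in curated content only, and anything weaker pulls in web results if the user has enabled them.
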
The system uses a modular architecture where components interact through well-defined interfaces:
Component | Purpose | Implementation |
---|---|---|
YouTubeExtractor | Downloads and transcribes content | Uses yt-dlp and Whisper |
ChunkProcessor | Improves transcript quality | LLM-based grammar enhancement |
VectorStore | Enables semantic search | OpenAI embeddings with cosine similarity |
OpenAIClient | Centralizes API access | Robust error handling and retries |
QueryProcessor | Coordinates response generation | Knowledge retrieval and ranking |
WebSearch | Supplements knowledge base | Multi-engine search with fallbacks |
TextToSpeech | Provides voice output | gTTS with audio processing |
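
The table lists cosine similarity as the matching method for `VectorStore`. The following standard-library sketch shows the idea on plain Python lists; the `(metadata, vector)` entry layout and the function names are assumptions for illustration, and a real store would likely use a numerical library for speed.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat zero vectors as unrelated
    return dot / (norm_a * norm_b)


def top_k(query_vec, entries, k=3):
    """Return the k entries most similar to query_vec, best first.

    `entries` is a list of (metadata, vector) pairs; the result is a list
    of (score, metadata) pairs.
    """
    scored = [(cosine_similarity(query_vec, vec), meta) for meta, vec in entries]
    # Sort on the score only, so metadata never needs to be comparable.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]
```
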
- Python 3.11+
- FFmpeg (for audio processing)
- OpenAI API key
- YouTube videos/playlist with leadership content
- Clone the repository: `git clone https://github.com/Daymenion/leadership-coach-ai.git`, then `cd leadership-coach-ai`
- Install Python dependencies: `pip install -r requirements.txt`
- Install FFmpeg:
  - Windows: Download from FFmpeg.org and add to PATH
  - macOS: `brew install ffmpeg`
  - Linux (Debian/Ubuntu): `sudo apt install ffmpeg`
- Create a `.env` file with your API keys:

      OPENAI_API_KEY=your_openai_api_key_here
      GOOGLE_SEARCH_API_KEY=your_google_api_key_here       # Optional
      GOOGLE_SEARCH_CX=your_google_custom_search_id_here   # Optional
      BING_SEARCH_API_KEY=your_bing_api_key_here           # Optional
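
Before running the scripts, it can help to confirm which keys are actually set. The sketch below assumes the variables have already been loaded into the process environment (for example by a `.env` loader); the helper name `check_api_keys` is hypothetical and not part of the project.

```python
import os

REQUIRED_KEYS = ["OPENAI_API_KEY"]
OPTIONAL_KEYS = [
    "GOOGLE_SEARCH_API_KEY",
    "GOOGLE_SEARCH_CX",
    "BING_SEARCH_API_KEY",
]


def check_api_keys(env=None):
    """Return (missing_required, missing_optional) lists of variable names.

    An empty string counts as missing. Pass a dict as `env` to check
    something other than the real process environment.
    """
    env = os.environ if env is None else env
    missing_required = [k for k in REQUIRED_KEYS if not env.get(k)]
    missing_optional = [k for k in OPTIONAL_KEYS if not env.get(k)]
    return missing_required, missing_optional
```

Calling `check_api_keys()` with no arguments inspects the current environment. A non-empty first list means the OpenAI key is missing, while a non-empty second list only means that the corresponding web search keys are not configured.
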
Use `init_knowledge_base.py` to create the knowledge base from YouTube videos:
# From a YouTube playlist
python init_knowledge_base.py --playlist "YOUR_PLAYLIST_ID"
# From specific videos
python init_knowledge_base.py --videos XXXXXXXXXXX YYYYYYYYYYY ZZZZZZZZZZZ
# Configure processing options
python init_knowledge_base.py --playlist "YOUR_PLAYLIST_ID" --max-videos 10
# Skip transcript extraction if already available
python init_knowledge_base.py --playlist "YOUR_PLAYLIST_ID" --skip-transcription
# Skip grammar correction for faster processing but lower quality
python init_knowledge_base.py --playlist "YOUR_PLAYLIST_ID" --skip-grammar
# Change the logging level
python init_knowledge_base.py --playlist "YOUR_PLAYLIST_ID" --log-level INFO
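
For readers curious how such a command line might be wired up, here is a minimal `argparse` sketch that accepts the flags shown above. It is an assumption about the interface's shape, not the contents of `init_knowledge_base.py`; the defaults, help strings, log-level choices, and the rule that exactly one of `--playlist` or `--videos` is given are all invented for illustration.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser(
        description="Build a knowledge base from YouTube videos (illustrative sketch)."
    )
    # Assumption: exactly one video source must be supplied.
    source = parser.add_mutually_exclusive_group(required=True)
    source.add_argument("--playlist", help="YouTube playlist ID")
    source.add_argument("--videos", nargs="+", help="One or more YouTube video IDs")
    parser.add_argument("--max-videos", type=int, default=None,
                        help="Upper bound on the number of videos to process")
    parser.add_argument("--skip-transcription", action="store_true",
                        help="Reuse transcripts that already exist")
    parser.add_argument("--skip-grammar", action="store_true",
                        help="Skip LLM grammar correction (faster, lower quality)")
    parser.add_argument("--log-level", default="INFO",
                        choices=["DEBUG", "INFO", "WARNING", "ERROR"],
                        help="Logging verbosity")
    return parser
```

Parsing `["--playlist", "YOUR_PLAYLIST_ID", "--skip-grammar"]` with this parser yields a namespace whose `skip_grammar` attribute is `True` and whose `max_videos` is `None`.
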
- Start the Streamlit application: `streamlit run app.py`
- Open the provided URL in your browser (typically http://localhost:8501)
- If running for the first time without vector embeddings:
  - Click "Advanced Settings" in the sidebar
  - Click "Rebuild Knowledge Base" and wait for initialization
- Start asking leadership questions!
The system is designed to answer questions about:
- Leadership principles and techniques
- Team management strategies
- Professional development
- Organizational behavior
- Change management
- Communication skills
Example questions:
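- "Etkili bir lider nasıl olunur?" (How does one become an effective leader?)
- "Ekip motivasyonunu artırmanın yolları nelerdir?" (What are some ways to increase team motivation?)
- "İş yerinde çatışma yönetimi için öneriler verebilir misin?" (Can you give suggestions for managing conflict in the workplace?)
- "Değişim yönetimi sürecinde nelere dikkat edilmeli?" (What should be kept in mind during a change management process?)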
The sidebar provides several options to customize your experience:
- Knowledge Base Results: Control how many results to retrieve from the knowledge base
- Web Search: Enable/disable web search capabilities
- Voice Output: Toggle text-to-speech functionality
- Temperature: Adjust creativity in responses (lower = more deterministic)
- Max Tokens: Set maximum length for responses
Each response includes:
- A detailed answer to your question
- Source references (YouTube videos with timestamps and/or web pages)
- Optional voice output (if enabled)
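
A source reference with a timestamp can be rendered as a link that opens the video at the cited moment. The sketch below relies on YouTube's `t=<seconds>s` URL parameter; the function name `format_source` and its inputs (`video_id`, `title`, `start_seconds`) are assumptions about how a chunk's metadata might be stored, not the project's actual fields.

```python
def format_source(video_id, title, start_seconds):
    """Build a human-readable citation with a timestamped YouTube link."""
    seconds = int(start_seconds)  # drop fractional seconds
    minutes, secs = divmod(seconds, 60)
    hours, minutes = divmod(minutes, 60)
    if hours:
        stamp = f"{hours}:{minutes:02d}:{secs:02d}"
    else:
        stamp = f"{minutes}:{secs:02d}"
    url = f"https://www.youtube.com/watch?v={video_id}&t={seconds}s"
    return f"{title} ({stamp}) - {url}"
```
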
The project includes comprehensive test scripts:
# Test overall system components
python test_system.py
# Test OpenAI API connectivity
python tests/test_api_connectivity.py
# Test grammar correction
python tests/test_grammar_correction.py --samples
# Test audio transcription
python tests/test_transcript_extraction.py --check-system
/
├── app.py                              # Main Streamlit application
├── init_knowledge_base.py              # Knowledge base initialization script
├── test_system.py                      # System testing script
├── requirements.txt                    # Python dependencies
├── .env.sample                         # Example environment variables
├── logs/                               # Log files
├── tests/                              # Test scripts
│   ├── test_api_connectivity.py        # API connectivity tests
│   ├── test_grammar_correction.py      # Grammar correction tests
│   ├── test_transcript_extraction.py   # Transcription tests
│   └── ...                             # Additional test scripts
├── src/
│   ├── ai_engine/                      # AI response generation components
│   │   ├── openai_service.py           # OpenAI integration
│   │   ├── query_processor.py          # Query processing
│   │   └── web_search.py               # Web search integration
│   ├── knowledge_base/                 # Knowledge base components
│   │   ├── chunk_processor.py          # Grammar improvement and chunk processing
│   │   ├── vector_store.py             # Vector embeddings and search
│   │   └── youtube_extractor.py        # YouTube playlist/video processing
│   │
│   ├── audio/                          # Audio processing
│   │   └── text_to_speech.py           # TTS functionality
│   └── utils/                          # Utilities
│       ├── helpers.py                  # Helper functions
│       └── openai_client.py            # Centralized OpenAI client
│
└── data/                               # Data storage (created during initialization)
    ├── audio/                          # Downloaded audio files
    ├── chunks/                         # Processed transcript chunks
    ├── vector_store/                   # Vector embeddings
    └── logs/                           # Conversation logs
- Update yt-dlp: `pip install -U yt-dlp`
- Check video availability in your region
- Verify internet connection and YouTube access
- Verify FFmpeg installation
- Check audio file quality
- Try different videos or smaller segments
- Verify API key is correct and has sufficient credits
- Check internet connectivity
- Run `python tests/test_api_connectivity.py` to diagnose
- Ensure sufficient disk space
- Check file permissions in data directories
- Try rebuilding knowledge base from sidebar menu
- Enable voice output in settings
- Ensure browser allows audio playback
- Verify required audio libraries are installed
To focus on specific leadership topics:
- Select YouTube videos that focus on your areas of interest
- Process them using `init_knowledge_base.py`
- Use the web search feature to supplement with up-to-date information
The modular architecture allows for straightforward integration:
- Import `QueryProcessor` from `src.ai_engine.query_processor` for NLP capabilities
- Use `VectorStore` from `src.knowledge_base.vector_store` for semantic search
- Leverage `OpenAIClient` from `src.utils.openai_client` for API access
For larger knowledge bases:
- Use the `--skip-grammar` flag during initialization for faster processing
- Adjust the number of knowledge base results in settings to optimize response time
- Consider using a more powerful embedding model for increased search accuracy
This project is licensed under the MIT License - see the LICENSE file for details.
- OpenAI for providing Whisper and GPT models
- yt-dlp for YouTube downloading capabilities
- Streamlit for the user interface framework
- gTTS for text-to-speech functionality
- All contributors who helped improve this project
This project is open to new features and improvements. Please check the repository for updates and submit issues or feature requests through GitHub.
Developed by Daymenion with ❤️