A comprehensive system for fine-tuning AI models on Islamic knowledge from the Quran and authentic Hadiths.
Project overview: A detailed overview of the project highlighting the goals and purpose of it.
A user-friendly web interface for training AI models on Islamic knowledge. (Backend focued only)
chmod +x run_gradio.sh
./run_gradio.sh
# Install requirements
pip3 install -r requirements.txt
# Set OpenAI API key (optional for data management)
export OPENAI_API_KEY='your-openai-api-key'
# Launch interface
python3 launch_gradio.py
- File Upload: Upload JSON and TXT files for training data
- Web Scraping: Extract content from websites automatically
- Manual Entry: Add training examples through web forms
- Sample Generation: Create sample Islamic Q&A data
- Real-time Stats: View training data statistics
- Data Validation: Check format compatibility
- Train/Validation Split: Prepare data for training
- Export Options: Download data as CSV
- One-click Training: Start fine-tuning with OpenAI
- Progress Monitoring: Check training job status
- Model Management: List and organize trained models
- Interactive Testing: Test models with custom questions
- Islamic Knowledge: Specialized for Quran and Hadith
- Response Evaluation: Review model performance
- Upload JSON/TXT files
- Scrape websites for content
- Add manual training examples
- Generate sample data
- View current data overview
- Split data for training/validation
- Validate data format
- Export data to CSV
- Start fine-tuning process
- Monitor training progress
- Check job status
- List available models
- Test models interactively
- Evaluate responses
The web scraper can:
- Extract content from any website
- Handle multiple pages
- Clean and format text
- Save content to files
- Respect website policies with delays
- HTML pages
- Articles and blog posts
- Islamic educational content
- Q&A websites
- Documentation sites
[
{
"question": "What are the five pillars of Islam?",
"answer": "The five pillars are...",
"source": "Sahih al-Bukhari",
"reference": "8",
"category": "Pillars of Islam"
}
]
- Automatically saved to
scraped_content/
directory - Manual processing required for Q&A extraction
- Can be used as reference material
OPENAI_API_KEY
: Required for training featuresGRADIO_SERVER_PORT
: Custom port (default: 7860)
project/
βββ data/ # Training data files
βββ scraped_content/ # Web scraped content
βββ logs/ # Training logs
βββ models/ # Model information
βββ gradio_app.py # Main interface
- Respectful Scraping: Includes delays between requests
- Content Validation: Manual review recommended
- Islamic Authenticity: Verify all religious content
- Privacy: No data sent to external services (except OpenAI for training)
-
Port Already in Use
# Change port in launch_gradio.py interface.launch(server_port=7861)
-
Missing Dependencies
pip33 install -r requirements.txt
-
OpenAI API Issues
- Verify API key is set correctly
- Check API quota and billing
- Ensure key has fine-tuning permissions
-
Web Scraping Blocked
- Some sites block automated requests
- Try different URLs or manual content entry
- Respect robots.txt and terms of service
For issues or questions:
- Check the console output for error messages
- Verify all requirements are installed
- Ensure OpenAI API key is valid
- Review scraped content for quality
May this tool serve the Ummah in spreading authentic Islamic knowledge through AI.