In today's fast-paced digital marketplace, finding genuine deals among thousands of daily offers is like finding a needle in a haystack. Traditional deal-hunting approaches face several critical limitations:
- Information Overload: Deal sites publish hundreds of offers daily across multiple categories
- Price Validation Challenge: It's nearly impossible to manually verify if a "deal" price represents genuine value
- Time Intensive: Manually scanning, evaluating, and comparing deals is extremely time-consuming
- Missed Opportunities: Great deals often expire quickly before discovery
- Subjective Evaluation: Human bias affects deal assessment and can lead to poor purchasing decisions
The AI Deal Agent Framework revolutionizes deal discovery by deploying a sophisticated multi-agent AI system that:
🔍 Automatically Discovers deals from multiple RSS feeds across various product categories
🧠 Intelligently Evaluates prices using ensemble machine learning models and fine-tuned LLMs
⚡ Instantly Alerts users when genuine opportunities are identified (>$50 savings)
📱 Delivers Notifications via SMS/WhatsApp for immediate action
🎯 Eliminates False Positives through rigorous AI-powered price validation
- Ensemble Price Intelligence: Combines 3 different AI models (fine-tuned LLM, RAG with vector similarity, Random Forest) for robust price estimation
- Real-time Deal Curation: Uses GPT-4o-mini with structured outputs to filter and summarize only high-quality deals
- Duplicate Prevention: Memory system ensures users never receive alerts for previously seen deals
- Scalable Architecture: Multi-agent design allows easy addition of new deal sources and pricing models
This system transforms deal hunting from a manual, time-intensive process into an automated, intelligent service that works 24/7 to identify genuine opportunities, allowing users to make informed purchasing decisions without the research overhead.
The AI Deal Agent Framework is a sophisticated multi-agent system designed to automatically discover, evaluate, and alert users about lucrative deals from various online sources. The system combines multiple AI models, machine learning techniques, and real-time data processing to identify opportunities where products are priced significantly below their estimated market value.
The AI Deal Agent Framework operates through a sophisticated 7-step workflow that runs continuously to identify lucrative deals:
The Scanner Agent initiates the process by:
- Multi-Source Scraping: Monitors 5 RSS feeds from DealNews covering:
  - Electronics (/c142/Electronics/)
  - Computers (/c39/Computers/)
  - Automotive (/c238/Automotive/)
  - Smart Home (/f1912/Smart-Home/)
  - Home & Garden (/c196/Home-Garden/)
- Content Extraction: For each RSS entry:
  - Fetches the full deal page using HTTP requests
  - Extracts detailed product information using BeautifulSoup
  - Parses title, summary, features, and pricing details
  - Cleans HTML content and normalizes text formatting
- Memory Filtering: Compares scraped deals against memory.json to avoid duplicate processing
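The memory filtering step can be sketched as a simple set lookup. This is a minimal illustration, assuming each scraped deal is a dictionary with a `url` key and that previously seen URLs are available as a set; the function name `filter_new_deals` and the example URLs are hypothetical, not taken from the project.

```python
def filter_new_deals(scraped, seen_urls):
    """Keep only scraped deals whose URL has not been processed before."""
    return [deal for deal in scraped if deal["url"] not in seen_urls]


# Illustrative data only; real entries come from the RSS feeds.
scraped = [
    {"title": "Example TV", "url": "https://example.com/deal/1"},
    {"title": "Example laptop", "url": "https://example.com/deal/2"},
]
seen_urls = {"https://example.com/deal/1"}

new_deals = filter_new_deals(scraped, seen_urls)
print([deal["title"] for deal in new_deals])
```

Using a set keeps each membership check constant-time, which matters little for a few hundred deals but avoids rescanning the whole history for every entry.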
The Scanner Agent uses OpenAI GPT-4o-mini with Structured Outputs to:
- Quality Assessment: Evaluates deals based on:
  - Description detail and clarity (4-5 sentence minimum)
  - Price clarity and confidence (must be explicit, not "% off")
  - Product specificity (avoids vague descriptions)
- Content Standardization:
  - Rephrases descriptions to focus on product features, not deal terms
  - Extracts numerical prices from various formats
  - Filters out deals with unclear or missing pricing
- Top 5 Selection: Returns the 5 most promising deals with detailed descriptions
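A structured-output response needs a schema for the model to fill in. The sketch below uses plain dataclasses to show one plausible shape for a curated deal and a post-check that enforces the "explicit price" and "top 5" rules. The class name `CuratedDeal`, its fields, and `select_top_deals` are assumptions for illustration; the project itself may define this differently (for example with Pydantic models).

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class CuratedDeal:
    product_description: str
    price: Optional[float]  # None when no explicit price could be extracted
    url: str


def select_top_deals(candidates: List[CuratedDeal], limit: int = 5) -> List[CuratedDeal]:
    """Drop deals without a positive explicit price, then keep at most `limit`."""
    priced = [deal for deal in candidates if deal.price is not None and deal.price > 0]
    return priced[:limit]


# Illustrative data only.
candidates = [
    CuratedDeal("Example noise-cancelling headphones with a 30-hour battery.", 129.99,
                "https://example.com/deal/1"),
    CuratedDeal("Example vacuum advertised only as 40% off.", None,
                "https://example.com/deal/2"),
]
selected = select_top_deals(candidates)
print([deal.url for deal in selected])
```

The second candidate is rejected because a percentage discount alone gives no price to compare against an estimate.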
The Ensemble Agent coordinates three independent pricing models for robust estimation:
Specialist Agent (fine-tuned LLM)
- Model: Llama 3.1 8B fine-tuned specifically for pricing
- Hosting: Modal cloud with GPU acceleration and 4-bit quantization
- Approach: Domain-specific price prediction based on product descriptions
- Strengths: Deep understanding of product value and market context
Frontier Agent (RAG)
- Vector Search: Uses ChromaDB with sentence transformer embeddings
- Context Retrieval: Finds 5 most similar products from training data
- LLM Integration: OpenAI/DeepSeek with retrieved context for informed pricing
- Strengths: Leverages similar product comparisons for accurate estimates
Random Forest Agent
- Model: scikit-learn Random Forest trained on vectorized descriptions
- Features: Sentence transformer embeddings (all-MiniLM-L6-v2)
- Approach: Statistical pattern recognition from product text
- Strengths: Baseline ML reliability and fast inference
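The context retrieval idea (find the 5 most similar products, then hand them to an LLM) can be illustrated without a vector database. The sketch below ranks a tiny in-memory catalog by cosine similarity; the metric, the three-dimensional toy vectors, and the names `cosine_similarity` and `most_similar` are assumptions for illustration. In the real system the vector store performs this search internally with its own index and distance metric.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar(query, catalog, k=5):
    """Return the k catalog entries whose vectors are closest to the query."""
    ranked = sorted(
        catalog,
        key=lambda item: cosine_similarity(query, item["vector"]),
        reverse=True,
    )
    return ranked[:k]


# Toy embeddings; real ones would come from a sentence transformer.
catalog = [
    {"name": "wireless mouse", "price": 25.0, "vector": [0.9, 0.1, 0.0]},
    {"name": "gaming mouse", "price": 60.0, "vector": [0.8, 0.2, 0.1]},
    {"name": "garden hose", "price": 30.0, "vector": [0.0, 0.1, 0.9]},
]
neighbours = most_similar([1.0, 0.0, 0.0], catalog, k=2)
print([item["name"] for item in neighbours])
```

The names and prices of the retrieved neighbours are what would be placed in the prompt as pricing context.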
The Ensemble Agent combines individual predictions using:
- Linear Regression: Trained weights for optimal model combination
- Statistical Features: Min, max, and individual predictions as inputs
- Robust Output: Weighted average that leverages each model's strengths
- Validation: Ensures non-negative price estimates
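The combination step above can be sketched as a linear model over five features: the three individual predictions plus their minimum and maximum. The weights and intercept below are hypothetical placeholders chosen for readability; in practice they would come from fitting a linear regression on held-out predictions. The function name `combine_predictions` is also an assumption.

```python
def combine_predictions(specialist, frontier, random_forest, weights, intercept):
    """Linear combination over the three predictions plus their min and max."""
    features = [
        specialist,
        frontier,
        random_forest,
        min(specialist, frontier, random_forest),
        max(specialist, frontier, random_forest),
    ]
    estimate = intercept + sum(w * x for w, x in zip(weights, features))
    # Validation: a price estimate is never allowed to go below zero.
    return max(0.0, estimate)


# Hypothetical weights for illustration only.
WEIGHTS = [0.4, 0.4, 0.1, 0.05, 0.05]
INTERCEPT = 0.0

estimate = combine_predictions(100.0, 120.0, 90.0, WEIGHTS, INTERCEPT)
print(f"Ensemble estimate: ${estimate:.2f}")
```

Including the min and max lets the regression learn whether the ensemble tends to overshoot or undershoot when the individual models disagree.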
The Planning Agent processes each deal to:
- Discount Calculation:
discount = estimated_price - deal_price
- Opportunity Ranking: Sorts deals by discount amount (highest first)
- Threshold Filtering: Only considers deals with >$50 potential savings
- Best Deal Selection: Identifies the top opportunity from the batch
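The four planning steps above fit in a few lines. This is a minimal sketch assuming each candidate carries a deal price and an estimated price; the `Opportunity` class, `best_opportunity` function, and `DEAL_THRESHOLD` constant are illustrative names, not confirmed project identifiers.

```python
from dataclasses import dataclass
from typing import List, Optional

DEAL_THRESHOLD = 50.0  # dollars; only discounts strictly above this qualify


@dataclass
class Opportunity:
    description: str
    deal_price: float
    estimated_price: float

    @property
    def discount(self) -> float:
        # discount = estimated_price - deal_price
        return self.estimated_price - self.deal_price


def best_opportunity(candidates: List[Opportunity]) -> Optional[Opportunity]:
    """Rank by discount (highest first) and return the top one if it clears the threshold."""
    ranked = sorted(candidates, key=lambda o: o.discount, reverse=True)
    if ranked and ranked[0].discount > DEAL_THRESHOLD:
        return ranked[0]
    return None


# Illustrative data only.
candidates = [
    Opportunity("Example router", deal_price=70.0, estimated_price=100.0),
    Opportunity("Example monitor", deal_price=180.0, estimated_price=300.0),
    Opportunity("Example drill", deal_price=50.0, estimated_price=100.0),
]
best = best_opportunity(candidates)
print(best.description if best else "No qualifying deal")
```

Note that a discount of exactly $50 does not qualify here, matching the ">$50" wording; a project that wants "$50 or more" would use `>=` instead.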
When a qualifying opportunity is found, the Messaging Agent:
- Multi-Channel Alerts: Sends notifications via:
  - SMS: Direct text messages through Twilio
  - WhatsApp: Rich messaging with deal details
  - Pushover: Push notifications (optional)
- Formatted Content: Includes:
  - Product description summary
  - Current deal price vs estimated value
  - Discount amount and percentage
  - Direct link to the deal
- Immediate Delivery: Real-time notifications for time-sensitive deals
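The formatted content listed above can be assembled with a small helper before it is handed to whichever channel is enabled. This sketch covers only the message text, not the delivery call; the function name `format_alert`, the 100-character summary limit, and the layout are assumptions for illustration.

```python
def format_alert(description, deal_price, estimated_price, url, max_description=100):
    """Build the alert text: summary, price vs estimate, discount, and link."""
    discount = estimated_price - deal_price
    percent = (discount / estimated_price * 100) if estimated_price > 0 else 0.0
    if len(description) > max_description:
        summary = description[: max_description - 3] + "..."
    else:
        summary = description
    return (
        f"Deal alert: {summary}\n"
        f"Price ${deal_price:.2f} vs estimated ${estimated_price:.2f}\n"
        f"Discount ${discount:.2f} ({percent:.0f}%)\n"
        f"{url}"
    )


# Illustrative data only.
message = format_alert(
    "Example 55-inch 4K TV",
    deal_price=299.99,
    estimated_price=450.00,
    url="https://example.com/deal/1",
)
print(message)
```

Keeping formatting separate from sending makes it straightforward to reuse the same text for SMS, WhatsApp, and push notifications.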
The system maintains state through:
- Deal History: Updates memory.json with processed deal URLs
- Duplicate Prevention: Ensures users never receive alerts for the same deal twice
- Vector Database: Persists ChromaDB embeddings for consistent similarity search
- Model Caching: Maintains warm Modal services to prevent cold starts
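The deal history part of this state can be sketched with the standard `json` module. The example assumes the history file holds a flat JSON list of URL strings, which may differ from the project's actual memory.json layout; the function signatures here are illustrative and take an explicit `path` so the example can run against a temporary file.

```python
import json
import os
import tempfile


def load_memory(path):
    """Read the stored deal history; a missing file means an empty history."""
    if not os.path.exists(path):
        return []
    with open(path, "r", encoding="utf-8") as handle:
        return json.load(handle)


def save_to_memory(path, memory, new_urls):
    """Append URLs not yet in the history and rewrite the file."""
    updated = list(memory)
    for url in new_urls:
        if url not in updated:
            updated.append(url)
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(updated, handle, indent=2)
    return updated


# Round trip against a temporary file so nothing real is touched.
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "memory.json")
    history = load_memory(path)
    history = save_to_memory(path, history, ["https://example.com/deal/1"])
    history = save_to_memory(
        path, history, ["https://example.com/deal/1", "https://example.com/deal/2"]
    )
    reloaded = load_memory(path)

print(reloaded)
```

Because the second save repeats an already stored URL, the reloaded history still contains each deal only once.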
# Runs indefinitely, checking for new deals every cycle.
# planner, load_memory, save_to_memory and scan_interval are assumed to be defined elsewhere.
import time

while True:
    opportunities = planner.plan(memory=load_memory())
    if opportunities:
        save_to_memory(opportunities)
    time.sleep(scan_interval)

# One-time check for immediate opportunities
planner = PlanningAgent(collection)
opportunity = planner.plan(memory=load_memory())
- Processing Speed: ~2-3 minutes per complete workflow cycle
- API Efficiency: Batched processing minimizes API calls
- Memory Usage: Vector database cached locally for fast similarity search
- Scalability: Modal auto-scaling handles traffic spikes
- Graceful Degradation: System continues if individual models fail
- API Fallbacks: Switches between OpenAI and DeepSeek automatically
- Network Resilience: Retries failed HTTP requests with exponential backoff
- Data Validation: Strict type checking with Pydantic models
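Two of the resilience patterns above, retrying with exponential backoff and falling back between providers, can be sketched with the standard library alone. The helper names `retry_with_backoff` and `first_available`, the attempt count, and the delays are assumptions for illustration; the injected `sleep` parameter exists only so the example runs without actually waiting.

```python
import time


def retry_with_backoff(operation, attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call `operation` until it succeeds, doubling the wait after each failure."""
    for attempt in range(attempts):
        try:
            return operation()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * (2 ** attempt))


def first_available(providers):
    """Try each provider in order and return the first successful result."""
    last_error = None
    for provider in providers:
        try:
            return provider()
        except Exception as error:
            last_error = error
    raise RuntimeError("all providers failed") from last_error


# A fake operation that fails twice before succeeding.
calls = {"count": 0}


def flaky():
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("temporary failure")
    return "ok"


delays = []
result = retry_with_backoff(flaky, base_delay=0.5, sleep=delays.append)
print(result, delays)
```

In a real deployment the caught exception types would usually be narrowed to network or rate-limit errors so that programming mistakes are not silently retried.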
This workflow ensures that only high-quality, genuinely discounted deals reach users, while maintaining system reliability and performance efficiency.
The framework consists of several specialized agents working in coordination:
Planning Agent
- Role: Master orchestrator that coordinates all other agents
- Color: Green 🟢
- Functions:
  - Manages the complete workflow from deal discovery to notification
  - Coordinates between Scanner, Ensemble, and Messaging agents
  - Filters deals based on discount threshold ($50 minimum)
  - Prioritizes opportunities by discount amount
Scanner Agent
- Role: Deal discovery and content curation
- Color: Cyan 🔵
- Functions:
  - Scrapes RSS feeds from DealNews across multiple categories
  - Uses OpenAI GPT-4o-mini with structured outputs to select best deals
  - Filters deals based on description quality and price clarity
  - Avoids duplicate deals using memory system
Ensemble Agent
- Role: Advanced price estimation using multiple models
- Color: Yellow 🟡
- Functions:
  - Coordinates three different pricing models
  - Uses linear regression to combine predictions optimally
  - Provides robust price estimates through model averaging
Specialist Agent
- Role: Fine-tuned LLM pricing specialist
- Color: Red 🔴
- Functions:
  - Connects to Modal-hosted fine-tuned Llama 3.1 8B model
  - Provides domain-specific pricing expertise
  - Uses quantized model for efficient inference
Frontier Agent
- Role: RAG-based pricing with similar product context
- Color: Blue 🔵
- Functions:
  - Performs vector similarity search in ChromaDB
  - Uses OpenAI/DeepSeek with context from 5 similar products
  - Employs sentence transformers for semantic similarity
Random Forest Agent
- Role: Traditional ML approach to pricing
- Color: Magenta 🟣
- Functions:
  - Uses scikit-learn Random Forest model
  - Vectorizes product descriptions using sentence transformers
  - Provides baseline ML predictions
Messaging Agent
- Role: Multi-channel notification system
- Color: White ⚪
- Functions:
  - Sends SMS/WhatsApp alerts via Twilio
  - Optional Pushover push notifications
  - Formatted deal alerts with key metrics
- Python 3.8+: Primary programming language
- Modal: Serverless GPU hosting for fine-tuned models
- ChromaDB: Vector database for similarity search
- OpenAI/DeepSeek: LLM APIs for deal analysis
- Twilio: Communication platform for alerts
- scikit-learn: Machine learning models
- Transformers: Hugging Face model ecosystem
- BeautifulSoup: Web scraping and HTML parsing
twilio # SMS/WhatsApp notifications
python-dotenv # Environment variable management
chromadb # Vector database
scikit-learn # Machine learning models
numpy # Numerical computations
bs4 # Web scraping
feedparser # RSS feed parsing
openai # OpenAI API client
modal # Serverless model hosting
sentence-transformers # Text embeddings
datasets # Data handling
matplotlib # Visualization (testing)
deals_agents/
├── agents/                     # Agent modules
│   ├── agent.py                # Base agent class with logging
│   ├── planning_agent.py       # Main orchestrator
│   ├── scanner_agent.py        # Deal discovery
│   ├── ensemble_agent.py       # Model coordination
│   ├── specialist_agent.py     # Fine-tuned LLM
│   ├── frontier_agent.py       # RAG-based pricing
│   ├── random_forest_agent.py  # ML pricing
│   ├── messaging_agent.py      # Notifications
│   └── deals.py                # Data structures
├── products_vectorstore/       # ChromaDB storage
├── venv/                       # Virtual environment
├── config.py                   # Configuration management
├── pricer_service.py           # Modal service definition
├── deal_agent_framework.py     # Main application entry
├── items.py                    # Product data processing
├── testing.py                  # Model evaluation framework
├── memory.json                 # Deal history storage
├── requirements.txt            # Python dependencies
├── ensemble_model.pkl          # Trained ensemble weights
├── random_forest_model.pkl     # Trained RF model
├── train.pkl                   # Training dataset
├── test.pkl                    # Testing dataset
└── README.md                   # This file
# Clone the repository
git clone <repository-url>
cd deals_agents
# Create virtual environment
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
# Install dependencies
pip install -r requirements.txt
Create a .env file with the following configuration:
# Required API Keys
OPENAI_API_KEY=your_openai_api_key_here
HUGGINGFACE_TOKEN=your_huggingface_token_here
# Optional API Keys
DEEPSEEK_API_KEY=your_deepseek_api_key_here # Alternative to OpenAI
# Twilio Configuration (for notifications)
TWILIO_ACCOUNT_SID=your_twilio_account_sid
TWILIO_AUTH_TOKEN=your_twilio_auth_token
TWILIO_FROM=your_twilio_phone_number
MY_PHONE_NUMBER=your_destination_phone_number
TWILIO_CONTENT_SID=your_whatsapp_template_sid # Optional
# Pushover Configuration (alternative notifications)
PUSHOVER_USER=your_pushover_user_key
PUSHOVER_TOKEN=your_pushover_app_token
# Install Modal CLI
pip install modal
# Set up Modal authentication
modal token new
# Deploy the pricing service
modal deploy pricer_service.py
The ChromaDB vector store will be automatically created when you first run the system. Ensure you have the training data (train.pkl) in the root directory.
# Run the complete agent framework
python deal_agent_framework.py
# Test the Modal service connection
python test_modal.py
# Prevent Modal cold starts (run in background)
python keep_warm.py
from agents.planning_agent import PlanningAgent
import chromadb
# Initialize ChromaDB
client = chromadb.PersistentClient(path="products_vectorstore")
collection = client.get_or_create_collection('products')
# Create and run planning agent
planner = PlanningAgent(collection)
opportunities = planner.plan(memory=[])
print(f"Found {len(opportunities)} opportunities")
- Deal Discovery: Scanner Agent scrapes RSS feeds from DealNews
- Content Curation: OpenAI filters and summarizes promising deals
- Price Estimation: Ensemble Agent coordinates three pricing models:
  - Specialist: Fine-tuned Llama model on Modal
  - Frontier: RAG with similar products from ChromaDB
  - Random Forest: Traditional ML on product vectors
- Model Fusion: Linear regression combines individual predictions
- Opportunity Detection: Deals with >$50 discount are flagged
- Alert Generation: Messaging Agent sends notifications via SMS/WhatsApp
The framework includes comprehensive testing utilities:
from testing import Tester
import joblib
# Load test data
test_data = joblib.load('test.pkl')
# Test any pricing function
def my_pricing_function(item):
    return item.price * 1.1 # Example function
# Run evaluation
Tester.test(my_pricing_function, test_data)
- Average Error: Mean absolute deviation from true price
- RMSLE: Root Mean Squared Logarithmic Error
- Hit Rate: Percentage of predictions within 20% of true price
- Color-coded Results: Green (good), Orange (okay), Red (poor)
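The first three metrics above can be computed directly from paired predictions and true prices. This is a minimal sketch, not the project's Tester implementation: the function name `evaluate` and the sample numbers are illustrative, and it assumes all prices are non-negative so the logarithm is defined.

```python
import math


def evaluate(predictions, truths, tolerance=0.20):
    """Return average absolute error, RMSLE, and hit rate (percent within tolerance)."""
    n = len(truths)
    abs_errors = [abs(p - t) for p, t in zip(predictions, truths)]
    average_error = sum(abs_errors) / n

    # RMSLE compares log(1 + price), so it penalises relative rather than absolute error.
    squared_log_errors = [
        (math.log1p(p) - math.log1p(t)) ** 2 for p, t in zip(predictions, truths)
    ]
    rmsle = math.sqrt(sum(squared_log_errors) / n)

    hits = sum(1 for e, t in zip(abs_errors, truths) if e <= tolerance * t)
    return {
        "average_error": average_error,
        "rmsle": rmsle,
        "hit_rate": hits / n * 100,
    }


# Illustrative numbers only.
metrics = evaluate(predictions=[110.0, 50.0, 200.0], truths=[100.0, 100.0, 210.0])
print({name: round(value, 3) for name, value in metrics.items()})
```

In this sample two of the three predictions fall within 20% of the true price, so the hit rate is roughly 67%, while the single large miss dominates both error figures.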
- Categories: Electronics, Computers, Automotive, Smart Home, Home & Garden
- Quality Threshold: Minimum description detail and price clarity
- Discount Threshold: $50 minimum for notifications
- Memory System: Avoids duplicate alerts
- Specialist Model: Llama 3.1 8B fine-tuned for pricing
- Vector Model: all-MiniLM-L6-v2 for embeddings
- Context Window: 5 similar products for RAG
- Token Limits: 150-160 tokens for product descriptions
# In messaging_agent.py
DO_TEXT = True # Enable SMS/WhatsApp
DO_PUSH = False # Enable Pushover notifications
USE_WHATSAPP = True # Use WhatsApp instead of SMS
- Quantization: 4-bit quantization for Specialist model
- Caching: ChromaDB persistence for vector storage
- Batching: Bulk processing of deal selections
- Warm-up: Keep Modal service active to prevent cold starts
- Color-coded Logging: Each agent has distinct colors
- Structured Logging: Timestamps and agent identification
- Error Handling: Graceful fallbacks for API failures
- Memory Persistence: JSON-based deal history
- Environment variable storage with .env files
- Graceful degradation when optional keys are missing
- Clear error messages for required configurations
- Local vector database storage
- No persistent storage of personal data
- RSS feed data only (publicly available deals)
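The key-handling behaviour listed above (clear errors for required settings, graceful degradation for optional ones) can be sketched as follows. The variable names come from the .env example in this README; the split into `REQUIRED_KEYS` and `OPTIONAL_KEYS`, treating the Pushover keys as optional, and the function `load_settings` are assumptions for illustration.

```python
import os

REQUIRED_KEYS = ["OPENAI_API_KEY", "HUGGINGFACE_TOKEN"]
OPTIONAL_KEYS = ["DEEPSEEK_API_KEY", "PUSHOVER_USER", "PUSHOVER_TOKEN"]


def load_settings(environ=os.environ):
    """Validate required keys and collect optional ones (None when absent)."""
    missing = [key for key in REQUIRED_KEYS if not environ.get(key)]
    if missing:
        raise RuntimeError(
            "Missing required environment variables: " + ", ".join(missing)
        )
    settings = {key: environ[key] for key in REQUIRED_KEYS}
    for key in OPTIONAL_KEYS:
        # A missing optional key simply disables the related feature.
        settings[key] = environ.get(key)
    return settings


# Example with an explicit mapping instead of the real environment.
settings = load_settings({"OPENAI_API_KEY": "sk-example", "HUGGINGFACE_TOKEN": "hf-example"})
print(settings["DEEPSEEK_API_KEY"] is None)
```

Failing fast at startup with the names of the missing variables is usually easier to debug than an authentication error surfacing deep inside an agent.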
# Test Modal connectivity
python test_modal.py
# Re-authenticate if needed
modal token new
# Reinstall requirements
pip install -r requirements.txt --force-reinstall
# Clear and reinitialize vector database
rm -rf products_vectorstore/
# Run framework again to rebuild
- OpenAI: Monitor usage in OpenAI dashboard
- Twilio: Check account balance and rate limits
- DeepSeek: Switch to OpenAI if DeepSeek fails
- Web Dashboard: Real-time monitoring and deal history
- Additional Sources: Amazon, eBay, other deal sites
- Smart Filtering: User preference learning
- Price History: Tracking deal evolution over time
- Fine-tuning: Domain-specific model training
- Ensemble Weights: Dynamic weight adjustment
- Category Specialists: Product-category-specific models
- Real-time Learning: Continuous model updates
For users interested in exploring enhanced web search capabilities, check out our Tavily-powered version of the Deal Agent:
Tavily Branch
This alternative implementation integrates Tavily's real-time web search API to enhance deal discovery and price validation with:
- Real-time Market Research: Live web searches for current product pricing
- Enhanced Price Validation: Cross-reference deals against multiple online sources
- Broader Deal Discovery: Search beyond RSS feeds to find hidden opportunities
- Dynamic Market Insights: Real-time competitor pricing and availability data
The Tavily version provides a more comprehensive approach to deal hunting by leveraging live web data alongside the existing AI ensemble models.
This project is licensed under the MIT License. See the LICENSE file for details.
Contributions are welcome! Please feel free to submit a Pull Request. For major changes, please open an issue first to discuss what you would like to change.
- Fork the repository
- Create a feature branch
- Make your changes
- Add tests if applicable
- Submit a pull request
For support, please reach out through any of the following channels:
- Name: Ankit Malik
- Phone: +91 8449035579
- Portfolio: https://personal-portfolio-gamma-red.vercel.app/
For technical issues or feature requests:
- Open an issue in the GitHub repository
- Contact the development team through the portfolio website
- Send a direct message to the contact number above for urgent matters
Note: This system requires active API keys and proper configuration to function. Please ensure all environment variables are set correctly before running the framework.