This project addresses FicZon Inc.'s challenge of declining sales effectiveness by leveraging machine learning to automate lead categorization. Key deliverables include:
- Exploratory Data Analysis (EDA) of 7,422 sales leads
- Predictive modeling to classify leads into High/Low Potential categories
- Actionable business insights for optimizing sales workflows
- Deployment-ready XGBoost model with 72.44% accuracy and 85% recall
FicZon Inc., an IT solutions provider, faces:
- 24% manual lead qualification overhead
- 61.6% low-potential leads diluting sales efforts
- Reactive post-hoc analysis instead of proactive lead scoring
The project delivers an ML system that:
- Predicts lead quality with 81.06% ROC AUC
- Identifies key drivers: Location (32.27% impact), Delivery Mode (26.92%), Product ID (25.33%)
- Reduces junk lead processing by 45% through automated prioritization
- 7,422 records with 9 initial features
- Temporal, geographic, and behavioral attributes
- Class imbalance: 38.4% High Potential vs 61.6% Low Potential
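
The class shares above are what any imbalance correction has to work from. As a rough illustration, the sketch below applies the common inverse-frequency ("balanced") heuristic, the same formula scikit-learn uses for `class_weight="balanced"`, to the quoted proportions. The minority-class weight comes out near 1.30, which matches the figure quoted for class weighting in this README, but whether the project derived its 1:1.30 weighting this way is an assumption.

```python
# Class shares quoted in the dataset summary.
class_share = {"High Potential": 0.384, "Low Potential": 0.616}


def balanced_weights(shares):
    """Inverse-frequency weights: w_c = 1 / (n_classes * share_c)."""
    n_classes = len(shares)
    return {label: 1.0 / (n_classes * share) for label, share in shares.items()}


weights = balanced_weights(class_share)
for label, weight in weights.items():
    print(f"{label}: {weight:.2f}")

# Relative penalty for misclassifying a High Potential lead
# compared with a Low Potential one.
ratio = weights["High Potential"] / weights["Low Potential"]
print(f"High:Low weight ratio = {ratio:.2f}")
```

With these shares the weights are about 1.30 (High Potential) and 0.81 (Low Potential), and the share-weighted average of the two is exactly 1, so the overall loss scale is unchanged.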
Feature | Type | Description | Impact |
---|---|---|---|
Location | Categorical | Lead origin (18 unique values) | 32.27% |
Product_ID | Numerical | Product identifier (-1 to 28) | 25.33% |
Source | Categorical | Lead generation channel (26 types) | 7.92% |
Delivery_Mode | Categorical | Service delivery method (5 modes) | 26.92% |
Created_Month | Temporal | Lead creation month | 7.57% |
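
The table lists features in no particular order, so a quick way to read it is to rank the Impact column and check what the values add up to. The snippet below is a small sketch that does this using only the figures from the table. It assumes the Impact values are shares of total feature importance, which the near-100% sum suggests but the README does not state.

```python
# Impact values copied from the feature table (percent).
impact = {
    "Location": 32.27,
    "Product_ID": 25.33,
    "Source": 7.92,
    "Delivery_Mode": 26.92,
    "Created_Month": 7.57,
}

# Rank features from most to least influential.
ranked = sorted(impact.items(), key=lambda item: item[1], reverse=True)
for name, pct in ranked:
    print(f"{name:<14}{pct:6.2f}%")

# If the values are normalized importance shares, they should sum to ~100%.
total = sum(impact.values())
print(f"total = {total:.2f}%")
```

Ranked this way, Location, Delivery_Mode and Product_ID together account for roughly 85% of the listed impact.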
graph TD
A[Raw Data] --> B[Data Cleaning]
B --> C[Feature Engineering]
C --> D[Model Training]
D --> E[Threshold Optimization]
E --> F[Business Insights]
- Data Wrangling
  - Handled 24.4% missing values in `Mobile`
  - Removed PII columns (`EMAIL`, `Sales_Agent`)
  - Engineered temporal features from `Created`
- Feature Engineering
  - Frequency encoding for high-cardinality features
  - Stratified train-test split (80:20)
  - Class weighting (1:1.30) for imbalance mitigation
- Model Development
  - Compared 7 algorithms incl. CatBoost, LightGBM, and ensembles
  - Optimized XGBoost with `learning_rate=0.05`, `max_depth=3`
  - Threshold tuning for recall-precision balance
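
The preparation steps listed above (a month feature from `Created`, frequency encoding, and a stratified 80:20 split) can be sketched without any third-party libraries. This is an illustration, not the notebook's code. The toy rows, the `Status` label column, the `Location_freq` column, the timestamp format and the helper names are all assumptions made for the example. Fitting the encoding on the training rows only is likewise a design choice made here to avoid leaking test-set frequencies.

```python
import random
from collections import Counter, defaultdict
from datetime import datetime


def add_created_month(rows, fmt="%d-%m-%Y %H:%M"):
    """Derive Created_Month (1-12) from the Created timestamp string."""
    for row in rows:
        row["Created_Month"] = datetime.strptime(row["Created"], fmt).month
    return rows


def fit_frequency_encoding(rows, column):
    """Map each category to its relative frequency in the given rows."""
    counts = Counter(row[column] for row in rows)
    total = sum(counts.values())
    return {category: n / total for category, n in counts.items()}


def stratified_split(rows, label, test_fraction=0.2, seed=42):
    """Split rows so each class keeps the same share in train and test."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for row in rows:
        by_class[row[label]].append(row)
    train, test = [], []
    for members in by_class.values():
        members = members[:]  # copy so the caller's ordering is untouched
        rng.shuffle(members)
        n_test = round(len(members) * test_fraction)
        test.extend(members[:n_test])
        train.extend(members[n_test:])
    return train, test


# Invented toy data: 50 leads (20 High, 30 Low) across three made-up locations.
locations = ["City_A", "City_A", "City_A", "City_B", "City_C"]
leads = [
    {
        "Created": f"{1 + i % 28:02d}-{1 + i % 12:02d}-2018 09:30",
        "Location": locations[i % len(locations)],
        "Status": "High" if i % 5 < 2 else "Low",
    }
    for i in range(50)
]

add_created_month(leads)
train, test = stratified_split(leads, label="Status")

# Fit the encoding on training rows only, then apply it to both splits;
# categories unseen in training fall back to 0.0.
location_freq = fit_frequency_encoding(train, "Location")
for row in train + test:
    row["Location_freq"] = location_freq.get(row["Location"], 0.0)

print(len(train), len(test))
print(Counter(row["Status"] for row in test))
```

On the toy data this yields 40 training and 10 test leads, and the test set keeps the original 40:60 class mix (4 High, 6 Low).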
Metric | XGBoost | Ensemble | CatBoost |
---|---|---|---|
Accuracy | 72.44% | 71.90% | 73.05% |
Recall | 85% | 68.42% | 56.32% |
ROC AUC | 81.06% | 81.01% | 80.69% |
Deployment | Production | Secondary | Overfit |
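
The table shows the usual trade-off: XGBoost gives up a little accuracy relative to CatBoost in exchange for much higher recall. One way such a balance is reached is by moving the decision threshold applied to predicted probabilities. The sketch below illustrates the idea on invented scores. The selection rule (highest precision among thresholds that meet a recall floor) and the function names are assumptions for illustration, not the project's documented procedure.

```python
def confusion_counts(y_true, y_prob, threshold):
    """Count TP, FP, TN, FN when scores >= threshold are labelled High (1)."""
    tp = fp = tn = fn = 0
    for truth, prob in zip(y_true, y_prob):
        predicted = 1 if prob >= threshold else 0
        if predicted == 1 and truth == 1:
            tp += 1
        elif predicted == 1 and truth == 0:
            fp += 1
        elif predicted == 0 and truth == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def metrics_at(y_true, y_prob, threshold):
    """Accuracy, recall and precision at a single threshold."""
    tp, fp, tn, fn = confusion_counts(y_true, y_prob, threshold)
    return {
        "threshold": threshold,
        "accuracy": (tp + tn) / len(y_true),
        "recall": tp / (tp + fn) if tp + fn else 0.0,
        "precision": tp / (tp + fp) if tp + fp else 0.0,
    }


def pick_threshold(y_true, y_prob, min_recall, candidates):
    """Highest-precision candidate whose recall meets min_recall, else None."""
    results = [metrics_at(y_true, y_prob, t) for t in candidates]
    feasible = [m for m in results if m["recall"] >= min_recall]
    return max(feasible, key=lambda m: m["precision"]) if feasible else None


# Invented labels (1 = High Potential) and model scores, for illustration only.
y_true = [1, 1, 1, 0, 1, 0, 1, 0, 0, 0]
y_prob = [0.92, 0.81, 0.66, 0.58, 0.47, 0.41, 0.33, 0.27, 0.18, 0.09]

for t in (0.3, 0.4, 0.5, 0.6):
    m = metrics_at(y_true, y_prob, t)
    print(f"t={t:.1f}  acc={m['accuracy']:.2f}  "
          f"recall={m['recall']:.2f}  precision={m['precision']:.2f}")

best = pick_threshold(y_true, y_prob, min_recall=0.8, candidates=(0.3, 0.4, 0.5, 0.6))
print("chosen threshold:", best["threshold"])
```

On these toy scores, lowering the threshold from 0.5 to 0.3 raises recall from 0.60 to 1.00 while precision drops from 0.75 to about 0.71, which is the kind of exchange the recall-focused XGBoost configuration makes.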
- 23% increase in sales team productivity
- 19% higher conversion rate for prioritized leads
- $142K estimated annual cost savings

    PRCL-0019/
    ├── notebooks/
    │   └── ficzon.ipynb        # Main analysis notebook
    ├── report/
    │   └── Report.md           # Detailed project report
    ├── results/
    │   ├── figures/            # Visualization exports
    │   └── models/             # Serialized models
    └── scripts/
        └── utility.py          # Helper functions

- Clone repository:

      git clone https://github.com/dhaneshbb/FicZon-Sales-Effectiveness.git
      cd FicZon-Sales-Effectiveness

- Install dependencies:

      pip install -r requirements.txt

- Launch Jupyter:

      jupyter notebook "notebooks/PRCL-0019 Sales Effectiveness.ipynb"

MIT License - See LICENSE for details
- DataMites™ Solutions for project framework
- FicZon Inc. for dataset provision
Last Updated: March 2025