This project implements a machine learning solution for personalized ad targeting based on customer marketing data. The system analyzes customer demographics, purchase behaviors, and campaign responses to predict which customers are most likely to respond to future marketing campaigns.
Marketing campaigns can be expensive to run, and sending promotions to customers who are unlikely to respond wastes resources. This project aims to:
- Identify which customers are most likely to respond to marketing campaigns
- Understand key factors that influence customer response
- Provide data-driven insights to optimize marketing strategy
PyML_Personalized-Ads/
├── Documentation/                  # Project documentation files
│   ├── Personalized Ad-Analytics for small business.docx
│   ├── Personalized Ad-Analytics for small business.pdf
│   └── Personalized Ad-Analytics for small business.pptx
├── Input_Data/                     # Input datasets
│   ├── EDA/                        # Raw data for exploration
│   │   └── marketing_data.csv      # Original marketing dataset
│   └── Modelling/                  # Preprocessed data for modeling
│       └── clean_Market_Data.csv   # Cleaned dataset
├── Output_Charts/                  # Visualizations generated by the analysis
│   ├── EDA/                        # Exploratory data analysis charts
│   └── Modelling/                  # Model performance visualizations
├── Output_Metrics/                 # Model evaluation metrics
│   └── Modelling/                  # Classification reports and metrics
├── Python_Code/                    # Jupyter notebooks with analysis code
│   ├── EDA/                        # Exploratory data analysis
│   │   └── Market_Analytics_EDA.ipynb
│   └── Modelling/                  # Predictive modeling
│       └── Market_Analytics.ipynb
└── README.md                       # Project overview (this file)
The dataset contains customer information including:
- Demographics (age, education, marital status, income)
- Purchase behavior (spending on different product categories)
- Channel preferences (web, store, catalog)
- Previous campaign responses
- Customer relationship duration
The EDA process includes:
- Data cleaning and preprocessing
- Missing value imputation
- Feature transformation
- Statistical analysis
- Correlation analysis
- Data visualization
Feature engineering performed during EDA:
- Created aggregate features such as total purchases and total spending
- Transformed categorical variables
- Calculated customer tenure
- Analyzed customer segments
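
The imputation and feature-creation steps above can be sketched in pandas roughly as follows. This is a minimal illustration, not the notebook's code: the sample rows, every column name (`income`, `spend_wine`, `enrolled_on`, and so on), the median imputation, and the `as_of` reference date are assumptions, so check them against the actual headers in `marketing_data.csv`.

```python
import pandas as pd

# Tiny made-up sample; all column names and values are hypothetical.
raw = pd.DataFrame({
    "income": [58000.0, None, 71000.0, 26000.0],
    "education": ["Graduation", "PhD", "Graduation", "Master"],
    "spend_wine": [635, 11, 426, 11],
    "spend_meat": [546, 6, 127, 20],
    "web_purchases": [8, 1, 8, 2],
    "store_purchases": [4, 2, 10, 4],
    "enrolled_on": ["2012-09-04", "2014-03-08", "2013-08-21", "2014-02-10"],
})


def engineer_features(df: pd.DataFrame, as_of: str) -> pd.DataFrame:
    """Impute, aggregate, compute tenure, and encode categoricals."""
    out = df.copy()

    # Missing value imputation: fill missing income with the median (assumed strategy).
    out["income"] = out["income"].fillna(out["income"].median())

    # Aggregate features: total spending and total purchases per customer.
    out["total_spending"] = out[["spend_wine", "spend_meat"]].sum(axis=1)
    out["total_purchases"] = out[["web_purchases", "store_purchases"]].sum(axis=1)

    # Customer tenure: days between enrollment and a chosen reference date.
    enrolled = pd.to_datetime(out["enrolled_on"])
    out["tenure_days"] = (pd.Timestamp(as_of) - enrolled).dt.days

    # Categorical transformation: one-hot encode the education level.
    return pd.get_dummies(out.drop(columns=["enrolled_on"]), columns=["education"])


features = engineer_features(raw, as_of="2014-12-31")
print(features[["income", "total_spending", "total_purchases", "tenure_days"]])
```

Passing the reference date explicitly keeps tenure reproducible, since a value derived from the current date would change every time the notebook is rerun.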
The project implements and compares three ML algorithms:
- Decision Tree Classifier
- K-Nearest Neighbors (KNN)
- Random Forest Classifier
Techniques applied during model training and evaluation:
- Class imbalance handling using SMOTE and Random Under Sampling
- Hyperparameter tuning with Grid Search
- Cross-validation using K-Fold
- Learning curve analysis
- Feature importance analysis
Key findings:
- Random Forest performed best among the tested models
- Income, total purchase amount, and customer age are strong predictors
- Customers with children at home show different response patterns
- Marital status influences campaign response rates
- Web purchase behavior correlates with campaign responsiveness
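
One common way to surface predictors like those listed above is to rank the impurity-based importances of a fitted random forest. Whether the notebook does exactly this is an assumption; the sketch below uses synthetic data and placeholder feature names, so its output says nothing about the real dataset.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Placeholder names; the real notebook would use the dataset's column names.
feature_names = [f"feature_{i}" for i in range(6)]

# Synthetic data: 3 informative features and 3 pure-noise features.
X, y = make_classification(
    n_samples=400, n_features=6, n_informative=3, n_redundant=0, random_state=0,
)

forest = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

# Impurity-based importances sum to 1; sort so the strongest predictors come first.
importances = (
    pd.Series(forest.feature_importances_, index=feature_names)
    .sort_values(ascending=False)
)
print(importances)
```

Impurity-based importances can favor high-cardinality features, so `sklearn.inspection.permutation_importance` is a reasonable cross-check on held-out data.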
Requirements:
- Python 3.10+
- Libraries: pandas, numpy, scikit-learn, matplotlib, seaborn, imbalanced-learn
- Clone the repository:
  git clone https://github.com/jay-singhvi/PyML_Personalized-Ads.git
  cd PyML_Personalized-Ads
- Run the notebooks in sequence:
  - First, run Python_Code/EDA/Market_Analytics_EDA.ipynb for data exploration
  - Then, run Python_Code/Modelling/Market_Analytics.ipynb for model building

After the notebooks have run:
- Review the charts in the Output_Charts/ directory for visual insights
- Examine model performance metrics in the Output_Metrics/ directory
- The best-performing model can be used to predict customer responses to future campaigns
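
As a rough illustration of that last point, a fitted classifier can score new customers and rank them by predicted response probability. The sketch below is an assumption-laden example rather than the project's workflow: the data is synthetic, the `customer_id` values and feature names are hypothetical, and the cut-off of 20 customers is arbitrary.

```python
import numpy as np
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic data: the first 400 rows play the role of past campaign history,
# the last 100 rows play the role of customers not yet contacted.
X, y = make_classification(
    n_samples=500, n_features=8, weights=[0.85, 0.15], random_state=1,
)
model = RandomForestClassifier(n_estimators=100, random_state=1).fit(X[:400], y[:400])

new_customers = pd.DataFrame(X[400:], columns=[f"feature_{i}" for i in range(8)])
new_customers.insert(0, "customer_id", np.arange(1000, 1100))  # hypothetical IDs

# Probability of the positive class (column 1) for each new customer.
feature_matrix = new_customers.drop(columns=["customer_id"]).to_numpy()
proba = model.predict_proba(feature_matrix)[:, 1]

# Rank customers from most to least likely to respond and keep the top 20.
ranked = (
    new_customers[["customer_id"]]
    .assign(response_probability=proba)
    .sort_values("response_probability", ascending=False)
)
top_targets = ranked.head(20)
print(top_targets)
```

Ranking by probability rather than using hard class labels lets the campaign size follow the available budget instead of a fixed 0.5 threshold.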
Possible future improvements:
- Implement more advanced models (e.g., XGBoost, Neural Networks)
- Add time-series analysis for temporal patterns
- Develop a recommendation system for personalized product offers
- Deploy model as a web service for real-time predictions
- Incorporate A/B testing framework for campaign optimization