Problem Statement:
E-commerce companies accumulate massive customer transaction data but often fail to utilize it effectively for strategic decision-making. Segmenting customers based on their purchasing behavior can improve marketing ROI, customer retention, and lifetime value.
Objectives:
- Apply RFM (Recency, Frequency, Monetary) analysis to segment customers.
- Use Pareto (80/20) Analysis to identify top contributors to revenue.
- Build a Power BI dashboard for interactive visualization and reporting.
- Enable data-driven decision-making for targeted marketing strategies.
- Source: Kaggle - E-Commerce Dataset
- File Name:
ecomm_data.csv
- Fields:
InvoiceNo
,StockCode
,Description
,Quantity
InvoiceDate
,UnitPrice
,CustomerID
,Country
- Contains transaction-level data from a UK-based online retailer over the course of a year.
- Python (Pandas, NumPy, Matplotlib, Seaborn)
- Power BI – Dashboard development
- Jupyter Notebook – Analysis and documentation
- Excel – Data preparation (where applicable)
- Removed rows with null
CustomerID
and negativeQuantity
. - Converted
InvoiceDate
to datetime and set analysis snapshot date.
- Recency: Days since the customer’s last purchase.
- Frequency: Number of purchases.
- Monetary: Total spend.
- Created R, F, and M scores using quantiles (1–5 scale).
- Combined scores to form RFM segments (e.g., 555 = Champions).
- Identified top 20% of customers contributing ~80% of revenue.
- Visualized revenue contribution by segment and top customers.
- Built a dynamic dashboard to explore:
- Segment-wise revenue contributions
- Customer distribution
- Revenue by day of the week
- RFM Heatmap
- Top/Bottom customers
- Time-series trend of active customers
- 65%+ customers are dormant, requiring reactivation campaigns.
- Top 20% customers contribute to nearly 80% of revenue (Pareto Principle).
- Thursdays recorded the highest revenue generation.
- RFM Heatmap revealed strong clusters of high-spending frequent customers.