This project presents an exploratory data analysis (EDA) of a UK-based online retail dataset sourced from the UCI Machine Learning Repository. The goal is to extract actionable insights that help the retailer make strategic business decisions about inventory, pricing, and marketing campaigns.
- Source: UCI – Online Retail II Dataset
- Size: ~500,000 records
- Features include: Invoice number, product code, product description, quantity, invoice date, unit price, customer ID, and country
The analysis focused on understanding key trends and business patterns by addressing the following objectives:
- Identifying customer purchasing behavior and key customer segments
- Analyzing sales trends across different products and regions
- Understanding the impact of seasonal and promotional events on sales
- Addressing data quality issues (e.g., missing values, duplicates) to ensure clean and reliable insights
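The data quality objective above can be illustrated with a minimal pandas sketch. It is not the project's actual cleaning code: the column names (`Invoice`, `StockCode`, `Quantity`, `InvoiceDate`, `Price`, `Customer ID`, `Country`) are assumed to match the feature list, the helper `clean_retail` is hypothetical, and the sample rows are made up for illustration.

```python
import pandas as pd


def clean_retail(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: remove exact duplicates and rows without a customer ID."""
    out = df.drop_duplicates()                      # drop fully duplicated rows
    out = out.dropna(subset=["Customer ID"]).copy() # drop rows missing the customer ID
    out["InvoiceDate"] = pd.to_datetime(out["InvoiceDate"])  # parse timestamps
    return out.reset_index(drop=True)


# Made-up sample rows; the column names are an assumption, not taken from the project code.
sample = pd.DataFrame(
    {
        "Invoice": ["489434", "489434", "489435", "489436"],
        "StockCode": ["85048", "85048", "22350", "21755"],
        "Quantity": [12, 12, 6, 18],
        "InvoiceDate": [
            "2009-12-01 07:45:00",
            "2009-12-01 07:45:00",
            "2009-12-01 07:46:00",
            "2009-12-01 09:06:00",
        ],
        "Price": [6.95, 6.95, 2.55, 5.45],
        "Customer ID": [13085.0, 13085.0, None, 13078.0],
        "Country": ["United Kingdom", "United Kingdom", "United Kingdom", "France"],
    }
)

cleaned = clean_retail(sample)
print(cleaned)
```

Dropping rows without a customer ID is one possible choice that suits customer-level analysis; for product-level or regional totals it may be preferable to keep those rows.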
Tools and libraries used:
- Python (Jupyter Notebook)
- pandas (data manipulation)
- numpy (numerical operations)
- plotly (interactive visualizations)
Deliverables:
- The final exploratory data analysis report showcasing cleaned data, visualizations, and actionable insights
- Code for data cleaning, transformation, and visual storytelling
Key findings:
- A small number of repeat customers contribute significantly to the overall revenue
- Seasonal and event-based sales trends align with holidays and promotional periods
- Certain product categories and regions show untapped sales opportunities
- Data analysis reveals operational inefficiencies, including stock imbalances and pricing disparities across regions
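The revenue concentration finding can be checked with a short pandas sketch along these lines. The function `top_customer_revenue_share`, the 20% cutoff, the column names, and the sample figures are all assumptions for illustration, not the project's actual method or results.

```python
import pandas as pd


def top_customer_revenue_share(df: pd.DataFrame, top_fraction: float = 0.2) -> float:
    """Hypothetical helper: share of total revenue from the top `top_fraction` of customers."""
    # Revenue per line item, summed per customer and ranked from highest to lowest.
    revenue = (df["Quantity"] * df["Price"]).groupby(df["Customer ID"]).sum()
    revenue = revenue.sort_values(ascending=False)
    # Number of customers in the top group (at least one).
    n_top = max(1, int(round(len(revenue) * top_fraction)))
    return float(revenue.iloc[:n_top].sum() / revenue.sum())


# Made-up transactions for five customers; customer 1 buys twice.
orders = pd.DataFrame(
    {
        "Customer ID": [1, 1, 2, 3, 4, 5],
        "Quantity": [10, 5, 2, 1, 1, 1],
        "Price": [10.0, 20.0, 5.0, 10.0, 10.0, 10.0],
    }
)

share = top_customer_revenue_share(orders, top_fraction=0.2)
print(f"Top 20% of customers account for {share:.1%} of revenue")
```

Applied to the cleaned dataset, the same calculation could be repeated for several cutoffs to see how sharply revenue is concentrated.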
Author: Lohith Basavanahalli Anjinappa