This project performs an in-depth analysis of a retail sales dataset. It includes data preprocessing, exploratory data analysis (EDA), customer segmentation, and visualization of key business insights using Python and Plotly.
- Data loading and preprocessing
- Handling missing values and understanding data types
- Customer segmentation based on age groups
- Analyzing purchase patterns by different demographic segments
- Identifying the most popular product categories
- Determining revenue generation trends over time
- Interactive visualizations using Plotly
The dataset used in this project contains the following columns:
- Transaction ID: Unique identifier for each transaction
- Date: Date of the transaction
- Customer ID: Unique identifier for each customer
- Gender: Gender of the customer (Male/Female)
- Age: Age of the customer
- Product Category: Category of the purchased product
- Quantity: Number of units purchased
- Price per Unit: Price of a single unit
- Total Amount: Total cost of the transaction
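As a rough illustration of the loading and preprocessing step, the sketch below reads a few made-up rows that use the column names listed above, parses the Date column, and counts missing values per column. The sample rows, the in-memory CSV, and the choice to drop rows with a missing Age are assumptions for the example, not necessarily what the notebook does.

```python
import io

import pandas as pd

# Made-up rows using the documented column names (not real data).
# Replace the StringIO object with the path to the actual CSV file.
csv_text = """Transaction ID,Date,Customer ID,Gender,Age,Product Category,Quantity,Price per Unit,Total Amount
1,2023-01-05,CUST001,Male,34,Beauty,3,50,150
2,2023-02-11,CUST002,Female,,Clothing,2,500,1000
3,2023-02-20,CUST003,Female,45,Electronics,1,30,30
"""

# Parse the Date column as datetimes while loading.
df = pd.read_csv(io.StringIO(csv_text), parse_dates=["Date"])

# Inspect data types and count missing values per column.
print(df.dtypes)
missing_per_column = df.isna().sum()
print(missing_per_column)

# One possible way to handle gaps (an assumption): drop rows with no Age,
# then store Age as an integer.
clean = df.dropna(subset=["Age"]).copy()
clean["Age"] = clean["Age"].astype(int)
```

Dropping rows is only one option; filling missing ages with a median or keeping them as a separate "unknown" group may suit the real data better.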
Ensure you have Python installed along with the required libraries.
Install dependencies using:
pip install pandas numpy plotly
- Clone the repository:
git clone https://github.com/mehtadigisha/Descriptive-and-Predictive-Analysis-with-Interactive-Dashboard.git
- Navigate to the project folder:
cd Descriptive-and-Predictive-Analysis-with-Interactive-Dashboard
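- Open the notebook in Jupyter (an .ipynb file cannot be executed with the python command; install Jupyter with pip install notebook if needed):
jupyter notebook "Descriptive and Predictive Analysis with Interactive Dashboard.ipynb"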
- Customer Segmentation: Classifies customers into four age groups - Child, Teenager, Adult, and Senior Citizen.
- Popular Product Categories: Identifies the most purchased product categories by each age group.
- Revenue Analysis: Determines which product category generates the highest revenue.
- Gender-based Spending: Analyzes whether males or females contribute more to total sales.
- Time-based Trends: Tracks total sales trends over time.
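The analyses listed above can be sketched with pandas roughly as follows. The sample rows are made up, and the age boundaries (0-12, 13-19, 20-59, 60 and over) are assumptions chosen for illustration; the notebook may use different cut-offs.

```python
import pandas as pd

# Tiny made-up sample using the dataset's column names (not real data).
df = pd.DataFrame({
    "Date": pd.to_datetime(
        ["2023-01-05", "2023-01-20", "2023-02-11", "2023-02-15", "2023-03-01"]
    ),
    "Gender": ["Male", "Female", "Female", "Male", "Female"],
    "Age": [10, 16, 34, 67, 45],
    "Product Category": ["Clothing", "Beauty", "Electronics", "Clothing", "Electronics"],
    "Total Amount": [50, 120, 900, 200, 300],
})

# Customer segmentation: assumed boundaries, four labelled age groups.
bins = [0, 12, 19, 59, float("inf")]
labels = ["Child", "Teenager", "Adult", "Senior Citizen"]
df["Age Group"] = pd.cut(df["Age"], bins=bins, labels=labels, include_lowest=True)

# Revenue analysis: total revenue per product category, highest first.
revenue_by_category = (
    df.groupby("Product Category")["Total Amount"].sum().sort_values(ascending=False)
)

# Gender-based spending: total sales per gender.
sales_by_gender = df.groupby("Gender")["Total Amount"].sum()

# Time-based trends: total sales per calendar month.
monthly_revenue = df.groupby(df["Date"].dt.to_period("M"))["Total Amount"].sum()

print(revenue_by_category)
print(sales_by_gender)
print(monthly_revenue)
```

Each resulting Series can be turned into an interactive chart, for example by resetting its index and passing the frame to a Plotly Express function such as px.bar or px.line.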