This project analyzes and visualizes a bike sales dataset to surface key insights: revenue distribution across customers, purchasing trends by age group, profit margins by region, and correlations between features.
- What is the distribution of unit cost and profit?
- How does revenue vary across age groups?
- What age group is the most profitable?
- What are the relationships (correlations) between numerical features?
- Which countries have the highest revenue?
- What are the buying behaviors across different age categories?
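
The age-group and country questions above largely reduce to group-wise aggregations. The following is a minimal sketch, not the notebook's actual code: `Revenue` and `Profit` are column names mentioned in this README, while `Age_Group`, `Country`, the age-group labels, and every value in the sample rows are assumptions made for illustration.

```python
import pandas as pd

# Illustrative rows only; a real run would load the dataset instead.
# Column names Age_Group and Country are assumed, not confirmed by the README.
sales = pd.DataFrame({
    "Age_Group": ["Youth (<25)", "Adults (35-64)", "Adults (35-64)", "Young Adults (25-34)"],
    "Country": ["France", "France", "Canada", "Canada"],
    "Revenue": [150, 400, 300, 200],
    "Profit": [20, 150, 90, 60],
})

# Total revenue per age group, largest first.
revenue_by_age = sales.groupby("Age_Group")["Revenue"].sum().sort_values(ascending=False)

# Age group with the highest total profit.
most_profitable_age = sales.groupby("Age_Group")["Profit"].sum().idxmax()

# Total revenue per country, largest first.
revenue_by_country = sales.groupby("Country")["Revenue"].sum().sort_values(ascending=False)

print(revenue_by_age)
print("Most profitable age group:", most_profitable_age)
print(revenue_by_country)
```

Summing is used here because the questions ask about totals; swapping `.sum()` for `.mean()` gives per-order averages instead.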
- Pandas for data manipulation
- Matplotlib and Seaborn for visualizations
- NumPy for numerical operations
- Google Colab for notebook development
- Density plots of `Unit_Cost`
- Box plots for `Profit` across age groups
- Heatmaps for the correlation matrix
- Scatter plots of `Customer_Age` vs. `Revenue`
- Group-wise mean calculations (age + country)
- Adult (35-64) customers generate the highest revenue.
- A simulated 10% increase was applied to France's revenue.
- Highest correlation exists between `Unit_Cost` and `Unit_Price`.
- Outliers exist in `Order_Quantity` and `Profit`.
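
For readers who want to reproduce checks of this kind, here is a hedged sketch of the three underlying operations: a pairwise correlation, outlier flagging, and a revenue simulation. The README does not state how outliers were identified, so the 1.5 × IQR rule (the one box plot whiskers use) is an assumption, as are the `Country` column and all sample values.

```python
import pandas as pd

# Illustrative rows only (assumption); not the project's data.
sales = pd.DataFrame({
    "Country": ["France", "France", "Canada", "Canada", "Germany", "Germany"],
    "Unit_Cost": [10, 20, 30, 40, 50, 60],
    "Unit_Price": [18, 35, 52, 70, 88, 105],
    "Order_Quantity": [2, 3, 2, 4, 3, 40],
    "Revenue": [100.0, 200.0, 300.0, 400.0, 500.0, 600.0],
})

# Pearson correlation between unit cost and unit price.
cost_price_corr = sales["Unit_Cost"].corr(sales["Unit_Price"])

# Flag Order_Quantity outliers with the 1.5 * IQR rule (assumed method).
q1 = sales["Order_Quantity"].quantile(0.25)
q3 = sales["Order_Quantity"].quantile(0.75)
iqr = q3 - q1
is_outlier = (sales["Order_Quantity"] < q1 - 1.5 * iqr) | (sales["Order_Quantity"] > q3 + 1.5 * iqr)
outliers = sales[is_outlier]

# Simulate a 10% revenue increase for France on a copy, leaving the original intact.
simulated = sales.copy()
simulated.loc[simulated["Country"] == "France", "Revenue"] *= 1.10
france_before = sales.loc[sales["Country"] == "France", "Revenue"].sum()
france_after = simulated.loc[simulated["Country"] == "France", "Revenue"].sum()

print(f"Unit_Cost vs Unit_Price correlation: {cost_price_corr:.3f}")
print(outliers)
print(f"France revenue: {france_before:.2f} -> {france_after:.2f}")
```

Working on a copy keeps the simulated figures separate from the observed ones, so later aggregations are not silently affected.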
- `notebooks/`: Jupyter notebooks
- `data/`: Raw or cleaned datasets
- `images/`: Visuals used in README or reports
- `outputs/`: Text summaries or final reports
- Clone the repository
- Install dependencies via `pip install -r requirements.txt`
- Launch the notebook using `jupyter notebook`
- Open `Bike_sales_analysis.ipynb`
- Correlation Matrix Heatmap
- Boxplot for Profit per Age Group
- Revenue Distribution per Country
- Apply machine learning for sales prediction
- Deploy dashboard using Streamlit
- Analyze seasonal trends
[Your Name] - Data Analyst & Python Enthusiast