Skip to content

This project is basically a data analysis on the dajubhai minimart data in which i have applied some of the exploratory data analysis and applied ml models and predict the sales and loyality member

Notifications You must be signed in to change notification settings

supremkc05/DajuBhaiMiniMart_Analysis

Repository files navigation

Dajubhai MiniMart Analysis

This repository contains the Jupyter Notebook for dajubhai minimart analysis. The notebook demonstrates a full data science workflow: from data cleaning to model building using a dataset named DajuBhaiMinimart.csv.


📂 Files

  • ITS69304_SupremKhatri_Individual_Assignment.ipynb — Main notebook with code and analysis.
  • DajuBhaiMinimart.csv — Dataset (assumed provided externally).

📝 Notebook Overview

1️⃣ Importing Libraries

The following libraries are used:

import pandas as pd
import matplotlib.pyplot as plt
import plotly.express as px
import seaborn as sns
import warnings
warnings.filterwarnings("ignore")
import plotly.io as pio
pio.renderers.default = 'notebook'

2️⃣ Data Loading

df = pd.read_csv('DajuBhaiMinimart.csv')
df.head()
  • Reads and displays initial data.

3️⃣ Data Cleaning

  • Checks data info:
    df.info()
  • Checks for missing values:
    df.isnull().sum()
  • Drops unnecessary columns:
    df.drop('CustomerID', axis=1, inplace=True)
  • Renames columns where necessary.

4️⃣ Data Visualization

  • Uses Seaborn and Plotly to visualize relationships and distributions.
  • Example plots:
    sns.countplot(x='ProductCategory', data=df)
    px.histogram(df, x='Sales', color='ProductCategory')

5️⃣ Feature Engineering

  • Creates or modifies features to improve model input.
  • Handles categorical encoding, scaling, etc.

6️⃣ Model Training

  • Trains models such as:
    • Gradient Boosting Regressor
  • Example code:
    from sklearn.ensemble import GradientBoostingRegressor
    model = GradientBoostingRegressor()
    model.fit(X_train, y_train)

7️⃣ Hyperparameter Tuning & Cross-Validation

  • Performs grid search / cross-validation:
    from sklearn.model_selection import GridSearchCV
    params = {'n_estimators': [100, 200], 'learning_rate': [0.1, 0.05]}
    grid = GridSearchCV(GradientBoostingRegressor(), params, cv=5)
    grid.fit(X_train, y_train)

🚀 How to Run

✅ Clone or download this repository.

✅ Install requirements:

pip install pandas matplotlib seaborn plotly scikit-learn

✅ Run the notebook:

jupyter notebook ITS69304_SupremKhatri_Individual_Assignment.ipynb

⚠️ Notes

  • Ensure DajuBhaiMinimart.csv is present in the notebook directory.
  • All visualizations are rendered using Plotly + Seaborn + Matplotlib.
  • The notebook suppresses warnings for cleaner output.

💡 Future Work

  • Extend analysis with additional models (e.g. XGBoost, Random Forest).
  • Improve feature engineering with domain knowledge.
  • Deploy model via Flask / Streamlit for interactive use.

About

This project is basically a data analysis on the dajubhai minimart data in which i have applied some of the exploratory data analysis and applied ml models and predict the sales and loyality member

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published