This is a beginner-friendly machine learning project that predicts housing prices from features of a house (e.g., size, number of bedrooms). The project covers data cleaning, feature encoding, model training, and evaluation.
The dataset contains housing features such as:
- Area
- Location on main road
- Number of bedrooms and bathrooms
- Furnishing status
- Other attributes from the dataset
All categorical data is label-encoded for model training.
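A minimal sketch of the label-encoding step, assuming scikit-learn's `LabelEncoder` is applied to each non-numeric column. The toy rows and the column names (`mainroad`, `furnishingstatus`) are illustrative assumptions, not taken from the notebook, and the helper `label_encode_categoricals` is hypothetical.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Toy rows standing in for the real dataset; column names are assumptions.
df = pd.DataFrame({
    "area": [7420, 8960, 3000, 4500],
    "mainroad": ["yes", "yes", "no", "yes"],
    "furnishingstatus": ["furnished", "semi-furnished", "unfurnished", "furnished"],
})

def label_encode_categoricals(frame):
    """Return a copy with every non-numeric column replaced by integer codes."""
    encoded = frame.copy()
    encoders = {}
    for col in encoded.select_dtypes(exclude="number").columns:
        enc = LabelEncoder()
        encoded[col] = enc.fit_transform(encoded[col])
        encoders[col] = enc  # keep the encoder so codes can be mapped back later
    return encoded, encoders

encoded_df, encoders = label_encode_categoricals(df)
print(encoded_df)
```

`LabelEncoder` assigns codes in sorted order of the labels, so the integers carry an ordering that a linear model will treat as meaningful. For a baseline this is acceptable; one-hot encoding is a common alternative.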
- Python
- Pandas & NumPy (for data manipulation)
- Matplotlib & Seaborn (for visualization)
- Scikit-learn (for ML modeling and evaluation)
- Jupyter Notebook (for development)
- Checked for missing values
- Visualized price distribution
- Generated a correlation heatmap to find the most relevant features
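The exploration steps above can be sketched with pandas alone. The data here is synthetic and the column names are assumptions; with the real dataset, the DataFrame would come from the downloaded CSV instead.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the housing data (assumed column names).
rng = np.random.default_rng(0)
area = rng.uniform(1500, 10000, size=50)
df = pd.DataFrame({
    "area": area,
    "bedrooms": rng.integers(1, 6, size=50),
    "price": area * 500 + rng.normal(0, 200_000, size=50),
})

# Count missing values in each column.
missing_per_column = df.isna().sum()

# Rank features by the absolute value of their correlation with price.
corr_matrix = df.corr()
corr_with_price = corr_matrix["price"].drop("price").abs().sort_values(ascending=False)

print(missing_per_column)
print(corr_with_price)
```

Passing `corr_matrix` to `seaborn.heatmap(corr_matrix, annot=True)` would draw the heatmap, and `df["price"].hist()` is one way to view the price distribution.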
- Linear Regression was used to predict housing prices.
- Evaluated using:
  - Mean Absolute Error (MAE)
  - Mean Squared Error (MSE)
  - R² Score
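A hedged sketch of the training and evaluation flow using scikit-learn. The synthetic data, the 80/20 split, and `random_state=42` are assumptions for illustration; the notebook's actual split and features may differ, so the numbers printed here will not match the project's results.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

# Synthetic features and target standing in for the encoded housing data.
rng = np.random.default_rng(42)
n = 200
X = pd.DataFrame({
    "area": rng.uniform(1500, 10000, size=n),
    "bedrooms": rng.integers(1, 6, size=n),
})
y = 500 * X["area"] + 300_000 * X["bedrooms"] + rng.normal(0, 500_000, size=n)

# Hold out 20% of the rows for evaluation (assumed split).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Fit the baseline model and predict on the held-out rows.
model = LinearRegression().fit(X_train, y_train)
y_pred = model.predict(X_test)

mae = mean_absolute_error(y_test, y_pred)
mse = mean_squared_error(y_test, y_pred)
r2 = r2_score(y_test, y_pred)

print(f"MAE: {mae:.2f}")
print(f"MSE: {mse:.2f}")
print(f"R2:  {r2:.2f}")
```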
- Mean Absolute Error (MAE): 970043.40
- Mean Squared Error (MSE): 1754318687330.66
- R² Score: 0.65
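Because MSE is expressed in squared price units, it can be easier to read alongside MAE after taking its square root (RMSE). The short sketch below only re-uses the two reported numbers; RMSE itself is not a metric reported by the project.

```python
import math

# Values copied from the reported results.
mae = 970043.40
mse = 1754318687330.66

# RMSE is in the same units as price, so it is directly comparable to MAE.
rmse = math.sqrt(mse)
print(f"RMSE: {rmse:,.0f}")
print(f"RMSE / MAE: {rmse / mae:.2f}")
```

The RMSE comes out at roughly 1.32 million, noticeably above the MAE, which suggests that a few large errors weigh on the squared metric. That fits the note about outlier handling as a possible improvement.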
Not optimized: this is a baseline model. Future improvements could include feature engineering, outlier handling, and more advanced models such as Random Forest or XGBoost.
- Clone this repository
- Open the notebook in Jupyter or VS Code
- Run each cell to follow the full workflow
- Based on learnings from Andrew Ng's ML Course
- Dataset from Kaggle: https://www.kaggle.com/datasets/harishkumardatalab/housing-price-prediction
Created by Piyush
Drop a message or connect with me on LinkedIn