katerinaharana/Housing-Price-Prediction

🏠 Housing Price Prediction

This project aims to predict housing prices using machine learning models and analyze the importance of housing features. The dataset contains structured information about real estate properties including area, number of rooms, location factors, and furnishing status.


📁 Project Structure

housing-price-prediction/
│
├── data/
│   ├── Housing.csv                  # Original dataset
│   └── housing_cleaned.csv          # Cleaned dataset with additional features
│
├── notebooks/
│   ├── 01_data_preparation.ipynb    # Data cleaning, encoding, outlier detection
│   └── 02_model_training.ipynb      # Model training, evaluation, feature importance
│
├── requirements.txt                 # Required Python libraries
└── README.md

📊 Dataset Overview

The dataset includes:

  • area, bedrooms, bathrooms, stories
  • mainroad, guestroom, basement, etc.
  • furnishingstatus: encoded in multiple formats (ordinal and one-hot)
  • Engineered features:
    • area_per_bedroom
    • area_per_bathroom
    • is_fully_equipped
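
As a rough illustration, the engineered features could be derived with a few lines of pandas. The two ratio features follow from their names, but this README does not define is_fully_equipped, so the rule used here (every listed amenity flag is true) is an assumption, as are the sample rows.

```python
import pandas as pd

# Tiny made-up sample; the real data lives in data/Housing.csv.
df = pd.DataFrame({
    "area": [6000, 3000, 4500],
    "bedrooms": [3, 2, 3],
    "bathrooms": [2, 1, 1],
    "mainroad": [True, True, False],
    "guestroom": [True, False, False],
    "basement": [True, False, True],
})

# Ratio features: floor area available per bedroom and per bathroom.
df["area_per_bedroom"] = df["area"] / df["bedrooms"]
df["area_per_bathroom"] = df["area"] / df["bathrooms"]

# ASSUMPTION: "fully equipped" means all amenity flags are true.
# The notebook may use a different rule or a different set of columns.
amenity_cols = ["mainroad", "guestroom", "basement"]
df["is_fully_equipped"] = df[amenity_cols].all(axis=1)

print(df[["area_per_bedroom", "area_per_bathroom", "is_fully_equipped"]])
```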

Workflow Summary

1. Data Cleaning & Encoding

  • Converted object columns to numeric/boolean
  • Used one-hot and ordinal encoding strategies
  • Detected and removed outliers using Isolation Forest
  • Created engineered features
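
The cleaning steps above might look roughly like the following sketch. The value labels ("yes"/"no" and the three furnishing levels), the ordinal order, the new column names, and the contamination rate are all assumptions chosen for this toy example, not settings taken from the notebook.

```python
import pandas as pd
from sklearn.ensemble import IsolationForest

# Made-up rows; the last one is deliberately extreme.
df = pd.DataFrame({
    "area": [3000, 3200, 3100, 2900, 3050, 3150, 2950, 30000],
    "bedrooms": [3, 3, 2, 3, 2, 3, 3, 1],
    "mainroad": ["yes", "no", "yes", "yes", "no", "yes", "yes", "no"],
    "furnishingstatus": ["furnished", "semi-furnished", "unfurnished",
                         "furnished", "unfurnished", "semi-furnished",
                         "furnished", "unfurnished"],
})

# Object column -> boolean.
df["mainroad"] = df["mainroad"].map({"yes": True, "no": False})

# Ordinal encoding (assumed order), intended for the tree-based models.
order = {"unfurnished": 0, "semi-furnished": 1, "furnished": 2}
df["furnishing_ordinal"] = df["furnishingstatus"].map(order)

# One-hot encoding, intended for the linear models.
one_hot = pd.get_dummies(df["furnishingstatus"], prefix="furnishing")
df = pd.concat([df, one_hot], axis=1)

# Isolation Forest: fit_predict returns -1 for outliers and 1 for inliers.
# contamination=0.125 (one row in eight) is an arbitrary choice for this toy data.
iso = IsolationForest(contamination=0.125, random_state=42)
labels = iso.fit_predict(df[["area", "bedrooms"]])
df_clean = df[labels == 1].reset_index(drop=True)

print(f"kept {len(df_clean)} of {len(df)} rows")
```

Keeping both encodings in the same frame makes it easy to hand the one-hot columns to linear models and the single ordinal column to tree models later on.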

2. Model Training

Trained and evaluated the following models:

| Model                       | Encoding Used |
|-----------------------------|---------------|
| Linear Regression           | One-Hot       |
| Ridge Regression            | One-Hot       |
| Random Forest Regressor     | Ordinal       |
| Gradient Boosting Regressor | Ordinal       |
| XGBoost Regressor           | Ordinal       |
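
A comparison loop that pairs each model with its encoding could be sketched as below. The data is synthetic, the hyperparameters are library defaults or arbitrary picks, and only the scikit-learn models are included; xgboost's XGBRegressor exposes the same fit/predict interface and could be added to the dictionary if that package is installed.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.model_selection import train_test_split

# Synthetic stand-in for housing_cleaned.csv (all values are made up).
rng = np.random.default_rng(0)
n = 300
area = rng.uniform(1000, 10000, n)
furnishing = rng.integers(0, 3, n)  # 0 = unfurnished ... 2 = furnished (assumed order)
price = 500 * area + 200_000 * furnishing + rng.normal(0, 100_000, n)

# Two views of the same data: ordinal code vs. one-hot columns.
X_ordinal = np.column_stack([area, furnishing])
X_onehot = np.column_stack([area, np.eye(3)[furnishing]])

# Splitting all arrays in one call keeps the rows aligned across both views.
Xo_train, Xo_test, Xh_train, Xh_test, y_train, y_test = train_test_split(
    X_ordinal, X_onehot, price, test_size=0.2, random_state=42
)
data = {"ordinal": (Xo_train, Xo_test), "onehot": (Xh_train, Xh_test)}

models = {
    "Linear Regression": (LinearRegression(), "onehot"),
    "Ridge Regression": (Ridge(alpha=1.0), "onehot"),
    "Random Forest Regressor": (RandomForestRegressor(random_state=42), "ordinal"),
    "Gradient Boosting Regressor": (GradientBoostingRegressor(random_state=42), "ordinal"),
}

r2_scores = {}
for name, (model, encoding) in models.items():
    X_train, X_test = data[encoding]
    model.fit(X_train, y_train)
    r2_scores[name] = model.score(X_test, y_test)  # R² on the held-out split
    print(f"{name:<28} {encoding:<8} R2 = {r2_scores[name]:.3f}")
```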

All models were compared using:

  • RMSE
  • MAE
  • R² Score
  • Actual vs Predicted scatter plots
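
For reference, the three numeric metrics can be written out in plain Python. This is a minimal sketch of the standard definitions; the function name and the example values are made up for illustration, and in practice sklearn.metrics provides equivalent functions.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return RMSE, MAE and R² for two equal-length sequences of numbers."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]

    mae = sum(abs(e) for e in errors) / n            # mean absolute error
    ss_res = sum(e * e for e in errors)              # residual sum of squares
    rmse = math.sqrt(ss_res / n)                     # root mean squared error

    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    r2 = 1.0 - ss_res / ss_tot                       # share of variance explained

    return {"rmse": rmse, "mae": mae, "r2": r2}


# Made-up values, not results from this project.
metrics = regression_metrics([3.0, 5.0, 7.0, 9.0], [2.0, 5.0, 8.0, 9.0])
print(metrics)
```

RMSE penalizes large errors more heavily than MAE, so reporting both gives a sense of whether a model's error is dominated by a few badly predicted properties.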

3. Feature Importance

  • Coefficients (linear models)
  • Feature importances (tree models)
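
Both kinds of importance can be read directly off fitted scikit-learn estimators. The sketch below uses synthetic, standardized data and placeholder feature names, so the numbers say nothing about the real dataset.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression

# Placeholder names and synthetic data: "noise" has no effect on the target.
feature_names = ["area", "bedrooms", "noise"]
rng = np.random.default_rng(0)
X = rng.normal(size=(400, 3))
y = 3.0 * X[:, 0] + 1.0 * X[:, 1] + rng.normal(0, 0.1, size=400)

linear = LinearRegression().fit(X, y)
forest = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)

# Linear models: signed coefficients (comparable only if features share a scale).
coef = dict(zip(feature_names, linear.coef_))

# Tree models: impurity-based importances, non-negative and summing to 1.
importance = dict(zip(feature_names, forest.feature_importances_))

for name in feature_names:
    print(f"{name:<10} coef = {coef[name]:+.3f}   importance = {importance[name]:.3f}")
```

Coefficients carry a sign and depend on feature scale, while tree importances only rank features, so the two views are best read side by side instead of being compared number for number.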

Next Steps

  • Fine-tune hyperparameters
  • Use SHAP values for explainability
  • Explore ensemble and stacking models
  • Deploy a trained model as an API

About

Housing price prediction: data preprocessing, encoding, feature engineering & model comparison
