Code-Crafters-BM/Machine_learning_101

🏡 California Housing Price Prediction

📌 Overview

This project predicts California housing prices with two models, Linear Regression and a Random Forest Regressor. It uses the California Housing dataset from sklearn.datasets. The goal is to predict the median house value of a district (expressed in units of $100,000 in this dataset) from features such as income level, house age, and geographical location.

📂 Contents

1️⃣ Data Preparation (Cells 2 & 3)

  • Dataset Loading: The California Housing dataset is loaded using fetch_california_housing(as_frame=True).
  • Preprocessing:
      ✅ Standardization of the features using StandardScaler.
      ✅ Splitting the data into training (80%) and testing (20%) sets.
      ✅ Checking the dataset structure and summary statistics.
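
The preparation steps above can be sketched roughly as follows. This is a minimal illustration, not the notebook's exact code: the helper name `prepare_data` and the `random_state=42` seed for the split are assumptions, and fitting the scaler on the training split only is a common practice that the notebook may or may not follow.

```python
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler


def prepare_data(X, y, test_size=0.2, random_state=42):
    """Split into train/test sets (80/20 by default), then standardize features.

    Hypothetical helper for illustration; the seed is an assumption.
    """
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=test_size, random_state=random_state
    )
    # Fit the scaler on the training split only, so that test-set
    # statistics do not leak into the preprocessing step.
    scaler = StandardScaler()
    X_train_scaled = scaler.fit_transform(X_train)
    X_test_scaled = scaler.transform(X_test)
    return X_train_scaled, X_test_scaled, y_train, y_test
```

With the real dataset, one would load it with `data = fetch_california_housing(as_frame=True)` and call `prepare_data(data.data, data.target)`; the first call downloads the data, so it needs network access.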

2️⃣ Model Training & Evaluation

📊 Linear Regression Model (Cell 4)

  • Model Training: ✅ Uses LinearRegression() to fit the training data.
  • Predictions: ✅ Predictions are made on the test set.
  • Evaluation Metrics: ✅ Mean Absolute Error (MAE): Measures the average absolute difference between predicted and actual prices. ✅ Mean Squared Error (MSE): Averages the squared differences, so large errors are penalized more heavily.
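
A minimal sketch of the Linear Regression step, assuming the data has already been split and scaled. The function name `evaluate_linear_regression` is hypothetical; `LinearRegression`, `mean_absolute_error`, and `mean_squared_error` are the standard scikit-learn calls.

```python
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, mean_squared_error


def evaluate_linear_regression(X_train, X_test, y_train, y_test):
    """Fit a linear model on the training set and score it on the test set."""
    model = LinearRegression()
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)
    return {
        "mae": mean_absolute_error(y_test, y_pred),
        "mse": mean_squared_error(y_test, y_pred),
    }
```

The returned dictionary holds the same two metrics that the notebook reports for this model.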

🌲 Random Forest Model (Cell 5)

  • Model Training: ✅ Uses RandomForestRegressor(n_estimators=100, random_state=42).
  • Predictions: ✅ Predictions are made using the trained Random Forest model.
  • Evaluation Metrics: ✅ MSE, MAE, and R² score are calculated to compare model performance.
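
The Random Forest step could look roughly like this. The hyperparameters are the ones stated above; the function name `evaluate_random_forest` and the dictionary layout are assumptions made for illustration.

```python
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score


def evaluate_random_forest(X_train, X_test, y_train, y_test):
    """Fit a 100-tree random forest and report MSE, MAE, and R² on the test set."""
    # A fixed random_state makes the fitted forest reproducible across runs.
    model = RandomForestRegressor(n_estimators=100, random_state=42)
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)
    return {
        "mse": mean_squared_error(y_test, y_pred),
        "mae": mean_absolute_error(y_test, y_pred),
        "r2": r2_score(y_test, y_pred),
    }
```

Tree-based models do not require standardized inputs, so passing scaled or unscaled features should give very similar results here.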

🔬 Results & Observations

| Model | Mean Squared Error (MSE) | Mean Absolute Error (MAE) | R² Score |
| --- | --- | --- | --- |
| Linear Regression | 0.55 | 0.53 | - |
| Random Forest | 0.26 | 0.33 | 0.81 |

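Conclusion: The Random Forest model outperforms Linear Regression, with a lower MSE (0.26 vs. 0.55) and a lower MAE (0.33 vs. 0.53). Its R² score of 0.81 means it explains about 81% of the variance in house prices. No R² score is reported for Linear Regression, so that metric cannot be compared directly.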
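
To make the three metrics concrete, here is a small hand-rolled sketch in plain Python. The helper names `mae`, `mse`, and `r2` and the four sample values are illustrative assumptions, not data from the notebook; in practice the scikit-learn metric functions would be used.

```python
def mae(y_true, y_pred):
    """Mean Absolute Error: average of |actual - predicted|."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mse(y_true, y_pred):
    """Mean Squared Error: average of (actual - predicted)^2."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r2(y_true, y_pred):
    """R²: 1 minus (residual sum of squares / total sum of squares)."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


# Made-up values: three small errors and one large error of 2.0.
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.5, 2.0, 2.5, 6.0]

print(f"MAE = {mae(y_true, y_pred):.3f}")  # MAE = 0.750
print(f"MSE = {mse(y_true, y_pred):.3f}")  # MSE = 1.125
print(f"R2  = {r2(y_true, y_pred):.3f}")   # R2  = 0.100
```

The single error of 2.0 accounts for 4.0 of the 4.5 total squared error, which is why MSE reacts more strongly to outliers than MAE does.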

πŸ— Installation

To run this project, install the required dependencies:

pip install pandas numpy scikit-learn matplotlib notebook

🚀 Usage

1️⃣ Clone the repository:

git clone https://github.com/your-repo/California_Housing_Price_Prediction.git
cd California_Housing_Price_Prediction

2️⃣ Launch Jupyter Notebook and open the project notebook:

jupyter notebook

3️⃣ Execute the cells step by step to see the data processing, model training, and evaluation.

🤝 Contributors

  • Code Crafters Bm – Project development and implementation.

💡 Acknowledgments

  • Inspired by sklearn.datasets and regression modeling techniques.

About

Our notebook from our latest course about machine learning
