Author: Hon Wa Ng
Date: October 2024
This repository implements regression-based techniques to analyze and predict bike-sharing demand. The project explores Ordinary Least Squares (OLS) regression, Locally Weighted Regression (LWR), and Poisson regression to model rental counts based on weather, time, and seasonal features.
The dataset used in this analysis is included in the repository under the data/ directory.
- Perform exploratory data analysis (EDA) on bike rental data.
- Apply OLS regression to model rental demand.
- Implement Locally Weighted Regression (LWR) for non-linear modeling.
- Develop a Poisson regression model to predict count-based outcomes.
- Compare model performance using R² and D² scores.
Bike-Sharing-Regression/
│── data/ # Dataset storage
│ ├── bike+sharing+dataset # Original dataset
│ ├── hour.csv # Hourly bike rental dataset
│
│── docs/ # Documentation files
│ ├── assignment_questions.pdf # Problem statement
│ ├── project_writeup.pdf # Detailed project analysis
│
│── src/ # Source code
│ ├── main.py # Main execution script
│
│── LICENSE # MIT License
│── requirements.txt # Dependencies for running the project
│── README.md # Project documentation
git clone https://github.com/Edwardnhw/Bike-Sharing-Regression.git
cd Bike-Sharing-Regression
Ensure you have Python installed (>=3.7), then run:
pip install -r requirements.txt
Execute the main script to run regression models:
python src/main.py
The script will:
- Load the dataset (hour.csv).
- Perform exploratory data analysis (EDA).
- Train and evaluate OLS, LWR, and Poisson regression models.
- Output R² and D² scores for performance evaluation.
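The two scores in the last step can be sketched with plain NumPy. This is an illustration, not the code in src/main.py: the function names `r_squared`, `poisson_deviance`, and `d_squared` are hypothetical, and the sketch assumes D² is computed from the Poisson deviance (Tweedie deviance with power 1) against a mean-only baseline.

```python
import numpy as np

def r_squared(y_true, y_pred):
    """Fraction of variance explained: 1 - SS_res / SS_tot."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

def poisson_deviance(y_true, y_pred):
    """Mean Poisson deviance. Predictions must be strictly positive.

    The y * log(y / mu) term is defined as 0 wherever y == 0.
    """
    safe_y = np.where(y_true > 0, y_true, 1.0)
    log_term = np.where(y_true > 0, y_true * np.log(safe_y / y_pred), 0.0)
    return 2.0 * np.mean(log_term - y_true + y_pred)

def d_squared(y_true, y_pred):
    """Fraction of Poisson deviance explained relative to a mean-only model."""
    null_pred = np.full(y_true.shape, y_true.mean())
    return 1.0 - poisson_deviance(y_true, y_pred) / poisson_deviance(y_true, null_pred)
```

Both scores equal 1 for perfect predictions and 0 for a model that always predicts the mean rental count. D² is usually the more natural score for the Poisson model because it is based on that model's own loss.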
- Exploratory Data Analysis (EDA)
  - Summary statistics and missing value analysis.
  - Feature correlation heatmap.
  - Visualizing rental demand patterns.
- Feature Engineering & Selection
  - Removing redundant columns (instant, atemp, registered, casual, dteday).
  - One-hot encoding categorical variables (season, month, hour, weekday, weather).
- Regression Models Implemented
  - Ordinary Least Squares (OLS) Regression
    - Closed-form solution using matrix inversion.
    - Categorical features enter the model through one-hot encoding.
  - Locally Weighted Regression (LWR)
    - Weights each training point by its proximity to the query point.
    - Fits a separate local model for each prediction; the bandwidth parameter τ controls how quickly the weights decay with distance.
  - Poisson Regression
    - Gradient descent implementation.
    - Evaluated with D², the fraction of Tweedie deviance explained.
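The three models listed above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the implementation in src/main.py: the function names, the Gaussian weighting kernel for LWR, the log link for Poisson regression, and the defaults `lr=0.01` and `n_iter=5000` are all hypothetical choices. The sketch also solves the normal equations with `np.linalg.solve` instead of forming an explicit inverse, which gives the same closed-form solution with better numerical behavior.

```python
import numpy as np

def fit_ols(X, y):
    """Closed-form OLS: solve the normal equations (X^T X) w = X^T y."""
    return np.linalg.solve(X.T @ X, X.T @ y)

def predict_lwr(X_train, y_train, x_query, tau):
    """Locally weighted prediction at a single query point.

    Each training point gets a Gaussian weight that decays with its
    squared distance to x_query; tau is the bandwidth.
    """
    sq_dist = np.sum((X_train - x_query) ** 2, axis=1)
    weights = np.exp(-sq_dist / (2.0 * tau ** 2))
    # X^T W computed by broadcasting, without building the diagonal matrix W.
    XtW = X_train.T * weights
    w = np.linalg.solve(XtW @ X_train, XtW @ y_train)
    return x_query @ w

def fit_poisson(X, y, lr=0.01, n_iter=5000):
    """Poisson regression with a log link, fitted by batch gradient descent."""
    w = np.zeros(X.shape[1])
    for _ in range(n_iter):
        mu = np.exp(X @ w)               # predicted mean count per row
        grad = X.T @ (mu - y) / len(y)   # gradient of the mean negative log-likelihood
        w -= lr * grad
    return w
```

`predict_lwr` solves a fresh weighted least-squares problem for every query point, which is why LWR becomes expensive on a dataset with many rows. If every level of a one-hot encoded feature is kept alongside an intercept column, `X.T @ X` is singular; dropping one level per feature or using `np.linalg.pinv` are common workarounds.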
- Temperature and hour of the day are the most significant predictors of bike rentals.
- Poisson regression outperforms OLS for count-based predictions.
- Feature engineering improves model accuracy.
- LWR captures non-linear patterns but is computationally expensive.
This project is licensed under the MIT License.