This project applies core data science techniques to analyze and predict climate change patterns using global temperature data and related indicators.
- Dataset selection from Kaggle
- Data cleaning in Excel and Tableau Prep
- Visual exploration using matplotlib and seaborn
- Predictive modeling using Random Forest Regression
- Climate anomaly detection with Z-scores and clustering
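
The Z-score step in the workflow above can be sketched roughly as follows. This is a minimal illustration, not the project's notebook code: the function name `zscore_anomalies`, the threshold of 2.0, and the sample values are all assumptions made for the example, and it uses only the Python standard library so it runs without the project's data.

```python
from statistics import mean, stdev


def zscore_anomalies(values, threshold=2.0):
    """Return (index, value, z) for points whose |z-score| exceeds threshold.

    Hypothetical helper for illustration; the threshold is an assumed default.
    """
    mu = mean(values)
    sigma = stdev(values)  # sample standard deviation
    if sigma == 0:
        return []  # all values identical, nothing can be flagged
    flagged = []
    for i, v in enumerate(values):
        z = (v - mu) / sigma
        if abs(z) > threshold:
            flagged.append((i, v, z))
    return flagged


# Made-up yearly temperature changes in degrees Celsius, for illustration only
temps = [0.1, 0.2, 0.15, 0.3, 0.25, 0.2, 2.9, 0.1, 0.35, 0.2]
anomalies = zscore_anomalies(temps, threshold=2.0)
for idx, value, z in anomalies:
    print(f"index {idx}: value {value} (z = {z:.2f})")
```

A single extreme value inflates both the mean and the standard deviation, so on short series a lower threshold or a median-based score may be worth trying.
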
- Climate Change Indicators (FAO)
- Global Weather Repository
- Other datasets evaluated but excluded: IMDb, Reddit Data is Ugly, Sleep Dataset
- Python (Pandas, Seaborn, Scikit-learn, Matplotlib)
- Jupyter Notebook
- Tableau Prep & Excel (for initial cleaning)
- GitHub for version control
- Model: Random Forest Regression
- Mean Squared Error: 0.1050 °C² (RMSE ≈ 0.32 °C) on target values ranging from -1 to 3 °C
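
For readers who want to see how a figure like the one above is produced, here is a hedged sketch of fitting a Random Forest regressor and scoring it with mean squared error in scikit-learn. The data is synthetic and stands in for the real dataset; the single `year` feature, the 80/20 split, `n_estimators=100`, and the random seeds are assumptions for the example, so the printed error will not match the reported 0.1050.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (NOT the project's dataset): a noisy warming trend
rng = np.random.default_rng(42)
years = np.arange(1961, 2023)
temp_change = 0.03 * (years - 1961) - 0.5 + rng.normal(0.0, 0.2, size=years.size)

X = years.reshape(-1, 1)  # single hypothetical feature: the year
y = temp_change

# Hold out 20% of the rows for evaluation
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

model = RandomForestRegressor(n_estimators=100, random_state=42)
model.fit(X_train, y_train)

y_pred = model.predict(X_test)
mse = mean_squared_error(y_test, y_pred)
rmse = float(np.sqrt(mse))  # same units as the target, easier to interpret
print(f"MSE: {mse:.4f}, RMSE: {rmse:.4f}")
```

Taking the square root of the MSE gives an error in the same units as the target, which is easier to compare against the -1 to 3 °C range. Note that a random split mixes past and future years; for forecasting, a chronological split is usually the fairer test, since tree-based models cannot extrapolate beyond the target range seen in training.
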
- Add ARIMA time-series prediction
- Integrate weather dataset more deeply
- Explore neural networks (TensorFlow)
MIT License - see the LICENSE file for details.