ayoub-chaieb/Historical-Weather-Forecast-Comparison-to-Actuals

🌦️ Casablanca Daily Weather ETL Pipeline using Bash

✅ Project Overview

This project implements an automated ETL (Extract, Transform, Load) process that collects, transforms, and stores weather data for Casablanca, Morocco, in order to evaluate the accuracy of daily temperature forecasts.


🧭 Project Goals (All Achieved)

| Task | Status |
| --- | --- |
| Initialize weather log file (`rx_poc.log`) | ✅ Completed |
| Write ETL Bash script to download, parse, and log weather data | ✅ Completed |
| Automate script execution daily at noon via cron | ✅ Completed |
| Write accuracy analysis script (`fc_accuracy.sh`) | ✅ Completed |
| Calculate daily forecast accuracy and label it | ✅ Completed |
| Append daily accuracy report to `historical_fc_accuracy.tsv` | ✅ Completed |
| Generalize script to compute historical accuracy for multiple days | ✅ Completed |
| Create weekly stats script to report min/max absolute forecast error | ✅ Completed |

____________________________________________________________________________________________________________________________
Weather report: Casablanca

     \  /       Partly cloudy
   _ /"".-.     21 °C          
     \_(   ).   ↘ 5 km/h       
     /(___(__)  10 km          
                0.0 mm         
                                                       ┌─────────────┐                                                       
┌──────────────────────────────┬───────────────────────┤  Sat 05 Jul ├───────────────────────┬──────────────────────────────┐
│            Morning           │             Noon      └──────┬──────┘     Evening           │             Night            │
├──────────────────────────────┼──────────────────────────────┼──────────────────────────────┼──────────────────────────────┤
│     \   /     Sunny          │     \   /     Sunny          │     \   /     Sunny          │     \   /     Clear          │
│      .-.      +23(25) °C     │      .-.      +25(26) °C     │      .-.      +23(25) °C     │      .-.      21 °C          │
│   ― (   ) ―   ↓ 8-9 km/h     │   ― (   ) ―   ↘ 15-17 km/h   │   ― (   ) ―   ↘ 15-17 km/h   │   ― (   ) ―   ↓ 8-11 km/h    │
│      `-’      10 km          │      `-’      10 km          │      `-’      10 km          │      `-’      10 km          │
│     /   \     0.0 mm | 0%    │     /   \     0.0 mm | 0%    │     /   \     0.0 mm | 0%    │     /   \     0.0 mm | 0%    │
└──────────────────────────────┴──────────────────────────────┴──────────────────────────────┴──────────────────────────────┘
                                                       ┌─────────────┐                                                       
┌──────────────────────────────┬───────────────────────┤  Sun 06 Jul ├───────────────────────┬──────────────────────────────┐
│            Morning           │             Noon      └──────┬──────┘     Evening           │             Night            │
├──────────────────────────────┼──────────────────────────────┼──────────────────────────────┼──────────────────────────────┤
│     \   /     Sunny          │     \   /     Sunny          │     \   /     Sunny          │     \   /     Clear          │
│      .-.      +23(25) °C     │      .-.      +25(26) °C     │      .-.      +24(26) °C     │      .-.      22 °C          │
│   ― (   ) ―   ↘ 8-9 km/h     │   ― (   ) ―   ↘ 15-17 km/h   │   ― (   ) ―   ↘ 14-16 km/h   │   ― (   ) ―   ↘ 8-12 km/h    │
│      `-’      10 km          │      `-’      10 km          │      `-’      10 km          │      `-’      10 km          │
│     /   \     0.0 mm | 0%    │     /   \     0.0 mm | 0%    │     /   \     0.0 mm | 0%    │     /   \     0.0 mm | 0%    │
└──────────────────────────────┴──────────────────────────────┴──────────────────────────────┴──────────────────────────────┘
                                                       ┌─────────────┐                                                       
┌──────────────────────────────┬───────────────────────┤  Mon 07 Jul ├───────────────────────┬──────────────────────────────┐
│            Morning           │             Noon      └──────┬──────┘     Evening           │             Night            │
├──────────────────────────────┼──────────────────────────────┼──────────────────────────────┼──────────────────────────────┤
│     \   /     Sunny          │     \   /     Sunny          │     \   /     Sunny          │  _`/"".-.     Patchy rain ne…│
│      .-.      +24(25) °C     │      .-.      +26(27) °C     │      .-.      +24(26) °C     │   ,\_(   ).   +22(25) °C     │
│   ― (   ) ―   → 8-10 km/h    │   ― (   ) ―   ↘ 14-16 km/h   │   ― (   ) ―   ↘ 15-17 km/h   │    /(___(__)  ↘ 5-7 km/h     │
│      `-’      10 km          │      `-’      10 km          │      `-’      10 km          │      ‘ ‘ ‘ ‘  10 km          │
│     /   \     0.0 mm | 0%    │     /   \     0.0 mm | 0%    │     /   \     0.0 mm | 0%    │     ‘ ‘ ‘ ‘   0.0 mm | 72%   │
└──────────────────────────────┴──────────────────────────────┴──────────────────────────────┴──────────────────────────────┘
Location: Casablanca ⵜⵉⴳⵎⵉ ⵜⵓⵎⵍⵉⵍⵜ الدار البيضاء, préfecture d'arrondissements de Casablanca-Anfa عمالة مقاطعات الدار البيضاء أنفا, Pachalik de Casablanca, Préfecture de Casablanca عمالة الدار البيضاء, Casablanca-Settat ⵜⵉⴳⵎⵉ ⵜⵓⵎⵍⵉⵍⵜ-ⵙⵟⵟⴰⵜ الدار البيضاء-سطات, ⵍⵎⵖⵔⵉⴱ المغرب [33.5949733,-7.6188008]

Follow @igor_chubin for wttr.in updates
_____________________________________________________________________________________________________________________________

📈 ETL Report Sample (rx_poc.log)

year    month   day     obs_temp    fc_temp
2023    01      01      10          11
2023    01      02      11          12
2023    01      03      12          10
...

🧠 Accuracy Analysis Output (historical_fc_accuracy.tsv)

year    month   day     obs_temp    fc_temp    accuracy    accuracy_range
2023    01      02      11          12         -1          excellent
2023    01      03      12          10         2           good
2023    01      04      13          13         0           excellent
...
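
The accuracy step can be illustrated with a minimal sketch. It is not the project's actual `fc_accuracy.sh`: the function name `fc_accuracy` is hypothetical, the sign convention (observed minus forecast) is read off the sample rows above, and the `fair`/`poor` labels and the cutoffs of 1, 2 and 3 degrees are assumptions. The sketch also assumes that each `fc_temp` in `rx_poc.log` is the forecast for the following day's noon, so it compares each day's observation with the forecast logged on the previous row.

```shell
#!/usr/bin/env bash
# Sketch only: labels, cutoffs and row pairing are assumptions, not the
# project's confirmed logic.

# Read a tab-separated log (year, month, day, obs_temp, fc_temp) and print
# one accuracy row per day, starting with the second data row because the
# first one has no earlier forecast to compare against.
fc_accuracy() {
    awk -F'\t' -v OFS='\t' '
        NR == 1 { print $0, "accuracy", "accuracy_range"; next }
        NR > 2 {
            diff = $4 - prev_fc              # observed minus forecast
            err = diff < 0 ? -diff : diff    # absolute error
            if (err <= 1)      label = "excellent"
            else if (err <= 2) label = "good"
            else if (err <= 3) label = "fair"
            else               label = "poor"
            print $1, $2, $3, $4, prev_fc, diff, label
        }
        { prev_fc = $5 }                     # remember the forecast for the next row
    ' "$1"
}

# Example call (paths are placeholders):
# fc_accuracy rx_poc.log > historical_fc_accuracy.tsv
```

Processing the whole log in a single `awk` pass produces the multi-day historical report in one run. A daily variant would only need the last two rows of the log.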

📊 Weekly Stats Example Output

Minimum absolute forecast error (last 7 days): 0
Maximum absolute forecast error (last 7 days): 3
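
One way to produce a report like the one above is sketched below. The function name `weekly_stats` is hypothetical, and the sketch assumes that the accuracy value sits in the sixth tab-separated column of `historical_fc_accuracy.tsv`, as in the sample shown earlier.

```shell
#!/usr/bin/env bash
# Sketch only: the column position and file layout are assumptions based on
# the sample output in this README.

# Print the smallest and largest absolute forecast error found in the last
# seven rows of the accuracy file.
weekly_stats() {
    tail -n 7 "$1" | awk -F'\t' '
        $6 ~ /^-?[0-9]+$/ {                 # ignores the header on short files
            err = $6 + 0
            if (err < 0) err = -err         # absolute value
            if (n == 0 || err < min) min = err
            if (n == 0 || err > max) max = err
            n++
        }
        END {
            if (n == 0) exit 1              # no usable rows
            print "Minimum absolute forecast error (last 7 days): " min
            print "Maximum absolute forecast error (last 7 days): " max
        }'
}

# Example call (path is a placeholder):
# weekly_stats historical_fc_accuracy.tsv
```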

🔁 Automation Details

The ETL process (rx_poc.sh) is executed daily at noon local time in Casablanca using a cron job configured as follows:

0 12 * * * /path/to/rx_poc.sh >> /path/to/rx_poc.log 2>&1

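This ensures consistent daily collection of weather data for ongoing accuracy monitoring. Because cron schedules jobs in the system's time zone, the host needs to be set to Casablanca time for the job to fire at local noon.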


The system was built as a **proof-of-concept** to monitor and measure discrepancies between forecasted and observed temperatures, as a foundation for a broader analytics initiative.

## 📌 Objectives Achieved

* ✔️ Automated the extraction of **daily weather data** using `curl` from [wttr.in](https://wttr.in).
* ✔️ Parsed and transformed raw text output to isolate:

  * Observed temperature at **noon (local time)**.
  * Forecasted temperature for **noon the following day**.
* ✔️ Loaded clean, structured data into a **tab-separated log file**, forming a historical report.
* ✔️ Scheduled the process using a **cron job**, ensuring daily execution at the specified time.
* ✔️ Designed the output in a **tabular format**, ready for further analysis and modeling.
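
The extract, transform and load steps listed above could be sketched as follows. This is not the project's actual `rx_poc.sh`: the function names are hypothetical, and the parsing rules are inferred from the layout of a plain-text wttr.in report. They assume that the first line containing `°C` is the current reading, and that the third such line is tomorrow's forecast row, with the noon cell as its second column.

```shell
#!/usr/bin/env bash
# Sketch only: parsing rules are assumptions about the wttr.in report layout.

# Print the first integer of a string passed on stdin, without a leading "+".
first_int='if (match(s, /[+-]?[0-9]+/)) { t = substr(s, RSTART, RLENGTH); sub(/^\+/, "", t); print t }'

# Observed temperature: first line of the report that contains "°C".
parse_obs_temp() {
    awk "/°C/ { s = \$0; $first_int; exit }" "$1"
}

# Forecast temperature: third "°C" line; cells are separated by "│" and the
# noon cell is field 3 because the line starts with a separator.
parse_fc_temp() {
    awk "/°C/ && ++n == 3 { split(\$0, cell, \"│\"); s = cell[3]; $first_int; exit }" "$1"
}

# Download one report and append a tab-separated row to the log.
run_etl() {
    local city=$1 log=$2 report year month day
    report=$(mktemp) || return 1
    # The T option asks wttr.in for plain text without ANSI colour codes.
    if ! curl -fsS "https://wttr.in/${city}?T" -o "$report"; then
        rm -f "$report"
        return 1
    fi
    # Write the header once, when the log is missing or empty.
    [ -s "$log" ] || printf 'year\tmonth\tday\tobs_temp\tfc_temp\n' > "$log"
    read -r year month day < <(TZ='Africa/Casablanca' date '+%Y %m %d')
    printf '%s\t%s\t%s\t%s\t%s\n' "$year" "$month" "$day" \
        "$(parse_obs_temp "$report")" "$(parse_fc_temp "$report")" >> "$log"
    rm -f "$report"
}

# Example call (log path is a placeholder):
# run_etl Casablanca rx_poc.log
```

Saving the report to a temporary file means both values are parsed from the same download, so the observed and forecast readings come from a single consistent report.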

## 🧪 Sample Output

```plaintext
year    month   day     obs_temp    fc_temp
2023    01      01      10          11
2023    01      02      11          12
2023    01      03      12          10
2023    01      04      13          13
2023    01      05      10          9
2023    01      06      11          10
```

## 🛠 Technologies Used

* 🐧 Bash shell scripting
* Cron scheduler
* 🌐 `curl` for HTTP-based data retrieval
* 📦 [wttr.in](https://wttr.in) as the weather data provider

## 📈 Future Enhancements

This POC covers a single city and a single data source. The same script structure could be extended to support:

* 🌍 Multiple locations
* 📡 Multiple forecast sources
* ⏱️ Configurable update frequencies
* 🌪️ Additional weather metrics (wind, visibility, precipitation, and so on)

## 🧪 Learning Outcomes

By completing this project, the following competencies were reinforced:

* ✅ Bash scripting for automation
* ✅ Web scraping with `curl`
* ✅ Text parsing and data cleaning with Unix tools
* ✅ Cron job scheduling
* ✅ Forecast accuracy calculation and categorization
* ✅ Weekly statistical reporting with basic Bash logic
