Eagle-Rock-Analytics/historical-obs-platform

Historical Observations Data Platform

License: GPL v3 | Code style: black | DOI: 10.5281/zenodo.16370140

Code for Eagle Rock Analytics' cloud-based, historical weather observations data platform

The Historical Observations Data Platform is a cloud-based platform for historical weather observations that gives California's energy sector access to high-quality, open climate and weather data. This work is supported by California Energy Commission grant PIR-19-006. This repository contains the code (Python scripts and Jupyter notebooks) for the full processing pipeline that ingests data into the Historical Data Platform.

The Platform responds to community partner needs in understanding weather and climate information, including the severity, duration, frequency, and rate of change over time of extreme weather events, and it supports efforts to downscale climate projections. We implement stringent Quality Assurance/Quality Control (QA/QC) procedures in line with international protocols, with customized modifications relevant to the energy sector (such as temperature and precipitation extremes, winds, and solar radiation).

Warning

This project is still under active development.

📊 About the data

The Platform has sourced station data from 27 publicly available historical observation networks across the Western Electricity Coordinating Council (WECC) domain for 1980-2022 (the time period varies between networks and stations). In total, 14,927 stations have completed the full quality control and standardization pipelines and are publicly available as cloud-optimized Zarr stores in Amazon S3 storage.
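As a rough illustration of reading one station's Zarr store from S3, here is a minimal sketch. The bucket name, the `network/station_id.zarr` key layout, and the helper names `station_zarr_uri` and `open_station` are all hypothetical placeholders, not the Platform's actual layout or API; the `data-access` folder holds the project's real examples. The sketch also assumes `xarray`, `zarr`, and `s3fs` are installed and that the bucket allows anonymous reads.

```python
def station_zarr_uri(bucket: str, network: str, station_id: str) -> str:
    """Build an S3 URI for one station's Zarr store.

    The "<bucket>/<network>/<station_id>.zarr" layout is an assumption
    made for illustration only.
    """
    return f"s3://{bucket}/{network}/{station_id}.zarr"


def open_station(uri: str):
    """Lazily open a Zarr store as an xarray Dataset.

    Assumes anonymous (public) read access; xarray is imported here so
    the URI helper above stays usable without it.
    """
    import xarray as xr

    return xr.open_zarr(uri, storage_options={"anon": True})


# Hypothetical names; substitute the real bucket, network, and station ID.
uri = station_zarr_uri("example-bucket", "NETWORK", "NETWORK_STATION1")
# ds = open_station(uri)  # requires network access and a real store
```

Because Zarr stores are chunked, opening a dataset this way reads only metadata up front; data are fetched when a variable is actually computed or plotted.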

The following figure shows the locations of all the stations (by network) that have completed our quality control and standardization process:

Station coverage map

The following figure shows the number of observations over the project's time period:

Merge stations over time


🗂 Repository Structure

historical-obs-platform/
├── data/                      # Miscellaneous supporting data
├── data-access/               # Code examples for accessing our data
├── environment/               # Files for building the computational environment, including a README with further instructions
├── figures/                   # Visualizations
├── notebooks/                 # Jupyter notebooks for data visualization and analysis 
└── scripts/                   # Data processing code for all steps of the QA/QC process
    ├── 1_pull_data/           # Scripts for retrieving/scraping network station data from their respective sources
    ├── 2_clean_data/          # Scripts for cleaning individual networks to a consistent standard
    ├── 3_qaqc_data/           # Scripts to QA/QC stations
    ├── 4_merge_data/          # Scripts to close out processing and standardize to hourly timesteps. Data at this stage are fully processed.
    ├── misc/                  # Scripts that don't fit into any other category
    ├── pcluster/              # Code and shell scripts for running QA/QC and merge scripts in an AWS pcluster environment
    └── tests/                 # Scripts for testing finalized data products
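To make the "standardize to hourly timesteps" step concrete, here is a minimal sketch using only the Python standard library. It assumes sub-hourly values are combined with a simple mean per clock hour; this README does not state the aggregation rule the merge scripts actually use, and the function name `to_hourly_mean` is hypothetical.

```python
from collections import defaultdict
from datetime import datetime
from statistics import fmean


def to_hourly_mean(observations):
    """Average (timestamp, value) pairs into one value per clock hour.

    Assumption: a plain mean per hour. Real pipelines may instead pick
    the observation nearest the top of the hour, or sum variables such
    as precipitation.
    """
    buckets = defaultdict(list)
    for timestamp, value in observations:
        # Floor each timestamp to the start of its hour.
        hour = timestamp.replace(minute=0, second=0, microsecond=0)
        buckets[hour].append(value)
    # Return hours in chronological order.
    return {hour: fmean(values) for hour, values in sorted(buckets.items())}


obs = [
    (datetime(2000, 1, 1, 0, 5), 10.0),
    (datetime(2000, 1, 1, 0, 35), 12.0),
    (datetime(2000, 1, 1, 1, 0), 8.0),
]
hourly = to_hourly_mean(obs)
```

Here the two readings in the 00:00 hour collapse to a single value of 11.0, and the 01:00 reading passes through unchanged.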

🛠️ Computational Environment

See the environment folder for instructions and files for building the computational environment for this project.

🔏 License

This project is licensed under the GNU GPLv3 - see the LICENSE file for details.

🙋 Support

🧑‍💻 Contributors

Contributors