A comprehensive Python toolkit for retrieving, processing, and visualizing satellite and reanalysis data from three sources: Sentinel-5P, MODIS, and ERA5. The toolkit focuses on atmospheric data, including air pollutants (NO₂, CO, SO₂, O₃, HCHO), aerosol optical depth, and meteorological parameters.
Multi-platform Support:
- Sentinel-5P: Trace gases and air pollutants
- MODIS: Aerosol optical depth (AOD) measurements
- ERA5: Reanalysis of atmospheric, land, and oceanic climate variables
Unified Data Access:
- Automated data retrieval from Copernicus Open Access Hub, NASA Earthdata, and Climate Data Store
- Consistent API across different data sources
Advanced Processing:
- Quality control and filtering
- Spatial interpolation and regridding
- Temporal aggregation
- Station-based data extraction
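Station-based extraction boils down to sampling the nearest grid cell for each site. The following is a minimal sketch using xarray; the file path and the `blh` variable name are placeholders for illustration, not part of the toolkit's API.

```python
# Minimal sketch: nearest-grid-point extraction for a few stations.
# The file path and the "blh" variable name are assumptions for illustration.
import xarray as xr

STATIONS = [
    {"name": "FS", "lat": 22.6294, "lon": 120.3461},
    {"name": "TP", "lat": 25.0330, "lon": 121.5654},
]

ds = xr.open_dataset("output/era5_blh.nc")  # placeholder path

series = {
    s["name"]: ds["blh"].sel(latitude=s["lat"], longitude=s["lon"], method="nearest")
    for s in STATIONS
}

# Each entry is a time series at the station's nearest grid cell.
series["TP"].to_dataframe().to_csv("TP_blh.csv")
```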
Visualization:
- High-quality concentration and parameter maps
- Customizable geographic boundaries
- Time series analysis capabilities
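As a rough illustration of the mapping output, the sketch below draws a gridded field over a lon/lat extent with Cartopy and Matplotlib; the gridded array is a synthetic stand-in for processed toolkit output.

```python
# Minimal sketch: plot a gridded field over a lon/lat extent with coastlines.
# The data here are synthetic; real input would come from the processed NetCDF.
import numpy as np
import matplotlib.pyplot as plt
import cartopy.crs as ccrs

lon = np.linspace(120, 122, 100)
lat = np.linspace(22, 25, 100)
field = np.random.rand(lat.size, lon.size)  # stand-in for an NO2 grid

ax = plt.axes(projection=ccrs.PlateCarree())
ax.set_extent([120, 122, 22, 25], crs=ccrs.PlateCarree())
mesh = ax.pcolormesh(lon, lat, field, transform=ccrs.PlateCarree(), cmap="viridis")
ax.coastlines(resolution="10m")
plt.colorbar(mesh, ax=ax, label="NO2 column (arbitrary units)")
plt.savefig("no2_map.png", dpi=300, bbox_inches="tight")
```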
Resource Management:
- Efficient download handling with caching
- Built-in file retention management
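The caching behaviour amounts to skipping downloads whose target file already exists. A minimal sketch of that idea, with a hypothetical `download_fn` callable standing in for the real downloader:

```python
# Minimal sketch: skip a download when the target file is already cached.
# "download_fn" is a hypothetical callable, not part of the toolkit's API.
from pathlib import Path

def cached_download(url: str, dest: Path, download_fn) -> Path:
    dest = Path(dest)
    if dest.exists() and dest.stat().st_size > 0:
        return dest                      # cache hit: reuse the existing file
    dest.parent.mkdir(parents=True, exist_ok=True)
    download_fn(url, dest)               # cache miss: fetch and store
    return dest
```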
Before using this toolkit, you need to complete the following steps:
Copernicus Account (for Sentinel-5P and ERA5):
- Register for a free account at Copernicus Open Access Hub
- For ERA5, also register at Climate Data Store
NASA Earthdata Account (for MODIS):
- Register at NASA Earthdata
Environment Configuration:
- Create a `.env` file in the project root directory with your credentials:

```
# Sentinel-5P credentials
COPERNICUS_USERNAME=your_username
COPERNICUS_PASSWORD=your_password

# ERA5 credentials
CDSAPI_URL=https://cds.climate.copernicus.eu/api/v2
CDSAPI_KEY=your_key

# NASA Earthdata credentials
EARTHDATA_USERNAME=your_username
EARTHDATA_PASSWORD=your_password
```
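To check that the credentials are readable from Python, one option is python-dotenv, sketched below; this is only a verification aid, and the toolkit may load the variables differently internally.

```python
# Quick check that the .env credentials are readable from Python.
# Assumes python-dotenv is installed (pip install python-dotenv).
import os
from dotenv import load_dotenv

load_dotenv()  # reads .env from the current working directory

for key in ("COPERNICUS_USERNAME", "CDSAPI_KEY", "EARTHDATA_USERNAME"):
    print(key, "set" if os.getenv(key) else "MISSING")
```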
Installation:

```bash
# Clone the repository
git clone https://github.com/Alex870521/Satellite_S5P.git
# Navigate to the directory
cd Satellite_S5P
# Install required packages
pip install -r requirements.txt
```

```python
"""SENTINEL-5P Data Processing Example"""
from datetime import datetime
from src.api import SENTINEL5PHub
# 1. Set parameters
start_date, end_date = datetime(2025, 3, 1), datetime(2025, 3, 13)
# File class: 'NRTI' (Near Real-Time) or 'OFFL' (Offline processed)
file_class = 'NRTI'
# Available file types: 'NO2___', 'O3____', 'CO____', 'SO2___', 'CH4___', 'CLOUD_', 'AER_AI'
file_type = 'NO2___'
# Define region boundary (min_lon, max_lon, min_lat, max_lat)
boundary = (120, 122, 22, 25)
# 2. Create data hub instance
sentinel_hub = SENTINEL5PHub(max_workers=3)
# 3. Fetch data
products = sentinel_hub.fetch_data(
    file_class=file_class,
    file_type=file_type,
    start_date=start_date,
    end_date=end_date,
    boundary=boundary
)
# 4. Download data
sentinel_hub.download_data(products)
# 5. Process data
sentinel_hub.process_data()
```

```python
"""MODIS Data Processing Example"""
from datetime import datetime
from src.api import MODISHub
# 1. Set parameters
start_date, end_date = datetime(2025, 3, 1), datetime(2025, 3, 12)
# Product types: 'MOD04' (Terra) or 'MYD04' (Aqua)
modis_product_type = "MYD04"
# 2. Create data hub instance
modis_hub = MODISHub()
# 3. Fetch data
products = modis_hub.fetch_data(
    file_type=modis_product_type,
    start_date=start_date,
    end_date=end_date
)
# 4. Download data
modis_hub.download_data(products)
# 5. Process data
modis_hub.process_data()
```

```python
"""ERA5 Data Processing Example"""
from datetime import datetime
from src.api import ERA5Hub
# 1. Set parameters
start_date, end_date = datetime(2025, 3, 1), datetime(2025, 3, 19)
# Variables to retrieve (more options available)
variables = ['boundary_layer_height']
# Pressure levels in hPa (set to None for surface data only)
pressure_levels = None
# Region boundary (min_lon, max_lon, min_lat, max_lat)
boundary = (119, 123, 21, 26)
# Define observation stations
STATIONS = [
    {"name": "FS", "lat": 22.6294, "lon": 120.3461},  # Kaohsiung Fengshan
    {"name": "NZ", "lat": 22.7422, "lon": 120.3339},  # Kaohsiung Nanzi
    {"name": "TH", "lat": 24.1817, "lon": 120.5956},  # Taichung
    {"name": "TP", "lat": 25.0330, "lon": 121.5654}   # Taipei
]
# 2. Create data hub instance
era5_hub = ERA5Hub(timezone='Asia/Taipei')
# 3. Fetch data
era5_hub.fetch_data(
    start_date=start_date,
    end_date=end_date,
    boundary=boundary,
    variables=variables,
    pressure_levels=pressure_levels,
    stations=STATIONS,
)
# 4. Download data
era5_hub.download_data()
# 5. Process data
era5_hub.process_data()
```
Sentinel-5P:
- Provider: European Space Agency (ESA)
- Products: NO₂, O₃, CO, SO₂, HCHO, Cloud, Aerosol Index
- Resolution: 7 km x 3.5 km (at nadir)
- Frequency: Daily global coverage
MODIS:
- Provider: NASA
- Products: Aerosol Optical Depth (AOD)
- Platforms: Terra (MOD04) and Aqua (MYD04) satellites
- Resolution: 10 km at nadir
- Frequency: 1-2 days global coverage
ERA5:
- Provider: European Centre for Medium-Range Weather Forecasts (ECMWF)
- Products: Reanalysis dataset with 100+ atmospheric, land and oceanic parameters
- Resolution: 0.25° x 0.25° global grid (about 31 km)
- Frequency: Hourly data, monthly updates
All data sources follow a consistent workflow:
- Data Discovery: Query available products based on date range and region
- Download Management: Efficient parallel downloading with error handling
- Quality Control: Filtering based on data quality flags
- Spatial Processing:
  - Sentinel-5P: RBF interpolation of sparse satellite data (see the sketch after this list)
  - MODIS: Processing of gridded AOD values
  - ERA5: Extraction of point values for weather stations
- Visualization: Generation of standardized maps and plots
- Export: Structured data storage in NetCDF and CSV formats
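For the Sentinel-5P step above, RBF interpolation maps sparse along-track pixels onto a regular lat/lon grid. Below is a minimal sketch using SciPy's RBFInterpolator with synthetic scattered points standing in for real L2 pixels; it is not the toolkit's own implementation.

```python
# Minimal sketch: interpolate scattered satellite pixels onto a regular grid
# with a radial basis function. The scattered points here are synthetic.
import numpy as np
from scipy.interpolate import RBFInterpolator

rng = np.random.default_rng(0)
pts = rng.uniform([120, 22], [122, 25], size=(500, 2))   # (lon, lat) samples
vals = np.sin(pts[:, 0]) + np.cos(pts[:, 1])              # stand-in retrievals

lon = np.linspace(120, 122, 100)
lat = np.linspace(22, 25, 100)
grid_lon, grid_lat = np.meshgrid(lon, lat)
grid_pts = np.column_stack([grid_lon.ravel(), grid_lat.ravel()])

rbf = RBFInterpolator(pts, vals, neighbors=50, kernel="thin_plate_spline")
field = rbf(grid_pts).reshape(grid_lat.shape)             # regular 100x100 grid
```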
The toolkit includes built-in data retention management to prevent disk space issues:
- Automatically cleans files older than the configured retention period
- Maintains directory structure while removing outdated files
- Can be scheduled for periodic execution or triggered manually
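A minimal sketch of such a cleanup pass, assuming a retention window in days and using only the standard library; the `data` root path and the 30-day window are placeholders.

```python
# Minimal sketch: delete files older than a retention window, keep directories.
# The root path and the 30-day window are assumptions for illustration.
import time
from pathlib import Path

RETENTION_DAYS = 30
cutoff = time.time() - RETENTION_DAYS * 86400

for path in Path("data").rglob("*"):
    if path.is_file() and path.stat().st_mtime < cutoff:
        path.unlink()  # remove outdated file; directory structure is untouched
```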
For bug reports and feature requests, please visit GitHub Issues.