Skip to content

Commit 88fa819

Browse files
authored
Probability forecasts (#207)
* Intial commit * Run black * Minor fixes * Rewrite with convolutions * Documentation * Clip values btw 0 and 1 To avoid rounding errors * Add basic testing * Cleanup test * Remove duplicated line * Improve docstrings * Workaround for TypeError issue with Pillow 8.3 * Small refactoring Improve docstrings/comments; assign treshold value to missing data * Add constraints to 'timesteps' argument * Use >= thr instead of > thr * More testing * Add example * Fix black * Fix example * Add lagrangian probability method * Improve docstrings
1 parent 69061dd commit 88fa819

File tree

8 files changed

+375
-11
lines changed

8 files changed

+375
-11
lines changed

doc/source/pysteps_reference/nowcasts.rst

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -8,6 +8,7 @@ Implementation of deterministic and ensemble nowcasting methods.
88
.. automodule:: pysteps.nowcasts.interface
99
.. automodule:: pysteps.nowcasts.anvil
1010
.. automodule:: pysteps.nowcasts.extrapolation
11+
.. automodule:: pysteps.nowcasts.lagrangian_probability
1112
.. automodule:: pysteps.nowcasts.sprog
1213
.. automodule:: pysteps.nowcasts.sseps
1314
.. automodule:: pysteps.nowcasts.steps

doc/source/references.bib

Lines changed: 11 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -84,6 +84,17 @@ @ARTICLE{GZ2002
8484
DOI = "10.1175/1520-0493(2002)130<2859:SDOTPO>2.0.CO;2"
8585
}
8686

87+
@ARTICLE{GZ2004,
88+
AUTHOR = "U. Germann and I. Zawadzki",
89+
TITLE = "Scale-Dependence of the Predictability of Precipitation from Continental Radar Images. {P}art {II}: Probability Forecasts",
90+
JOURNAL = "Journal of Applied Meteorology",
91+
VOLUME = 43,
92+
NUMBER = 1,
93+
PAGES = "74--89",
94+
YEAR = 2004,
95+
DOI = "10.1175/1520-0450(2004)043<0074:SDOTPO>2.0.CO;2"
96+
}
97+
8798
@ARTICLE{Her2000,
8899
AUTHOR = "H. Hersbach",
89100
TITLE = "Decomposition of the Continuous Ranked Probability Score for Ensemble Prediction Systems",

doc/source/user_guide/machine_learning_pysteps.rst

Lines changed: 7 additions & 7 deletions
Original file line numberDiff line numberDiff line change
@@ -7,16 +7,16 @@ How to correctly compare the accuracy of machine learning against traditional no
77
Before starting the comparison, you need to ask yourself what is the objective of nowcasting:
88

99
#. Do you only want to minimize prediction errors?
10-
#. Do you also want to represent the prediction uncertainty?
10+
#. Do you also want to represent the prediction uncertainty?
1111

1212
To achieve objective 1, it is sufficient to produce a single deterministic nowcast that filters out the unpredictable small-scale precipitation features.
1313
However, this will create a nowcast that will become increasingly smooth over time.
1414

1515
To achieve objective 2, you need to produce a probabilistic or an ensemble nowcast (several ensemble members and realizations).
1616

17-
In weather forecasting (and nowcasting) we usually want to achieve both goals because it is impossible to predict the evolution of a chaotic system with 100% accuracy, especially space-time precipitation fields and thunderstorms!
17+
In weather forecasting (and nowcasting) we usually want to achieve both goals because it is impossible to predict the evolution of a chaotic system with 100% accuracy, especially space-time precipitation fields and thunderstorms!
1818

19-
Machine learning and pysteps offer several methods to produce both deterministic and probabilistic nowcasts.
19+
Machine learning and pysteps offer several methods to produce both deterministic and probabilistic nowcasts.
2020
Therefore, if you want to compare machine learning-based nowcasts to simpler extrapolation-based models, you need to select the right method.
2121

2222
1. Deterministic nowcasting
@@ -27,7 +27,7 @@ Deterministic nowcasts can be divided into:
2727
a. Variance-preserving nowcasts, such as extrapolation nowcasts by Eulerian and Lagrangian persistence.
2828
b. Error-minimization nowcasts, such as machine learning, Fourier-filtered and ensemble mean nowcasts.
2929

30-
**Very important**: these two types of deterministic nowcasts are not directly comparable because they have a different variance!
30+
**Very important**: these two types of deterministic nowcasts are not directly comparable because they have a different variance!
3131
This is best explained by the decomposition of the mean squared error (MSE):
3232

3333
:math:`MSE = bias^2 + Var`
@@ -41,7 +41,7 @@ Therefore, it is better to avoid directly comparing an error-minimization machin
4141
A deterministic equivalent of the ensemble mean can be approximated using the modules :py:mod:`pysteps.nowcasts.sprog` or :py:mod:`pysteps.nowcasts.anvil`.
4242
Another possibility, but more computationally demanding, is to average many ensemble members generated by the modules :py:mod:`pysteps.nowcasts.steps` or :py:mod:`pysteps.nowcasts.sseps`.
4343

44-
Still, even by using the pysteps ensemble mean, it is not given that its variance will be the same as the one of machine learning predictions.
44+
Still, even by using the pysteps ensemble mean, it is not given that its variance will be the same as the one of machine learning predictions.
4545
Possible solutions are to:
4646

4747
#. use a normalized MSE (NMSE) or another score accounting for differences in the variance between prediction and observation.
@@ -94,10 +94,10 @@ The comparison of methods from different types should only be done carefully and
9494
- MSE, RMSE, MAE, ETS, etc or better normalized scores, etc
9595
* - Probabilistic (quantile-based)
9696
- Quantile ANN, quantile random forests, quantile regression
97-
- Probabilities derived from :py:mod:`pysteps.nowcasts.steps`/:py:mod:`~pysteps.nowcasts.sseps`
97+
- :py:mod:`pysteps.nowcasts.lagrangian_probability` or probabilities derived from :py:mod:`pysteps.nowcasts.steps`/:py:mod:`~pysteps.nowcasts.sseps`
9898
- Reliability diagram (predicted vs observed quantile), probability integral transform (PIT) histogram
9999
* - Probabilistic (ensemble-based)
100100
- GANs, VAEs, etc
101101
- Ensemble and probabilities derived from :py:mod:`pysteps.nowcasts.steps`/:py:mod:`~pysteps.nowcasts.sseps`
102-
- Probabilistic verification: reliability diagrams, continuous ranked probability scores (CRPS), etc.
102+
- Probabilistic verification: reliability diagrams, continuous ranked probability scores (CRPS), etc.
103103
Ensemble verification: rank histograms, spread-error relationships, etc

examples/probability_forecast.py

Lines changed: 130 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,130 @@
1+
#!/bin/env python
2+
"""
3+
Probability forecasts
4+
=====================
5+
6+
This example script shows how to forecast the probability of exceeding an
7+
intensity threshold.
8+
9+
The method is based on the local Lagrangian approach described in Germann and
10+
Zawadzki (2004).
11+
"""
12+
13+
import matplotlib.pyplot as plt
14+
import numpy as np
15+
16+
from pysteps.nowcasts.lagrangian_probability import forecast
17+
18+
###############################################################################
19+
# Numerical example
20+
# -----------------
21+
22+
# parameters
23+
precip = np.zeros((100, 100))
24+
precip[10:50, 10:50] = 1
25+
velocity = np.ones((2, *precip.shape))
26+
timesteps = [0, 2, 6, 12]
27+
thr = 0.5
28+
slope = 1 # pixels / timestep
29+
30+
# compute probability forecast
31+
out = forecast(precip, velocity, timesteps, thr, slope=slope)
32+
# plot
33+
for n, frame in enumerate(out):
34+
plt.subplot(2, 2, n + 1)
35+
plt.imshow(frame, interpolation="nearest", vmin=0, vmax=1)
36+
plt.title(f"t={timesteps[n]}")
37+
plt.xticks([])
38+
plt.yticks([])
39+
plt.show()
40+
41+
###############################################################################
42+
# Real-data example
43+
# -----------------
44+
45+
from datetime import datetime
46+
47+
from pysteps import io, rcparams, utils
48+
from pysteps.motion.lucaskanade import dense_lucaskanade
49+
from pysteps.verification import reldiag_init, reldiag_accum, plot_reldiag
50+
51+
# data source
52+
source = rcparams.data_sources["mch"]
53+
root = rcparams.data_sources["mch"]["root_path"]
54+
fmt = rcparams.data_sources["mch"]["path_fmt"]
55+
pattern = rcparams.data_sources["mch"]["fn_pattern"]
56+
ext = rcparams.data_sources["mch"]["fn_ext"]
57+
timestep = rcparams.data_sources["mch"]["timestep"]
58+
importer_name = rcparams.data_sources["mch"]["importer"]
59+
importer_kwargs = rcparams.data_sources["mch"]["importer_kwargs"]
60+
61+
# read precip field
62+
date = datetime.strptime("201607112100", "%Y%m%d%H%M")
63+
fns = io.find_by_date(date, root, fmt, pattern, ext, timestep, num_prev_files=2)
64+
importer = io.get_method(importer_name, "importer")
65+
precip, __, metadata = io.read_timeseries(fns, importer, **importer_kwargs)
66+
precip, metadata = utils.to_rainrate(precip, metadata)
67+
# precip[np.isnan(precip)] = 0
68+
69+
# motion
70+
motion = dense_lucaskanade(precip)
71+
72+
# parameters
73+
nleadtimes = 6
74+
thr = 1 # mm / h
75+
slope = 1 * timestep # km / min
76+
77+
# compute probability forecast
78+
extrap_kwargs = dict(allow_nonfinite_values=True)
79+
fct = forecast(
80+
precip[-1], motion, nleadtimes, thr, slope=slope, extrap_kwargs=extrap_kwargs
81+
)
82+
83+
# plot
84+
for n, frame in enumerate(fct):
85+
plt.subplot(2, 3, n + 1)
86+
plt.imshow(frame, interpolation="nearest", vmin=0, vmax=1)
87+
plt.xticks([])
88+
plt.yticks([])
89+
plt.show()
90+
91+
################################################################################
92+
# Let's plot one single leadtime in more detail.
93+
94+
plt.imshow(fct[2], interpolation="nearest", vmin=0, vmax=1)
95+
plt.xticks([])
96+
plt.yticks([])
97+
plt.colorbar(label="Exceedance probability, P(R>=1 mm/h)")
98+
plt.show()
99+
100+
###############################################################################
101+
# Verification
102+
# ------------
103+
104+
# verifying observations
105+
importer = io.get_method(importer_name, "importer")
106+
fns = io.find_by_date(
107+
date, root, fmt, pattern, ext, timestep, num_next_files=nleadtimes
108+
)
109+
obs, __, metadata = io.read_timeseries(fns, importer, **importer_kwargs)
110+
obs, metadata = utils.to_rainrate(obs, metadata)
111+
obs[np.isnan(obs)] = 0
112+
113+
# reliability diagram
114+
reldiag = reldiag_init(thr)
115+
reldiag_accum(reldiag, fct, obs[1:])
116+
fig, ax = plt.subplots()
117+
plot_reldiag(reldiag, ax)
118+
ax.set_title("Reliability diagram")
119+
plt.show()
120+
121+
122+
###############################################################################
123+
# References
124+
# ----------
125+
# Germann, U. and I. Zawadzki, 2004:
126+
# Scale Dependence of the Predictability of Precipitation from Continental
127+
# Radar Images. Part II: Probability Forecasts.
128+
# Journal of Applied Meteorology, 43(1), 74-89.
129+
130+
# sphinx_gallery_thumbnail_number = 3

pysteps/nowcasts/interface.py

Lines changed: 6 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -33,12 +33,15 @@
3333

3434
from pysteps.extrapolation.interface import eulerian_persistence
3535
from pysteps.nowcasts import anvil, sprog, steps, sseps, extrapolation
36+
from pysteps.nowcasts import lagrangian_probability
3637

3738
_nowcast_methods = dict()
3839
_nowcast_methods["anvil"] = anvil.forecast
3940
_nowcast_methods["eulerian"] = eulerian_persistence
4041
_nowcast_methods["extrapolation"] = extrapolation.forecast
4142
_nowcast_methods["lagrangian"] = extrapolation.forecast
43+
_nowcast_methods["probability"] = lagrangian_probability.forecast
44+
_nowcast_methods["lagrangian_probability"] = lagrangian_probability.forecast
4245
_nowcast_methods["sprog"] = sprog.forecast
4346
_nowcast_methods["sseps"] = sseps.forecast
4447
_nowcast_methods["steps"] = steps.forecast
@@ -65,6 +68,9 @@ def get_method(name):
6568
| lagrangian or | this approach extrapolates the last observation |
6669
| extrapolation | using the motion field (Lagrangian persistence) |
6770
+-----------------+-------------------------------------------------------+
71+
| lagrangian_\ | this approach computes local lagrangian probability |
72+
| probability | forecasts of threshold exceedences. |
73+
+-----------------+-------------------------------------------------------+
6874
| sprog | the S-PROG method described in :cite:`Seed2003` |
6975
+-----------------+-------------------------------------------------------+
7076
| steps | the STEPS stochastic nowcasting method described in |
@@ -74,9 +80,6 @@ def get_method(name):
7480
| sseps | short-space ensemble prediction system (SSEPS). |
7581
| | Essentially, this is a localization of STEPS. |
7682
+-----------------+-------------------------------------------------------+
77-
78-
steps and sseps produce stochastic nowcasts, and the other methods are
79-
deterministic.
8083
"""
8184
if isinstance(name, str):
8285
name = name.lower()
Lines changed: 131 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,131 @@
1+
# -*- coding: utf-8 -*-
2+
"""
3+
pysteps.nowcasts.lagrangian_probability
4+
======================================
5+
6+
Implementation of the local Lagrangian probability nowcasting technique
7+
described in :cite:`GZ2004`.
8+
9+
.. autosummary::
10+
:toctree: ../generated/
11+
12+
forecast
13+
"""
14+
15+
import numpy as np
16+
from scipy.signal import convolve
17+
18+
from pysteps.nowcasts import extrapolation
19+
20+
21+
def forecast(
22+
precip,
23+
velocity,
24+
timesteps,
25+
threshold,
26+
extrap_method="semilagrangian",
27+
extrap_kwargs=None,
28+
slope=5,
29+
):
30+
"""
31+
Generate a probability nowcast by a local lagrangian approach. The ouput is
32+
the probability of exceeding a given intensity threshold, i.e.
33+
P(precip>=threshold).
34+
35+
Parameters
36+
----------
37+
precip: array_like
38+
Two-dimensional array of shape (m,n) containing the input precipitation
39+
field.
40+
velocity: array_like
41+
Array of shape (2,m,n) containing the x- and y-components of the
42+
advection field. The velocities are assumed to represent one time step
43+
between the inputs.
44+
timesteps: int or list of floats
45+
Number of time steps to forecast or a sorted list of time steps for which
46+
the forecasts are computed (relative to the input time step).
47+
The number of time steps has to be a positive integer.
48+
The elements of the list are required to be in ascending order.
49+
threshold: float
50+
Intensity threshold for which the exceedance probabilities are computed.
51+
slope: float, optional
52+
The slope of the relationship between optimum scale and lead time in
53+
pixel / timesteps. Germann and Zawadzki (2004) found the optimal slope
54+
to be equal to 1 km / minute.
55+
56+
Returns
57+
-------
58+
out: ndarray
59+
Three-dimensional array of shape (num_timesteps, m, n) containing a time
60+
series of nowcast exceedence probabilities. The time series starts from
61+
t0 + timestep, where timestep is taken from the advection field velocity.
62+
63+
References
64+
----------
65+
Germann, U. and I. Zawadzki, 2004:
66+
Scale Dependence of the Predictability of Precipitation from Continental
67+
Radar Images. Part II: Probability Forecasts.
68+
Journal of Applied Meteorology, 43(1), 74-89.
69+
"""
70+
# Compute deterministic extrapolation forecast
71+
if isinstance(timesteps, int) and timesteps > 0:
72+
timesteps = np.arange(1, timesteps + 1)
73+
elif not isinstance(timesteps, list):
74+
raise ValueError(f"invalid value for argument 'timesteps': {timesteps}")
75+
precip_forecast = extrapolation.forecast(
76+
precip,
77+
velocity,
78+
timesteps,
79+
extrap_method,
80+
extrap_kwargs,
81+
)
82+
83+
# Ignore missing values
84+
nanmask = np.isnan(precip_forecast)
85+
precip_forecast[nanmask] = threshold - 1
86+
valid_pixels = (~nanmask).astype(float)
87+
88+
# Compute exceedance probabilities using a neighborhood approach
89+
precip_forecast = (precip_forecast >= threshold).astype(float)
90+
for i, timestep in enumerate(timesteps):
91+
scale = timestep * slope
92+
if scale == 0:
93+
continue
94+
kernel = _get_kernel(scale)
95+
kernel_sum = convolve(
96+
valid_pixels[i, ...],
97+
kernel,
98+
mode="same",
99+
)
100+
precip_forecast[i, ...] = convolve(
101+
precip_forecast[i, ...],
102+
kernel,
103+
mode="same",
104+
)
105+
precip_forecast[i, ...] /= kernel_sum
106+
precip_forecast = np.clip(precip_forecast, 0, 1)
107+
precip_forecast[nanmask] = np.nan
108+
return precip_forecast
109+
110+
111+
def _get_kernel(size):
112+
"""
113+
Generate a circular kernel.
114+
115+
Parameters
116+
----------
117+
size : int
118+
Size of the circular kernel (its diameter). For size < 5, the kernel is
119+
a square instead of a circle.
120+
121+
Returns
122+
-------
123+
2-D array with kernel values
124+
"""
125+
middle = max((int(size / 2), 1))
126+
if size < 5:
127+
return np.ones((size, size), dtype=np.float32)
128+
else:
129+
xx, yy = np.mgrid[:size, :size]
130+
circle = (xx - middle) ** 2 + (yy - middle) ** 2
131+
return np.asarray(circle <= (middle ** 2), dtype=np.float32)

0 commit comments

Comments
 (0)