
Commit f603161

loforest, dnerini and aperezhortal authored
Add machine learning vs pysteps guidelines (#160)
* Add machine learning vs pysteps guidelines
* Update index
* Add machine learning references
* Add link to papers
* Format page number as string
* Update doc/source/user_guide/machine_learning_pysteps.rst
  Co-authored-by: Daniele Nerini <daniele.nerini@meteoswiss.ch>
* Update doc/source/user_guide/machine_learning_pysteps.rst
  Co-authored-by: Daniele Nerini <daniele.nerini@meteoswiss.ch>
* Update doc/source/user_guide/machine_learning_pysteps.rst
  Co-authored-by: Daniele Nerini <daniele.nerini@meteoswiss.ch>
* Update doc/source/user_guide/machine_learning_pysteps.rst
  Co-authored-by: Daniele Nerini <daniele.nerini@meteoswiss.ch>
* Update doc/source/user_guide/machine_learning_pysteps.rst
  Co-authored-by: Daniele Nerini <daniele.nerini@meteoswiss.ch>
* Update doc/source/user_guide/machine_learning_pysteps.rst
  Co-authored-by: Daniele Nerini <daniele.nerini@meteoswiss.ch>
* Update doc/source/user_guide/machine_learning_pysteps.rst
  Co-authored-by: Daniele Nerini <daniele.nerini@meteoswiss.ch>
* update title
* Update doc/source/user_guide/machine_learning_pysteps.rst
  Co-authored-by: Daniele Nerini <daniele.nerini@meteoswiss.ch>
* Update doc/source/user_guide/machine_learning_pysteps.rst
  Co-authored-by: Daniele Nerini <daniele.nerini@meteoswiss.ch>
* Update machine_learning_pysteps.rst
* override wide tables in RDT theme
* add pysteps css
* comment out table text wrapping attempt
* add SRGAN reference
* Add precommit configuration
* Add references for pysteps's modules and functions
* add correspondence to NWP forecast types

Co-authored-by: Daniele Nerini <daniele.nerini@meteoswiss.ch>
Co-authored-by: Andres Perez Hortal <andresperezcba@gmail.com>
1 parent 275ebcb commit f603161

4 files changed: 126 additions and 0 deletions

doc/source/conf.py

Lines changed: 1 addition & 0 deletions
@@ -145,6 +145,7 @@ def set_root():
145145
# relative to this directory. They are copied after the builtin static files,
146146
# so a file named "default.css" will overwrite the builtin "default.css".
147147
html_static_path = ["../_static"]
148+
html_css_files = ["../_static/pysteps.css"]
148149

149150
# Custom sidebar templates, must be a dictionary that maps document names
150151
# to template names.

doc/source/references.bib

Lines changed: 21 additions & 0 deletions
@@ -52,6 +52,27 @@ @ARTICLE{EWWM2013
5252
DOI = "10.1002/met.1392"
5353
}
5454

55+
@ARTICLE{FSNBG2019,
56+
AUTHOR = "Foresti, L. and Sideris, I.V. and Nerini, D. and Beusch, L. and Germann, U.",
57+
TITLE = "Using a 10-Year Radar Archive for Nowcasting Precipitation Growth and Decay: A Probabilistic Machine Learning Approach",
58+
JOURNAL = "Weather and Forecasting",
59+
VOLUME = 34,
60+
PAGES = "1547--1569",
61+
YEAR = 2019,
62+
DOI = "10.1175/WAF-D-18-0206.1"
63+
}
64+
65+
@ARTICLE{FNPC2020,
66+
AUTHOR = "Franch, G. and Nerini, D. and Pendesini, M. and Coviello, L. and Jurman, G. and Furlanello, C.",
67+
TITLE = "Precipitation Nowcasting with Orographic Enhanced Stacked Generalization: Improving Deep Learning Predictions on Extreme Events",
68+
JOURNAL = "Atmosphere",
69+
VOLUME = 11,
70+
NUMBER = 3,
71+
PAGES = "267",
72+
YEAR = 2020,
73+
DOI = "10.3390/atmos11030267"
74+
}
75+
5576
@ARTICLE{GZ2002,
5677
AUTHOR = "U. Germann and I. Zawadzki",
5778
TITLE = "Scale-Dependence of the Predictability of Precipitation from Continental Radar Images. {P}art {I}: Description of the Methodology",

doc/source/user_guide/index.rst

Lines changed: 1 addition & 0 deletions
@@ -17,4 +17,5 @@ the package, see the :ref:`pysteps-reference`.
1717
example_data
1818
set_pystepsrc
1919
../auto_examples/index
20+
machine_learning_pysteps
2021

doc/source/user_guide/machine_learning_pysteps.rst

Lines changed: 103 additions & 0 deletions
@@ -0,0 +1,103 @@
1+
.. _machine_learning_pysteps:
2+
3+
Benchmarking machine learning models with pysteps
4+
=================================================
5+
How can the accuracy of machine learning models be compared correctly against the traditional nowcasting methods available in pysteps?
6+
7+
Before starting the comparison, ask yourself what the objective of nowcasting is:
8+
9+
#. Do you only want to minimize prediction errors?
10+
#. Do you also want to represent the prediction uncertainty?
11+
12+
To achieve objective 1, it is sufficient to produce a single deterministic nowcast that filters out the unpredictable small-scale precipitation features.
13+
However, the resulting nowcast becomes increasingly smooth with lead time.
14+
15+
To achieve objective 2, you need to produce a probabilistic or an ensemble nowcast (several ensemble members, also called realizations).
16+
17+
In weather forecasting (and nowcasting) we usually want to achieve both goals, because it is impossible to predict the evolution of a chaotic system, such as space-time precipitation fields and thunderstorms, with 100% accuracy.
18+
19+
Machine learning and pysteps offer several methods to produce both deterministic and probabilistic nowcasts.
20+
Therefore, if you want to compare machine learning-based nowcasts to simpler extrapolation-based models, you need to select the pysteps method that matches the type of nowcast being compared.
21+
22+
1. Deterministic nowcasting
23+
--------------------------------------------
24+
25+
Deterministic nowcasts can be divided into:
26+
27+
a. Variance-preserving nowcasts, such as extrapolation nowcasts by Eulerian and Lagrangian persistence.
28+
b. Error-minimization nowcasts, such as machine learning, Fourier-filtered and ensemble mean nowcasts.
29+
30+
**Very important**: these two types of deterministic nowcasts are not directly comparable because they have a different variance!
31+
This is best explained by the decomposition of the mean squared error (MSE):
32+
33+
:math:`\mathrm{MSE} = \mathrm{bias}^2 + \mathrm{Var}`
34+
35+
All deterministic machine learning algorithms that minimize the MSE (or a variation of it) will inevitably also reduce the variance of the nowcast fields.
36+
This is a natural attempt to filter out the unpredictable precipitation features, which would otherwise increase the variance (and the MSE).
37+
The same principle holds for convolutional and/or deep neural network architectures, which also produce smooth nowcasts.
38+
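The link between MSE minimization and smoothing can be made explicit with a standard identity for the MSE between a forecast field and an observed field. This is a complementary form to the bias-variance expression above, not a replacement for it. The symbols are assumptions introduced here: :math:`\bar{f}` and :math:`\bar{o}` are the forecast and observed means, :math:`\sigma_f` and :math:`\sigma_o` their standard deviations, and :math:`\rho` their correlation.

```latex
\mathrm{MSE} = (\bar{f} - \bar{o})^2 + \sigma_f^2 + \sigma_o^2 - 2 \rho \, \sigma_f \sigma_o

% For fixed bias, \sigma_o and \rho, minimize with respect to \sigma_f:
\frac{\partial \, \mathrm{MSE}}{\partial \sigma_f} = 2 \sigma_f - 2 \rho \, \sigma_o = 0
\quad \Longrightarrow \quad
\sigma_f = \rho \, \sigma_o

% Resulting minimum:
\mathrm{MSE}_{\min} = (\bar{f} - \bar{o})^2 + \sigma_o^2 \, (1 - \rho^2)
```

Since :math:`\rho < 1` for any imperfect nowcast, and :math:`\rho` decreases with lead time, the MSE-optimal forecast has less variance than the observations and becomes smoother as the lead time grows.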
39+
Therefore, it is better to avoid directly comparing an error-minimization machine learning nowcast to a variance-preserving radar extrapolation, as produced by the module :py:mod:`pysteps.nowcasts.extrapolation`. Instead, you should use the pysteps ensemble mean.
40+
41+
A deterministic equivalent of the ensemble mean can be approximated using the modules :py:mod:`pysteps.nowcasts.sprog` or :py:mod:`pysteps.nowcasts.anvil`.
42+
Another possibility, but more computationally demanding, is to average many ensemble members generated by the modules :py:mod:`pysteps.nowcasts.steps` or :py:mod:`pysteps.nowcasts.sseps`.
43+
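The variance reduction obtained by averaging ensemble members can be illustrated without pysteps. The sketch below is a toy example under stated assumptions: the one-dimensional "fields", the Gaussian noise and every variable name are hypothetical and merely stand in for members such as those produced by :py:mod:`pysteps.nowcasts.steps`.

```python
import random
import statistics

rng = random.Random(0)
n_pixels, n_members = 500, 20

# Predictable part shared by all members (hypothetical 1-D "field").
signal = [rng.gauss(0.0, 1.0) for _ in range(n_pixels)]

# Each member adds its own realization of the unpredictable part.
members = [[s + rng.gauss(0.0, 1.0) for s in signal] for _ in range(n_members)]

# Ensemble mean: average over the members at every pixel.
ensemble_mean = [sum(column) / n_members for column in zip(*members)]

# Spatial variance of each member and of the ensemble mean.
member_variances = [statistics.pvariance(m) for m in members]
mean_variance = statistics.pvariance(ensemble_mean)

print(f"smallest member variance: {min(member_variances):.2f}")
print(f"ensemble mean variance:   {mean_variance:.2f}")
```

Averaging cancels most of the member-specific noise, so the ensemble mean has a lower variance than any single member. This is why it is the fairer counterpart to an error-minimization machine learning nowcast.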
44+
Still, even when using the pysteps ensemble mean, there is no guarantee that its variance will match that of the machine learning predictions.
45+
Possible solutions are to:
46+
47+
#. use a normalized MSE (NMSE) or another score accounting for differences in the variance between prediction and observation.
48+
#. decompose the field with a Fourier (or wavelet) transform to compare features at the same spatial scales.
49+
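As a minimal sketch of the first option, the example below standardizes both fields to zero mean and unit variance before computing the MSE. This is only one possible normalization, chosen here for illustration. It is not a definition taken from pysteps or from the cited papers, and the function names are hypothetical.

```python
import math
import random


def standardize(values):
    """Rescale a field to zero mean and unit variance."""
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    if std == 0.0:
        raise ValueError("cannot standardize a constant field")
    return [(v - mean) / std for v in values]


def mse(forecast, observation):
    """Plain mean squared error between two equally sized fields."""
    return sum((f - o) ** 2 for f, o in zip(forecast, observation)) / len(observation)


def variance_normalized_mse(forecast, observation):
    """MSE computed after standardizing both fields (assumed normalization)."""
    return mse(standardize(forecast), standardize(observation))


rng = random.Random(42)
observation = [rng.gauss(0.0, 1.0) for _ in range(1000)]
sharp = [o + rng.gauss(0.0, 0.5) for o in observation]
damped = [0.3 * s for s in sharp]  # same spatial pattern, lower variance

raw_sharp, raw_damped = mse(sharp, observation), mse(damped, observation)
norm_sharp = variance_normalized_mse(sharp, observation)
norm_damped = variance_normalized_mse(damped, observation)
```

With this choice the two forecasts, which differ only in amplitude, receive the same normalized score, whereas their raw MSE values differ.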
50+
A good deterministic comparison between a deep convolutional neural network nowcast and pysteps is given in :cite:`FNPC2020`.
51+
52+
2. Probabilistic nowcasting
53+
--------------------------------------------
54+
55+
Probabilistic machine learning regression methods can be roughly categorized into:
56+
57+
a. Quantile-based methods, such as quantile regression, quantile random forests and quantile neural networks.
58+
b. Ensemble-based methods, such as generative adversarial networks (GANs) and variational auto-encoders (VAEs).
59+
60+
Quantile-based machine learning nowcasts are interesting, but can only estimate the probability of exceedance at a given point (see e.g. :cite:`FSNBG2019`).
61+
62+
To estimate areal exceedance probabilities, for example above catchments, or to propagate the nowcast uncertainty into hydrological models, the full ensemble still needs to be generated, e.g. with generative machine learning models.
63+
64+
Nowcasts from generative machine learning methods are similar to pysteps ensemble members. Both are designed to produce an ensemble of possible realizations that preserve the variance of observed radar fields.
65+
66+
A proper probabilistic verification of generative machine learning models against pysteps would be an interesting research direction.
67+
68+
Summary
69+
-------
70+
The table below is an attempt to classify machine learning and pysteps nowcasting methods according to the four main prediction types:
71+
72+
#. Deterministic (variance-preserving), like one control NWP forecast
73+
#. Deterministic (error-minimization), like an ensemble mean NWP forecast
74+
#. Probabilistic (quantile-based), like a probabilistic NWP forecast (without members)
75+
#. Probabilistic (ensemble-based), like the members of an ensemble NWP forecast
76+
77+
Methods of different types should only be compared with care and for good reasons.
78+
79+
.. list-table::
80+
:widths: 30 20 20 20
81+
:header-rows: 1
82+
83+
* - Nowcast type
84+
- Machine learning
85+
- Pysteps
86+
- Verification
87+
* - Deterministic (variance-preserving)
88+
- SRGAN (Wang et al., 2018), Others?
89+
- :py:mod:`pysteps.nowcasts.extrapolation` (any optical flow method)
90+
- MSE, RMSE, MAE, ETS, etc.
91+
* - Deterministic (error-minimization)
92+
- Classical ANNs, (deep) CNNs, random forests, AdaBoost, etc
93+
- :py:mod:`pysteps.nowcasts.sprog`, :py:mod:`pysteps.nowcasts.anvil` or ensemble mean of :py:mod:`pysteps.nowcasts.steps`/:py:mod:`~pysteps.nowcasts.sseps`
94+
- MSE, RMSE, MAE, ETS, etc., or preferably normalized scores
95+
* - Probabilistic (quantile-based)
96+
- Quantile ANN, quantile random forests, quantile regression
97+
- Probabilities derived from :py:mod:`pysteps.nowcasts.steps`/:py:mod:`~pysteps.nowcasts.sseps`
98+
- Reliability diagram (predicted vs observed quantile), probability integral transform (PIT) histogram
99+
* - Probabilistic (ensemble-based)
100+
- GANs, VAEs, etc
101+
- Ensemble and probabilities derived from :py:mod:`pysteps.nowcasts.steps`/:py:mod:`~pysteps.nowcasts.sseps`
102+
- Probabilistic verification: reliability diagrams, continuous ranked probability scores (CRPS), etc.
103+
Ensemble verification: rank histograms, spread-error relationships, etc
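Two of the ensemble scores named in the last table row can be sketched in a few lines of plain Python. This is an illustrative implementation under stated assumptions, not the code in :py:mod:`pysteps.verification`. The function names and sample values are hypothetical, and ties between members and the observation are handled naively.

```python
def crps_ensemble(members, observation):
    """Sample-based CRPS: mean |x_i - y| - 0.5 * mean |x_i - x_j|."""
    n = len(members)
    accuracy = sum(abs(x - observation) for x in members) / n
    spread = sum(abs(xi - xj) for xi in members for xj in members) / (n * n)
    return accuracy - 0.5 * spread


def observation_rank(members, observation):
    """Number of members strictly below the observation (0 .. n_members)."""
    return sum(x < observation for x in members)


def rank_histogram(ensembles, observations):
    """Count how often the observation falls at each rank."""
    counts = [0] * (len(ensembles[0]) + 1)
    for members, obs in zip(ensembles, observations):
        counts[observation_rank(members, obs)] += 1
    return counts


# Hypothetical forecast/observation pairs at three locations.
ensembles = [[1.0, 2.0, 3.0], [0.0, 0.5, 4.0], [2.0, 2.5, 3.5]]
observations = [2.5, 5.0, 1.0]

scores = [crps_ensemble(m, o) for m, o in zip(ensembles, observations)]
histogram = rank_histogram(ensembles, observations)
```

For a single-member ensemble this CRPS estimate reduces to the absolute error, which makes it a natural bridge between deterministic and ensemble verification. A flat rank histogram over many cases indicates a well-calibrated ensemble spread.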
