Add machine learning vs pysteps guidelines (#160)

loforest · dnerini · aperezhortal · web-flow · commit f603161358c0 · 2020-07-16T22:05:16.000+02:00
* Add machine learning vs pysteps guidelines

* Update index

* Add machine learning references

* Add link to papers

* Format page number as string

* Update doc/source/user_guide/machine_learning_pysteps.rst

Co-authored-by: Daniele Nerini <daniele.nerini@meteoswiss.ch>

* Update doc/source/user_guide/machine_learning_pysteps.rst

Co-authored-by: Daniele Nerini <daniele.nerini@meteoswiss.ch>

* Update doc/source/user_guide/machine_learning_pysteps.rst

Co-authored-by: Daniele Nerini <daniele.nerini@meteoswiss.ch>

* Update doc/source/user_guide/machine_learning_pysteps.rst

Co-authored-by: Daniele Nerini <daniele.nerini@meteoswiss.ch>

* Update doc/source/user_guide/machine_learning_pysteps.rst

Co-authored-by: Daniele Nerini <daniele.nerini@meteoswiss.ch>

* Update doc/source/user_guide/machine_learning_pysteps.rst

Co-authored-by: Daniele Nerini <daniele.nerini@meteoswiss.ch>

* Update doc/source/user_guide/machine_learning_pysteps.rst

Co-authored-by: Daniele Nerini <daniele.nerini@meteoswiss.ch>

* update title

* Update doc/source/user_guide/machine_learning_pysteps.rst

Co-authored-by: Daniele Nerini <daniele.nerini@meteoswiss.ch>

* Update doc/source/user_guide/machine_learning_pysteps.rst

Co-authored-by: Daniele Nerini <daniele.nerini@meteoswiss.ch>

* Update machine_learning_pysteps.rst

* override wide tables in RDT theme

* add pysteps css

* comment out table text wrapping attempt

* add SRGAN reference

* Add precommit configuration

* Add references for pysteps's modules and functions

* add correspondence to NWP forecast types

Co-authored-by: Daniele Nerini <daniele.nerini@meteoswiss.ch>
Co-authored-by: Andres Perez Hortal <andresperezcba@gmail.com>
diff --git a/doc/source/conf.py b/doc/source/conf.py
@@ -145,6 +145,7 @@ def set_root():
 # relative to this directory. They are copied after the builtin static files,
 # so a file named "default.css" will overwrite the builtin "default.css".
 html_static_path = ["../_static"]
+html_css_files = ["../_static/pysteps.css"]
 
 # Custom sidebar templates, must be a dictionary that maps document names
 # to template names.
diff --git a/doc/source/references.bib b/doc/source/references.bib
@@ -52,6 +52,27 @@ @ARTICLE{EWWM2013
   DOI = "10.1002/met.1392"
 }
 
+@ARTICLE{FSNBG2019,
+  AUTHOR = "Foresti, L. and Sideris, I.V. and Nerini, D. and Beusch, L. and Germann, U.",
+  TITLE = "Using a 10-Year Radar Archive for Nowcasting Precipitation Growth and Decay: A Probabilistic Machine Learning Approach",
+  JOURNAL = "Weather and Forecasting",
+  VOLUME = 34,
+  PAGES = "1547--1569",
+  YEAR = 2019,
+  DOI = "10.1175/WAF-D-18-0206.1"
+}
+
+@ARTICLE{FNPC2020,
+  AUTHOR = "Franch, G. and Nerini, D. and Pendesini, M. and Coviello, L. and Jurman, G. and Furlanello, C.",
+  TITLE = "Precipitation Nowcasting with Orographic Enhanced Stacked Generalization: Improving Deep Learning Predictions on Extreme Events",
+  JOURNAL = "Atmosphere",
+  VOLUME = 11,
+  NUMBER = 3,
+  PAGES = "267",
+  YEAR = 2020,
+  DOI = "10.3390/atmos11030267"
+}
+
 @ARTICLE{GZ2002,
   AUTHOR = "U. Germann and I. Zawadzki",
   TITLE = "Scale-Dependence of the Predictability of Precipitation from Continental Radar Images. {P}art {I}: Description of the Methodology",
diff --git a/doc/source/user_guide/index.rst b/doc/source/user_guide/index.rst
@@ -17,4 +17,5 @@ the package, see the :ref:`pysteps-reference`.
     example_data
     set_pystepsrc
     ../auto_examples/index
+    machine_learning_pysteps
 
diff --git a/doc/source/user_guide/machine_learning_pysteps.rst b/doc/source/user_guide/machine_learning_pysteps.rst
@@ -0,0 +1,103 @@
+.. _machine_learning_pysteps:
+
+Benchmarking machine learning models with pysteps
+=================================================
+How do you correctly compare the accuracy of machine learning models against the traditional nowcasting methods available in pysteps?
+
+Before starting the comparison, ask yourself what the objective of the nowcast is:
+
+#. Do you only want to minimize prediction errors?
+#. Do you also want to represent the prediction uncertainty? 
+
+To achieve objective 1, it is sufficient to produce a single deterministic nowcast that filters out the unpredictable small-scale precipitation features.
+However, such a nowcast becomes increasingly smooth with lead time.
+
+To achieve objective 2, you need to produce a probabilistic or an ensemble nowcast (several ensemble members, also called realizations).
+
+In weather forecasting (and nowcasting) we usually want to achieve both goals because it is impossible to predict the evolution of a chaotic system with 100% accuracy, especially space-time precipitation fields and thunderstorms! 
+
+Machine learning and pysteps offer several methods to produce both deterministic and probabilistic nowcasts. 
+Therefore, if you want to compare machine learning-based nowcasts to simpler extrapolation-based models, you need to select the right method.
+
+1. Deterministic nowcasting
+--------------------------------------------
+
+Deterministic nowcasts can be divided into:
+
+a. Variance-preserving nowcasts, such as extrapolation nowcasts by Eulerian and Lagrangian persistence.
+b. Error-minimization nowcasts, such as machine learning, Fourier-filtered and ensemble mean nowcasts.
+
+**Very important**: these two types of deterministic nowcasts are not directly comparable because they have different variances!
+This is best explained by the decomposition of the mean squared error (MSE):
+
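+:math:`MSE = bias^2 + Var`, where :math:`bias` is the mean forecast error and :math:`Var` is the variance of the forecast errors.
+
+All deterministic machine learning algorithms that minimize the MSE (or a variation of it) will inevitably also reduce the variance of the nowcast fields.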
+This is a natural attempt to filter out the unpredictable precipitation features, which would otherwise increase the variance (and the MSE).
+The same principle holds for convolutional and/or deep neural network architectures, which also produce smooth nowcasts.
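+
+The link between the MSE and the smoothness of the nowcast can be made explicit with a standard expansion of the error variance. The notation is introduced here for illustration only and is not part of the original guidelines: :math:`F` is the nowcast, :math:`O` the observation, :math:`\sigma_F` and :math:`\sigma_O` their standard deviations, and :math:`\rho` their correlation.
+
+.. math::
+
+   MSE = \left(\overline{F} - \overline{O}\right)^2 + \sigma_F^2 + \sigma_O^2 - 2 \rho \sigma_F \sigma_O
+
+For a fixed correlation :math:`0 < \rho < 1`, the last three terms are smallest when :math:`\sigma_F = \rho \sigma_O`, that is, when the nowcast is smoother than the observations. They then sum to :math:`\sigma_O^2 (1 - \rho^2)`, whereas a variance-preserving nowcast with :math:`\sigma_F = \sigma_O` gives :math:`2 \sigma_O^2 (1 - \rho)`, which is never smaller.
+For example, with :math:`\rho = 0.5` the smooth nowcast reaches :math:`0.75 \sigma_O^2` while the variance-preserving one reaches :math:`\sigma_O^2`, even though both are equally well correlated with the observations.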
+
+Therefore, it is better to avoid directly comparing an error-minimization machine learning nowcast to a variance-preserving radar extrapolation, as produced by the module :py:mod:`pysteps.nowcasts.extrapolation`. Instead, you should use the pysteps ensemble mean.
+
+A deterministic equivalent of the ensemble mean can be approximated using the modules :py:mod:`pysteps.nowcasts.sprog` or :py:mod:`pysteps.nowcasts.anvil`.
+Another possibility, but more computationally demanding, is to average many ensemble members generated by the modules :py:mod:`pysteps.nowcasts.steps` or :py:mod:`pysteps.nowcasts.sseps`.
+
+Still, even when using the pysteps ensemble mean, there is no guarantee that its variance matches that of the machine learning predictions.
+Possible solutions are to:
+
+#. use a normalized MSE (NMSE) or another score accounting for differences in the variance between prediction and observation.
+#. decompose the field with a Fourier (or wavelet) transform to compare features at the same spatial scales.
+
+A good deterministic comparison between a deep convolutional neural network nowcast and pysteps is given in :cite:`FNPC2020`.
+
+2. Probabilistic nowcasting
+--------------------------------------------
+
+Probabilistic machine learning regression methods can be roughly categorized into:
+
+a. Quantile-based methods, such as quantile regression, quantile random forests and quantile neural networks.
+b. Ensemble-based methods, such as generative adversarial networks (GANs) and variational auto-encoders (VAEs).
+
+Quantile-based machine learning nowcasts are interesting, but can only estimate the probability of exceedance at a given point (see e.g. :cite:`FSNBG2019`).
+
+To estimate areal exceedance probabilities, for example above catchments, or to propagate the nowcast uncertainty into hydrological models, the full ensemble still needs to be generated, e.g. with generative machine learning models.
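+
+As a sketch of how both kinds of probability follow from an ensemble (the symbols are chosen here for illustration and do not come from pysteps): let :math:`x_i(s)` be the value of member :math:`i` at grid point :math:`s`, :math:`N` the number of members, :math:`t` a threshold and :math:`A` the set of grid points in a catchment. The point exceedance probability can be estimated as
+
+.. math::
+
+   P\left[x(s) > t\right] \approx \frac{1}{N} \sum_{i=1}^{N} \mathbf{1}\left[x_i(s) > t\right]
+
+and the areal exceedance probability of the catchment mean as
+
+.. math::
+
+   P\left[\overline{x}_A > t\right] \approx \frac{1}{N} \sum_{i=1}^{N} \mathbf{1}\left[\frac{1}{|A|} \sum_{s \in A} x_i(s) > t\right]
+
+The second estimate depends on how the values inside :math:`A` co-vary within each member, which is why it cannot be recovered from pointwise quantiles alone.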
+
+Nowcasts from generative machine learning models are similar to pysteps ensemble members. Both are designed to produce an ensemble of possible realizations that preserve the variance of observed radar fields.
+
+A proper probabilistic verification of generative machine learning models against pysteps would be an interesting research direction.
+
+Summary
+-------
+The table below is an attempt to classify machine learning and pysteps nowcasting methods according to the four main prediction types:
+
+#. Deterministic (variance-preserving), like one control NWP forecast
+#. Deterministic (error-minimization), like an ensemble mean NWP forecast
+#. Probabilistic (quantile-based), like a probabilistic NWP forecast (without members)
+#. Probabilistic (ensemble-based), like the members of an ensemble NWP forecast
+
+Methods of different types should be compared with care and only with good reason.
+
+.. list-table::
+   :widths: 30 20 20 20
+   :header-rows: 1
+
+   * - Nowcast type
+     - Machine learning
+     - Pysteps
+     - Verification
+   * - Deterministic (variance-preserving)
+     - SRGAN (Wang et al., 2018), Others?
+     - :py:mod:`pysteps.nowcasts.extrapolation` (any optical flow method)
+     - MSE, RMSE, MAE, ETS, etc.
+   * - Deterministic (error-minimization)
+     - Classical ANNs, (deep) CNNs, random forests, AdaBoost, etc.
+     - :py:mod:`pysteps.nowcasts.sprog`, :py:mod:`pysteps.nowcasts.anvil` or ensemble mean of :py:mod:`pysteps.nowcasts.steps`/:py:mod:`~pysteps.nowcasts.sseps`
+     - MSE, RMSE, MAE, ETS, etc., or preferably normalized scores
+   * - Probabilistic (quantile-based)
+     - Quantile ANN, quantile random forests, quantile regression
+     - Probabilities derived from :py:mod:`pysteps.nowcasts.steps`/:py:mod:`~pysteps.nowcasts.sseps`
+     - Reliability diagram (predicted vs observed quantile), probability integral transform (PIT) histogram
+   * - Probabilistic (ensemble-based)
+     - GANs, VAEs, etc
+     - Ensemble and probabilities derived from :py:mod:`pysteps.nowcasts.steps`/:py:mod:`~pysteps.nowcasts.sseps`
+     - Probabilistic verification: reliability diagrams, continuous ranked probability scores (CRPS), etc. 
+       Ensemble verification: rank histograms, spread-error relationships, etc