Skip to content

Commit 4130ed7

Browse files
dcheriankeewispre-commit-ci[bot]
authored
Better forecast example (#9)
Co-authored-by: Justus Magin <keewis@users.noreply.github.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
1 parent 735f80c commit 4130ed7

File tree

1 file changed

+144
-1
lines changed

1 file changed

+144
-1
lines changed

docs/earth/forecast.md

Lines changed: 144 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -7,4 +7,147 @@ kernelspec:
77
name: python
88
---
99

10-
# Weather Forecasts
10+
# rolodex: Weather Forecasts
11+
12+
One way to model forecast output is a datacube with four dimensions :
13+
14+
- 2 spatial dimensions `x` and `y`,
15+
- a forecast reference time (or initialization time) commonly `time`[datetime64], and
16+
- a forecast period or "timesteps in to the future" commonly `step`[timedelta64]
17+
18+
There is one ancillary variable: 'valid time' which is the time for which the forecast is made: `valid_time = time + step`
19+
20+
```{note}
21+
Note that not all forecast model runs are run for the same length of time! We could model these missing forecasts as NaN values in the output.
22+
```
23+
24+
There are many ways one might index weather forecast output.
25+
These different ways of constructing views of a forecast data are called "Forecast Model Run Collections" (FMRC).
26+
For reference, see [this classic image](https://www.unidata.ucar.edu/presentations/caron/FmrcPoster.pdf) where the
27+
y-axis is the 'valid time', and the x-axis is the 'forecast reference or initialization time':
28+
![FMRC indexing schematic](./fmrc.png)
29+
30+
- "Model Run" : a single model run.
31+
- "Constant Offset" : all values for a given lead time.
32+
- "Constant Forecast" : all forecasts for a given time in the future.
33+
- "Best Estimate" : A best guess stitching together the analysis or initialization fields for past forecasts with the latest forecast.
34+
35+
## Highlights
36+
37+
All of these patterns are "vectorized indexing", though generating the needed indexers is complicated by missing data.
38+
39+
`ForecastIndex` encapsulates this logic and,
40+
41+
1. Demonstrates how fairly complex indexing patterns can be abstracted away with a custom Index, and
42+
1. Again illustrates the value of a custom Index in persisting "state" or extra metadata (here the type of model used).
43+
44+
## Example
45+
46+
Read an example dataset -- downward short-wave radiation flux from the NOAA's HRRR forecast system ([High Resolution Rapid Refresh](https://rapidrefresh.noaa.gov/hrrr/)).
47+
48+
```{code-cell}
49+
import rolodex.forecast
50+
import xarray as xr
51+
52+
xr.set_options(display_expand_indexes=True, display_expand_data=False)
53+
54+
cube = xr.tutorial.open_dataset("hrrr-cube")
55+
cube
56+
```
57+
58+
### Assigning
59+
60+
To assign we add a new variable: `valid_time = time + step`, "initialization time" + "time steps in to the future".
61+
62+
```{code-cell}
63+
from rolodex.forecast import Model, ForecastIndex
64+
65+
cube.coords["valid_time"] = rolodex.forecast.create_lazy_valid_time_variable(
66+
reference_time=cube.time, period=cube.step
67+
)
68+
```
69+
70+
With the `valid_time` array, we can illustrate the complex structure of this datacube.
71+
72+
- green box illustrates the bounding box of valid forecast data, any white cells within this box are NaN --- no forecast was generated for that forecast time with that initialization time.
73+
- the horizontal black line "2025-01-02 21:00" has 4 valid forecasts, though the underlying data has 23 data points (mostly NaN).
74+
75+
```{code-cell}
76+
---
77+
tags: [hide-input]
78+
---
79+
import numpy as np
80+
import matplotlib.pyplot as plt
81+
time = "2025-01-02T21:00"
82+
83+
(
84+
cube.dswrf
85+
.isel(x=10, y=20)
86+
.plot(
87+
x="time",
88+
y="valid_time",
89+
cmap="plasma",
90+
robust=True,
91+
# edgecolors="#fefefe",
92+
# linewidths=0.003,
93+
aspect=2,
94+
size=5,
95+
)
96+
)
97+
plt.axhline(np.datetime64(time), color="k")
98+
plt.plot(
99+
cube.time.data[[0, -1, -1, 0, 0]],
100+
cube.valid_time.data[
101+
[0, -1, -1, 0, 0],
102+
[0, 0, -1, -1, 0],
103+
],
104+
color="#32CD32",
105+
lw=2,
106+
)
107+
```
108+
109+
We will index out all forecasts for 2025-01-02 21:00, notice that there are 4 valid forecasts.
110+
111+
And then assign `rolodex.ForecastIndex` to `time`, `step`, `valid_time`. We pass a custom kwarg `model` to indicate the type of forecast model used, here "HRRR".
112+
113+
```{code-cell}
114+
cube = (
115+
cube.drop_indexes(["time", "step"])
116+
.set_xindex(
117+
["time", "step", "valid_time"], ForecastIndex, model=Model.HRRR
118+
)
119+
)
120+
cube
121+
```
122+
123+
### Indexing
124+
125+
The usual indexing patterns along `time` and `step` individually still work.
126+
127+
```{code-cell}
128+
cube.sel(time=slice("2025-01-01 03:00", None), step=slice("2h", "12h"))
129+
```
130+
131+
### FMRC Indexing
132+
133+
`rolodex` provides dataclasses to indicate the type of indexing requested. We illustrate one of these: `ConstantForecast` which means
134+
'grab all forecasts available for this time.'
135+
136+
```{code-cell}
137+
from rolodex.forecast import ConstantForecast
138+
```
139+
140+
These can be used with `sel` to index along `valid_time`
141+
142+
```{code-cell}
143+
subset = cube.sel(valid_time=ConstantForecast("2025-01-02T21:00"))
144+
subset
145+
```
146+
147+
On indeding, we receive 4 forecasts for this time (as expected)!
148+
149+
```{code-cell}
150+
subset.dswrf.plot(col="time", cmap="plasma");
151+
```
152+
153+
Note no NaNs or missing panels in the above output.

0 commit comments

Comments
 (0)