Commit 4af9372

Add pandas interval index
1 parent 54877e7 commit 4af9372

3 files changed, +97 -6 lines changed

docs/builtin/pandas.md

Lines changed: 0 additions & 5 deletions
This file was deleted.

docs/builtin/pdinterval.md

Lines changed: 96 additions & 0 deletions
@@ -0,0 +1,96 @@
1+
---
2+
jupytext:
3+
text_representation:
4+
format_name: myst
5+
kernelspec:
6+
display_name: Python 3
7+
name: python
8+
---
9+
10+
# pandas: IntervalIndex
11+
12+
````{grid}
13+
```{grid-item}
14+
:columns: 3
15+
```{image} https://pandas.pydata.org/docs/_static/pandas.svg
16+
---
17+
alt: pandas logo
18+
width: 200px
19+
align: center
20+
---
21+
```
22+
```{grid-item}
23+
:columns: 9
24+
```{seealso}
25+
Learn more in the [pandas](https://pandas.pydata.org/pandas-docs/stable/user_guide/advanced.html#intervalindex) documentation.
26+
```
27+
````
28+
29+
## Highlights
30+
31+
1. Xarray's built-in support for pandas Index classes extends to more sophisticated classes like {py:class}`pandas.IntervalIndex`.
32+
1. Xarray now generates such indexes automatically when using {py:meth}`xarray.DataArray.groupby_bins` or {py:meth}`xarray.Dataset.groupby_bins`.
33+
1. Unfortunately, {py:class}`pandas.IntervalIndex` supports NumPy datetimes but not cftime datetimes.
34+
35+
## Example
36+
37+
### Assigning
38+
39+
```{code-cell}
40+
%xmode minimal
41+
42+
import pandas as pd
43+
import xarray as xr
44+
45+
xr.set_options(display_expand_indexes=True, display_expand_attrs=False)
46+
pd.set_option('display.max_seq_items', 10)
47+
48+
orig = xr.tutorial.open_dataset("air_temperature")
49+
orig
50+
```
51+
52+
Let's replace the `time` coordinate with an IntervalIndex, assuming that the data represent averages over 6-hour periods centered at 00h, 06h, 12h, and 18h.
53+
54+
```{code-cell}
55+
left = orig.time.data - pd.Timedelta("3h")
56+
right = orig.time.data + pd.Timedelta("3h")
57+
time_bounds = pd.IntervalIndex.from_arrays(left, right, closed="left")
58+
time_bounds
59+
```
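
To see what `closed="left"` means for the bounds built above, here is a minimal pandas-only sketch. The four timestamps in `centers` are made up for illustration and merely stand in for the dataset's 6-hourly `time` values.

```python
import pandas as pd

# Hypothetical 6-hourly timestamps standing in for `orig.time` (illustration only)
centers = pd.date_range("2013-05-01", periods=4, freq=pd.Timedelta(hours=6))

# Each interval spans 3 hours on either side of its center
left = centers - pd.Timedelta(hours=3)
right = centers + pd.Timedelta(hours=3)
bounds = pd.IntervalIndex.from_arrays(left, right, closed="left")

first = bounds[0]
# closed="left": the left edge belongs to the interval, the right edge does not
left_edge_included = pd.Timestamp("2013-04-30 21:00") in first
right_edge_included = pd.Timestamp("2013-05-01 03:00") in first

# Half-open intervals share edges without overlapping
tiles_cleanly = bounds.is_non_overlapping_monotonic

print(first, left_edge_included, right_edge_included, tiles_cleanly)
```

Because the intervals are half-open, every instant belongs to exactly one interval, which keeps point lookups unambiguous.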
60+
61+
```{code-cell}
62+
indexed = orig.copy(deep=True)
63+
indexed["time"] = time_bounds
64+
indexed
65+
```
66+
67+
### Indexing
68+
69+
Let's select a representative value for 2013-05-01 02:00. The original `time` index holds only exact timestamps, so an exact-match lookup fails.
70+
71+
```{code-cell}
72+
---
73+
tags: [raises-exception]
74+
---
75+
orig.sel(time="2013-05-01 02:00")
76+
```
77+
78+
Indexing the original dataset requires specifying `method="nearest"`:
79+
80+
```{code-cell}
81+
orig.sel(time="2013-05-01 02:00", method="nearest").time
82+
```
83+
84+
With an IntervalIndex, however, that is unnecessary, because the requested label falls inside one of the intervals:
85+
86+
```{code-cell}
87+
indexed.sel(time="2013-05-01 02:00").time
88+
```
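
The difference between the two lookups can be reproduced at the pandas level. The sketch below uses made-up timestamps rather than the tutorial dataset, and it assumes that Xarray's `.sel` ends up relying on this kind of pandas index lookup; treat it as an illustration, not a description of Xarray internals.

```python
import pandas as pd

# Hypothetical 6-hourly timestamps (illustration only)
centers = pd.date_range("2013-05-01", periods=4, freq=pd.Timedelta(hours=6))
half_width = pd.Timedelta(hours=3)
bounds = pd.IntervalIndex.from_arrays(
    centers - half_width, centers + half_width, closed="left"
)

query = pd.Timestamp("2013-05-01 02:00")

# A plain DatetimeIndex only matches exact labels, so 02:00 is not found
try:
    centers.get_loc(query)
    exact_match_found = True
except KeyError:
    exact_match_found = False

# An IntervalIndex returns the position of the interval containing the label
position = bounds.get_loc(query)

print(exact_match_found, position, bounds[position])
```

Here 02:00 falls inside the interval centered at 00:00, so the lookup resolves to position 0 without any `method` argument.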
89+
90+
### Binned grouping
91+
92+
Xarray now creates an IntervalIndex by default for binned grouping operations:
93+
94+
```{code-cell}
95+
orig.groupby_bins("lat", bins=[25, 35, 45, 55]).mean()
96+
```
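
For intuition about the resulting bins, the following sketch applies `pandas.cut` to a few made-up latitude values using the same bin edges. It assumes, as the Xarray documentation for `groupby_bins` indicates, that the binning is based on `pandas.cut`; the sample latitudes are hypothetical.

```python
import pandas as pd

# Hypothetical latitude values (illustration only)
lats = [27.5, 30.0, 35.0, 40.0, 52.5]

# Same bin edges as in the groupby_bins call above
binned = pd.cut(lats, bins=[25, 35, 45, 55])

# The bin labels are stored as an IntervalIndex, right-closed by default
categories = binned.categories
print(type(categories).__name__, categories.closed)

# A value sitting exactly on an edge goes to the bin it closes: 35.0 is in (25, 35]
print(binned[2])
```

With the default `right=True`, each bin includes its right edge and excludes its left edge, so a latitude of exactly 25 would not fall into any of these bins.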

docs/index.md

Lines changed: 1 addition & 1 deletion
@@ -10,7 +10,7 @@ caption: Built-in
1010
hidden:
1111
---
1212
builtin/range
13-
builtin/pandas
13+
builtin/pdinterval
1414
```
1515

1616
```{toctree}
