Add FlagGrouper #556

Merged 7 commits on Mar 6, 2025
Changes from 3 commits
2 changes: 1 addition & 1 deletion cf_xarray/__init__.py
@@ -9,6 +9,6 @@
 from .options import set_options  # noqa
 from .utils import _get_version

-from . import geometry  # noqa
+from . import geometry, groupers  # noqa

 __version__ = _get_version()
27 changes: 27 additions & 0 deletions cf_xarray/groupers.py
@@ -0,0 +1,27 @@
import numpy as np
import pandas as pd
from xarray.groupers import EncodedGroups, Grouper


class FlagGrouper(Grouper):
    def factorize(self, group) -> EncodedGroups:
        assert "flag_values" in group.attrs
        assert "flag_meanings" in group.attrs

        values = np.array(group.attrs["flag_values"])
        full_index = pd.Index(group.attrs["flag_meanings"].split(" "))

        if group.dtype.kind in "iu" and (np.diff(values) == 1).all():
            # optimize
            codes = group.data - values[0].astype(int)
        else:
            codes, _ = pd.factorize(group.data.ravel())

        codes_da = group.copy(data=codes.reshape(group.shape))
        codes_da.attrs.pop("flag_values")
        codes_da.attrs.pop("flag_meanings")

        return EncodedGroups(codes=codes_da, full_index=full_index)

    def reset(self):
        pass
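A note on the `else` branch above: `pd.factorize` numbers values by order of first appearance, which need not match the order of `flag_values`. If, as I understand it, the codes are read as positions into `full_index`, the labels could end up mismatched when the flag values are not contiguous. The following is only a sketch of one possible alternative, using `pd.Index.get_indexer` to look up each value's position in `flag_values`. The flag metadata and data here are hypothetical and are not taken from this PR.

```python
import numpy as np
import pandas as pd

# Hypothetical flag metadata with a gap, so the contiguous fast path does not apply.
flag_values = np.array([1, 2, 4])
flag_meanings = "low medium high".split(" ")

# Hypothetical data; note that 4 appears before 1.
data = np.array([[4, 1], [2, 4]])

# pd.factorize assigns codes by first appearance: 4 -> 0, 1 -> 1, 2 -> 2.
appearance_codes, _ = pd.factorize(data.ravel())

# Looking each value up in flag_values gives its position in flag_meanings instead.
# Values that are absent from flag_values would get the code -1.
position_codes = pd.Index(flag_values).get_indexer(data.ravel())

codes = position_codes.reshape(data.shape)
labels = np.array(flag_meanings)[codes]
print(appearance_codes, position_codes)
print(labels)
```

With this lookup, the code for each element is independent of where it first occurs in the data, so position `i` always refers to `flag_meanings[i]`. Whether `-1` is the right treatment for unlisted values is a separate design decision that this sketch does not settle.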
15 changes: 15 additions & 0 deletions cf_xarray/tests/test_groupers.py
@@ -0,0 +1,15 @@
import numpy as np
from xarray.testing import assert_identical

from cf_xarray.datasets import flag_excl
from cf_xarray.groupers import FlagGrouper


def test_flag_grouper():
    ds = flag_excl.to_dataset().set_coords("flag_var")
    ds["foo"] = ("time", np.arange(8))
    actual = ds.groupby(flag_var=FlagGrouper()).mean()
    expected = ds.groupby("flag_var").mean()
    expected["flag_var"] = ["flag_1", "flag_2", "flag_3"]
    expected["flag_var"].attrs["standard_name"] = "flag_mutual_exclusive"
    assert_identical(actual, expected)
9 changes: 8 additions & 1 deletion doc/api.rst
@@ -21,13 +21,20 @@ Geometries
----------
.. autosummary::
    :toctree: generated/

    geometry.decode_geometries

    geometry.encode_geometries
    geometry.shapely_to_cf
    geometry.cf_to_shapely
    geometry.GeometryNames


Groupers
--------
.. autosummary::
    :toctree: generated/
    groupers.FlagGrouper

.. currentmodule:: xarray

DataArray
32 changes: 32 additions & 0 deletions doc/flags.md
@@ -60,6 +60,38 @@ You can also check whether a DataArray has the appropriate attributes to be reco
da.cf.is_flag_variable
```

## GroupBy

Flag variables, such as the one above, are a natural fit for GroupBy operations.
cf-xarray provides a `FlagGrouper` that understands the `flag_meanings` and `flag_values` attributes.

Let's load an example dataset where the `flag_var` array has the needed attributes.

```{code-cell}
import cf_xarray as cfxr
import numpy as np

from cf_xarray.datasets import flag_excl

ds = flag_excl.to_dataset().set_coords('flag_var')
ds["foo"] = ("time", np.arange(8))
ds.flag_var
```

Now use the {py:class}`~cf_xarray.groupers.FlagGrouper` to group by this flag variable:

```{code-cell}
from cf_xarray.groupers import FlagGrouper

ds.groupby(flag_var=FlagGrouper()).mean()
```

Note that the output `flag_var` coordinate is labeled with the names from `flag_meanings` rather than the raw flag values.

```{seealso}
See the Xarray docs on using [Grouper objects](https://docs.xarray.dev/en/stable/user-guide/groupby.html#grouper-objects).
```
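To make the relabeling concrete, here is a rough, hand-rolled equivalent in plain pandas. It is a sketch only and not how `FlagGrouper` is implemented: the flag data, the `foo` values, and all variable names are hypothetical stand-ins for the `flag_values` and `flag_meanings` attributes described above.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-ins for the attributes and data of a flag variable.
flag_values = [1, 2, 3]
flag_meanings = "flag_1 flag_2 flag_3".split(" ")
flags = pd.Series([1, 1, 2, 1, 2, 3, 3, 2])
foo = pd.Series(np.arange(8))

# Translate each flag value into its meaning, then group by the resulting labels.
value_to_meaning = dict(zip(flag_values, flag_meanings))
labels = flags.map(value_to_meaning)
means = foo.groupby(labels).mean()
print(means)
```

The grouper removes the need for this manual mapping step and keeps the result as an Xarray object with its coordinates intact.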

## Flag Masks

```{warning}