Commit 4399d96

Merge branch 'main' into state-machine
* main: (26 commits)
  [pre-commit.ci] pre-commit autoupdate (pydata#8900)
  Bump the actions group with 1 update (pydata#8896)
  New empty whatsnew entry (pydata#8899)
  Update reference to 'Weighted quantile estimators' (pydata#8898)
  2024.03.0: Add whats-new (pydata#8891)
  Add typing to test_groupby.py (pydata#8890)
  Avoid in-place multiplication of a large value to an array with small integer dtype (pydata#8867)
  Check for aligned chunks when writing to existing variables (pydata#8459)
  Add dt.date to plottable types (pydata#8873)
  Optimize writes to existing Zarr stores. (pydata#8875)
  Allow multidimensional variable with same name as dim when constructing dataset via coords (pydata#8886)
  Don't allow overwriting indexes with region writes (pydata#8877)
  Migrate datatree.py module into xarray.core. (pydata#8789)
  warn and return bytes undecoded in case of UnicodeDecodeError in h5netcdf-backend (pydata#8874)
  groupby: Dispatch quantile to flox. (pydata#8720)
  Opt out of auto creating index variables (pydata#8711)
  Update docs on view / copies (pydata#8744)
  Handle .oindex and .vindex for the PandasMultiIndexingAdapter and PandasIndexingAdapter (pydata#8869)
  numpy 2.0 copy-keyword and trapz vs trapezoid (pydata#8865)
  upstream-dev CI: Fix interp and cumtrapz (pydata#8861)
  ...
2 parents a216531 + 97d3a3a commit 4399d96

60 files changed: +1341 -881 lines changed

.github/workflows/ci-additional.yaml

Lines changed: 4 additions & 4 deletions
@@ -127,7 +127,7 @@ jobs:
           python -m mypy --install-types --non-interactive --cobertura-xml-report mypy_report xarray/

       - name: Upload mypy coverage to Codecov
-        uses: codecov/codecov-action@v4.1.0
+        uses: codecov/codecov-action@v4.1.1
         with:
           file: mypy_report/cobertura.xml
           flags: mypy
@@ -181,7 +181,7 @@ jobs:
           python -m mypy --install-types --non-interactive --cobertura-xml-report mypy_report xarray/

       - name: Upload mypy coverage to Codecov
-        uses: codecov/codecov-action@v4.1.0
+        uses: codecov/codecov-action@v4.1.1
         with:
           file: mypy_report/cobertura.xml
           flags: mypy39
@@ -242,7 +242,7 @@ jobs:
           python -m pyright xarray/

       - name: Upload pyright coverage to Codecov
-        uses: codecov/codecov-action@v4.1.0
+        uses: codecov/codecov-action@v4.1.1
         with:
           file: pyright_report/cobertura.xml
           flags: pyright
@@ -301,7 +301,7 @@ jobs:
           python -m pyright xarray/

       - name: Upload pyright coverage to Codecov
-        uses: codecov/codecov-action@v4.1.0
+        uses: codecov/codecov-action@v4.1.1
         with:
           file: pyright_report/cobertura.xml
           flags: pyright39

.github/workflows/ci.yaml

Lines changed: 1 addition & 1 deletion
@@ -151,7 +151,7 @@ jobs:
           path: pytest.xml

       - name: Upload code coverage to Codecov
-        uses: codecov/codecov-action@v4.1.0
+        uses: codecov/codecov-action@v4.1.1
         with:
           file: ./coverage.xml
           flags: unittests

.github/workflows/upstream-dev-ci.yaml

Lines changed: 1 addition & 1 deletion
@@ -143,7 +143,7 @@ jobs:
         run: |
           python -m mypy --install-types --non-interactive --cobertura-xml-report mypy_report
       - name: Upload mypy coverage to Codecov
-        uses: codecov/codecov-action@v4.1.0
+        uses: codecov/codecov-action@v4.1.1
         with:
           file: mypy_report/cobertura.xml
           flags: mypy

.pre-commit-config.yaml

Lines changed: 4 additions & 4 deletions
@@ -13,24 +13,24 @@ repos:
       - id: mixed-line-ending
   - repo: https://github.com/astral-sh/ruff-pre-commit
     # Ruff version.
-    rev: 'v0.2.0'
+    rev: 'v0.3.4'
     hooks:
       - id: ruff
         args: ["--fix", "--show-fixes"]
   # https://github.com/python/black#version-control-integration
   - repo: https://github.com/psf/black-pre-commit-mirror
-    rev: 24.1.1
+    rev: 24.3.0
     hooks:
       - id: black-jupyter
   - repo: https://github.com/keewis/blackdoc
     rev: v0.3.9
     hooks:
       - id: blackdoc
         exclude: "generate_aggregations.py"
-        additional_dependencies: ["black==24.1.1"]
+        additional_dependencies: ["black==24.3.0"]
       - id: blackdoc-autoupdate-black
   - repo: https://github.com/pre-commit/mirrors-mypy
-    rev: v1.8.0
+    rev: v1.9.0
     hooks:
       - id: mypy
         # Copied from setup.cfg

doc/user-guide/indexing.rst

Lines changed: 4 additions & 2 deletions
@@ -748,7 +748,7 @@ Whether array indexing returns a view or a copy of the underlying
 data depends on the nature of the labels.

 For positional (integer)
-indexing, xarray follows the same rules as NumPy:
+indexing, xarray follows the same `rules`_ as NumPy:

 * Positional indexing with only integers and slices returns a view.
 * Positional indexing with arrays or lists returns a copy.
@@ -765,8 +765,10 @@ Whether data is a copy or a view is more predictable in xarray than in pandas, s
 unlike pandas, xarray does not produce `SettingWithCopy warnings`_. However, you
 should still avoid assignment with chained indexing.

-.. _SettingWithCopy warnings: https://pandas.pydata.org/pandas-docs/stable/indexing.html#returning-a-view-versus-a-copy
+Note that other operations (such as :py:meth:`~xarray.DataArray.values`) may also return views rather than copies.

+.. _SettingWithCopy warnings: https://pandas.pydata.org/pandas-docs/stable/indexing.html#returning-a-view-versus-a-copy
+.. _rules: https://numpy.org/doc/stable/user/basics.copies.html

 .. _multi-level indexing:
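Note on the indexing.rst change: the two bullet rules in the hunk above can be checked directly in NumPy. This is a minimal sketch using NumPy only, on the assumption (stated in the docs text) that xarray's positional indexing follows the same rules; the variable names are illustrative.

```python
import numpy as np

base = np.arange(10)

# Basic indexing (integers and slices) returns a view onto the same buffer.
view = base[2:5]
# Indexing with a list or array returns a copy with its own buffer.
copy = base[[2, 3, 4]]

view_shares = np.shares_memory(base, view)
copy_shares = np.shares_memory(base, copy)

# Writing through the view changes ``base``; writing to the copy does not.
view[0] = -1
copy[1] = -2
```

`np.shares_memory` is used here rather than comparing values because it reports whether the two arrays overlap in memory, which is exactly the view/copy distinction the docs describe.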

doc/whats-new.rst

Lines changed: 58 additions & 12 deletions
@@ -15,32 +15,65 @@ What's New
     np.random.seed(123456)


-.. _whats-new.2024.03.0:
+.. _whats-new.2024.04.0:

-v2024.03.0 (unreleased)
+v2024.04.0 (unreleased)
 -----------------------

 New Features
 ~~~~~~~~~~~~

+
+Breaking changes
+~~~~~~~~~~~~~~~~
+
+
+Bug fixes
+~~~~~~~~~
+
+
+Internal Changes
+~~~~~~~~~~~~~~~~
+
+
+.. _whats-new.2024.03.0:
+
+v2024.03.0 (Mar 29, 2024)
+-------------------------
+
+This release brings performance improvements for grouped and resampled quantile calculations, CF decoding improvements,
+minor optimizations to distributed Zarr writes, and compatibility fixes for Numpy 2.0 and Pandas 3.0.
+
+Thanks to the 18 contributors to this release:
+Anderson Banihirwe, Christoph Hasse, Deepak Cherian, Etienne Schalk, Justus Magin, Kai Mühlbauer, Kevin Schwarzwald, Mark Harfouche, Martin, Matt Savoie, Maximilian Roos, Ray Bell, Roberto Chang, Spencer Clark, Tom Nicholas, crusaderky, owenlittlejohns, saschahofmann
+
+New Features
+~~~~~~~~~~~~
+- Partial writes to existing chunks with ``region`` or ``append_dim`` will now raise an error
+  (unless ``safe_chunks=False``); previously an error would only be raised on
+  new variables. (:pull:`8459`, :issue:`8371`, :issue:`8882`)
+  By `Maximilian Roos <https://github.com/max-sixty>`_.
+- Grouped and resampling quantile calculations now use the vectorized algorithm in ``flox>=0.9.4`` if present.
+  By `Deepak Cherian <https://github.com/dcherian>`_.
 - Do not broadcast in arithmetic operations when global option ``arithmetic_broadcast=False``
   (:issue:`6806`, :pull:`8784`).
   By `Etienne Schalk <https://github.com/etienneschalk>`_ and `Deepak Cherian <https://github.com/dcherian>`_.
 - Add the ``.oindex`` property to Explicitly Indexed Arrays for orthogonal indexing functionality. (:issue:`8238`, :pull:`8750`)
   By `Anderson Banihirwe <https://github.com/andersy005>`_.
-
 - Add the ``.vindex`` property to Explicitly Indexed Arrays for vectorized indexing functionality. (:issue:`8238`, :pull:`8780`)
   By `Anderson Banihirwe <https://github.com/andersy005>`_.
-
 - Expand use of ``.oindex`` and ``.vindex`` properties. (:pull: `8790`)
   By `Anderson Banihirwe <https://github.com/andersy005>`_ and `Deepak Cherian <https://github.com/dcherian>`_.
+- Allow creating :py:class:`xr.Coordinates` objects with no indexes (:pull:`8711`)
+  By `Benoit Bovy <https://github.com/benbovy>`_ and `Tom Nicholas
+  <https://github.com/TomNicholas>`_.
+- Enable plotting of ``datetime.dates``. (:issue:`8866`, :pull:`8873`)
+  By `Sascha Hofmann <https://github.com/saschahofmann>`_.

 Breaking changes
 ~~~~~~~~~~~~~~~~
-
-
-Deprecations
-~~~~~~~~~~~~
+- Don't allow overwriting index variables with ``to_zarr`` region writes. (:issue:`8589`, :pull:`8876`).
+  By `Deepak Cherian <https://github.com/dcherian>`_.


 Bug fixes
@@ -57,16 +90,29 @@ Bug fixes
   `CFMaskCoder`/`CFScaleOffsetCoder` (:issue:`2304`, :issue:`5597`,
   :issue:`7691`, :pull:`8713`, see also discussion in :pull:`7654`).
   By `Kai Mühlbauer <https://github.com/kmuehlbauer>`_.
-
-Documentation
-~~~~~~~~~~~~~
-
+- Do not cast `_FillValue`/`missing_value` in `CFMaskCoder` if `_Unsigned` is provided
+  (:issue:`8844`, :pull:`8852`).
+- Adapt handling of copy keyword argument for numpy >= 2.0dev
+  (:issue:`8844`, :pull:`8851`, :pull:`8865`).
+  By `Kai Mühlbauer <https://github.com/kmuehlbauer>`_.
+- Import trapz/trapezoid depending on numpy version
+  (:issue:`8844`, :pull:`8865`).
+  By `Kai Mühlbauer <https://github.com/kmuehlbauer>`_.
+- Warn and return bytes undecoded in case of UnicodeDecodeError in h5netcdf-backend
+  (:issue:`5563`, :pull:`8874`).
+  By `Kai Mühlbauer <https://github.com/kmuehlbauer>`_.
+- Fix bug incorrectly disallowing creation of a dataset with a multidimensional coordinate variable with the same name as one of its dims.
+  (:issue:`8884`, :pull:`8886`)
+  By `Tom Nicholas <https://github.com/TomNicholas>`_.

 Internal Changes
 ~~~~~~~~~~~~~~~~
 - Migrates ``treenode`` functionality into ``xarray/core`` (:pull:`8757`)
   By `Matt Savoie <https://github.com/flamingbear>`_ and `Tom Nicholas
   <https://github.com/TomNicholas>`_.
+- Migrates ``datatree`` functionality into ``xarray/core``. (:pull: `8789`)
+  By `Owen Littlejohns <https://github.com/owenlittlejohns>`_, `Matt Savoie
+  <https://github.com/flamingbear>`_ and `Tom Nicholas <https://github.com/TomNicholas>`_.


 .. _whats-new.2024.02.0:
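Note on the "Import trapz/trapezoid depending on numpy version" entry: NumPy 2.0 introduced `np.trapezoid` and deprecated `np.trapz`. The snippet below is one way such a version-dependent lookup could be written. It is a sketch, not the code xarray ships, and the name `trapezoid` is chosen here for illustration.

```python
import numpy as np

# Prefer the NumPy >= 2.0 name; fall back to the older ``np.trapz`` otherwise.
trapezoid = getattr(np, "trapezoid", None)
if trapezoid is None:
    trapezoid = np.trapz

# Trapezoidal rule with the default unit spacing between samples.
area = trapezoid([1.0, 2.0, 3.0])
```

Looking the attribute up with `getattr` avoids parsing version strings and works for development builds as well as releases.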

pyproject.toml

Lines changed: 0 additions & 1 deletion
@@ -171,7 +171,6 @@ module = [
   "xarray.tests.test_dask",
   "xarray.tests.test_dataarray",
   "xarray.tests.test_duck_array_ops",
-  "xarray.tests.test_groupby",
   "xarray.tests.test_indexing",
   "xarray.tests.test_merge",
   "xarray.tests.test_missing",

xarray/backends/api.py

Lines changed: 6 additions & 14 deletions
@@ -69,7 +69,7 @@
     T_NetcdfTypes = Literal[
         "NETCDF4", "NETCDF4_CLASSIC", "NETCDF3_64BIT", "NETCDF3_CLASSIC"
     ]
-    from xarray.datatree_.datatree import DataTree
+    from xarray.core.datatree import DataTree

 DATAARRAY_NAME = "__xarray_dataarray_name__"
 DATAARRAY_VARIABLE = "__xarray_dataarray_variable__"
@@ -1562,24 +1562,19 @@ def _auto_detect_regions(ds, region, open_kwargs):
     return region


-def _validate_and_autodetect_region(
-    ds, region, mode, open_kwargs
-) -> tuple[dict[str, slice], bool]:
+def _validate_and_autodetect_region(ds, region, mode, open_kwargs) -> dict[str, slice]:
     if region == "auto":
         region = {dim: "auto" for dim in ds.dims}

     if not isinstance(region, dict):
         raise TypeError(f"``region`` must be a dict, got {type(region)}")

     if any(v == "auto" for v in region.values()):
-        region_was_autodetected = True
         if mode != "r+":
             raise ValueError(
                 f"``mode`` must be 'r+' when using ``region='auto'``, got {mode}"
             )
         region = _auto_detect_regions(ds, region, open_kwargs)
-    else:
-        region_was_autodetected = False

     for k, v in region.items():
         if k not in ds.dims:
@@ -1612,7 +1607,7 @@ def _validate_and_autodetect_region(
             f".drop_vars({non_matching_vars!r})"
         )

-    return region, region_was_autodetected
+    return region


 def _validate_datatypes_for_zarr_append(zstore, dataset):
@@ -1784,12 +1779,9 @@ def to_zarr(
             storage_options=storage_options,
             zarr_version=zarr_version,
         )
-        region, region_was_autodetected = _validate_and_autodetect_region(
-            dataset, region, mode, open_kwargs
-        )
-        # drop indices to avoid potential race condition with auto region
-        if region_was_autodetected:
-            dataset = dataset.drop_vars(dataset.indexes)
+        region = _validate_and_autodetect_region(dataset, region, mode, open_kwargs)
+        # can't modify indexed with region writes
+        dataset = dataset.drop_vars(dataset.indexes)
         if append_dim is not None and append_dim in region:
             raise ValueError(
                 f"cannot list the same dimension in both ``append_dim`` and "
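Note on the api.py change: the helper now returns only the region mapping, and index variables are dropped on every region write rather than only when the region was auto-detected. The sketch below models that contract on plain Python objects so it runs without xarray or zarr. `validate_region`, `drop_index_variables` and the `detect` callback are hypothetical stand-ins, with `detect` taking the place of `_auto_detect_regions`; none of them is xarray API.

```python
def validate_region(dims, region, mode, detect=None):
    """Return a validated ``{dim: slice}`` mapping, with no extra flag."""
    if region == "auto":
        region = {dim: "auto" for dim in dims}
    if not isinstance(region, dict):
        raise TypeError(f"``region`` must be a dict, got {type(region)}")
    if any(v == "auto" for v in region.values()):
        if mode != "r+":
            raise ValueError(
                f"``mode`` must be 'r+' when using ``region='auto'``, got {mode}"
            )
        # Hypothetical: resolve each "auto" entry through the callback.
        region = {k: detect(k) if v == "auto" else v for k, v in region.items()}
    for k in region:
        if k not in dims:
            raise ValueError(f"region key {k!r} is not a dimension: {list(dims)}")
    return region


def drop_index_variables(variables, index_names):
    """Mimic ``dataset.drop_vars(dataset.indexes)`` on a plain dict."""
    return {k: v for k, v in variables.items() if k not in index_names}
```

Because the drop no longer depends on how the region was obtained, the caller does not need a second return value, which is what lets the signature shrink to a single dict.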

xarray/backends/common.py

Lines changed: 2 additions & 2 deletions
@@ -23,8 +23,8 @@
     from netCDF4 import Dataset as ncDataset

     from xarray.core.dataset import Dataset
+    from xarray.core.datatree import DataTree
     from xarray.core.types import NestedSequence
-    from xarray.datatree_.datatree import DataTree

 # Create a logger object, but don't add any handlers. Leave that to user code.
 logger = logging.getLogger(__name__)
@@ -137,8 +137,8 @@ def _open_datatree_netcdf(
     **kwargs,
 ) -> DataTree:
     from xarray.backends.api import open_dataset
+    from xarray.core.datatree import DataTree
     from xarray.core.treenode import NodePath
-    from xarray.datatree_.datatree import DataTree

     ds = open_dataset(filename_or_obj, **kwargs)
     tree_root = DataTree.from_dict({"/": ds})

xarray/backends/h5netcdf_.py

Lines changed: 12 additions & 9 deletions
@@ -28,6 +28,7 @@
 from xarray.core import indexing
 from xarray.core.utils import (
     FrozenDict,
+    emit_user_level_warning,
     is_remote_uri,
     read_magic_number_from_file,
     try_read_magic_number_from_file_or_path,
@@ -39,7 +40,7 @@

     from xarray.backends.common import AbstractDataStore
     from xarray.core.dataset import Dataset
-    from xarray.datatree_.datatree import DataTree
+    from xarray.core.datatree import DataTree


 class H5NetCDFArrayWrapper(BaseNetCDF4Array):
@@ -58,21 +59,23 @@ def _getitem(self, key):
             return array[key]


-def maybe_decode_bytes(txt):
-    if isinstance(txt, bytes):
-        return txt.decode("utf-8")
-    else:
-        return txt
-
-
 def _read_attributes(h5netcdf_var):
     # GH451
     # to ensure conventions decoding works properly on Python 3, decode all
     # bytes attributes to strings
     attrs = {}
     for k, v in h5netcdf_var.attrs.items():
         if k not in ["_FillValue", "missing_value"]:
-            v = maybe_decode_bytes(v)
+            if isinstance(v, bytes):
+                try:
+                    v = v.decode("utf-8")
+                except UnicodeDecodeError:
+                    emit_user_level_warning(
+                        f"'utf-8' codec can't decode bytes for attribute "
+                        f"{k!r} of h5netcdf object {h5netcdf_var.name!r}, "
+                        f"returning bytes undecoded.",
+                        UnicodeWarning,
+                    )
         attrs[k] = v
     return attrs
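Note on the h5netcdf_.py change: the new branch tries UTF-8 decoding and, on failure, warns and keeps the raw bytes instead of raising. The standalone sketch below reproduces that behaviour with the standard library only. `decode_attr` is a hypothetical name, and `warnings.warn` is substituted for xarray's internal `emit_user_level_warning`.

```python
import warnings


def decode_attr(name, value):
    """Return ``value`` decoded as UTF-8 when possible, else unchanged."""
    if isinstance(value, bytes):
        try:
            return value.decode("utf-8")
        except UnicodeDecodeError:
            # Assumption: a plain UnicodeWarning stands in for xarray's helper.
            warnings.warn(
                f"'utf-8' codec can't decode bytes for attribute {name!r}, "
                "returning bytes undecoded.",
                UnicodeWarning,
                stacklevel=2,
            )
    return value
```

Callers that need to detect the fallback can check `isinstance(result, bytes)`, since a successfully decoded attribute is always a `str`.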
