Commit 5355166

[skip-ci] Small updates to IO docs. (pydata#8452)
* [skip-ci] Small updates to IO docs.
* [skip-ci] Whats new
1 parent dfe6435 commit 5355166

2 files changed: +24 -18 lines changed

doc/user-guide/io.rst

Lines changed: 21 additions & 17 deletions
@@ -44,9 +44,9 @@ __ https://www.unidata.ucar.edu/software/netcdf/

 .. _netCDF FAQ: https://www.unidata.ucar.edu/software/netcdf/docs/faq.html#What-Is-netCDF

-Reading and writing netCDF files with xarray requires scipy or the
-`netCDF4-Python`__ library to be installed (the latter is required to
-read/write netCDF V4 files and use the compression options described below).
+Reading and writing netCDF files with xarray requires scipy, h5netcdf, or the
+`netCDF4-Python`__ library to be installed. SciPy only supports reading and writing
+of netCDF V3 files.

 __ https://github.com/Unidata/netcdf4-python
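
The revised paragraph names three possible backends and notes that SciPy is limited to netCDF V3. A minimal sketch of how a caller might choose among them is shown here. ``pick_engine`` and ``available_netcdf_engines`` are hypothetical helpers written for this illustration, not part of xarray; the engine names ``"netcdf4"``, ``"h5netcdf"`` and ``"scipy"`` are the values accepted by the ``engine`` argument of ``xarray.open_dataset``.

```python
from importlib.util import find_spec

# Map xarray's netCDF engine names to the module each one needs.
# The order doubles as a preference order (an assumption of this sketch).
ENGINE_MODULES = {
    "netcdf4": "netCDF4",
    "h5netcdf": "h5netcdf",
    "scipy": "scipy",
}


def available_netcdf_engines(modules=ENGINE_MODULES):
    """Return the engine names whose backing module can be located."""
    return [name for name, module in modules.items() if find_spec(module) is not None]


def pick_engine(need_netcdf4_format, engines=None):
    """Pick the first usable engine (hypothetical helper, not an xarray API)."""
    if engines is None:
        engines = available_netcdf_engines()
    for name in engines:
        # SciPy only reads and writes netCDF V3, so skip it when V4 is needed.
        if need_netcdf4_format and name == "scipy":
            continue
        return name
    raise ImportError("no suitable netCDF backend is installed")
```

The result could then be passed along as ``xr.open_dataset(path, engine=pick_engine(True))``, where ``path`` stands for a file of your own.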

@@ -675,8 +675,8 @@ the same as the one that was saved.

 .. note::

-    xarray does not write NCZarr attributes. Therefore, NCZarr data must be
-    opened in read-only mode.
+    xarray does not write `NCZarr <https://docs.unidata.ucar.edu/nug/current/nczarr_head.html>`_ attributes.
+    Therefore, NCZarr data must be opened in read-only mode.

 To store variable length strings, convert them to object arrays first with
 ``dtype=object``.
@@ -696,10 +696,10 @@ It is possible to read and write xarray datasets directly from / to cloud
 storage buckets using zarr. This example uses the `gcsfs`_ package to provide
 an interface to `Google Cloud Storage`_.

-From v0.16.2: general `fsspec`_ URLs are parsed and the store set up for you
-automatically when reading, such that you can open a dataset in a single
-call. You should include any arguments to the storage backend as the
-key ``storage_options``, part of ``backend_kwargs``.
+General `fsspec`_ URLs, those that begin with ``s3://`` or ``gcs://`` for example,
+are parsed and the store set up for you automatically when reading.
+You should include any arguments to the storage backend as the
+key ``storage_options``, part of ``backend_kwargs``.

 .. code:: python
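
The hunk above cuts off before the documentation's own example. As a rough sketch of the nesting the text describes (``storage_options`` as a key inside ``backend_kwargs``), the helper below builds the keyword arguments without contacting any bucket. ``remote_zarr_kwargs`` is a hypothetical name used only here, and the bucket in the comment is a placeholder.

```python
def remote_zarr_kwargs(**storage_options):
    """Build keyword arguments for ``xarray.open_dataset`` on an fsspec URL.

    Hypothetical helper: it only nests the options the way the text describes.
    """
    return {
        "engine": "zarr",
        "backend_kwargs": {"storage_options": dict(storage_options)},
    }


# ``anon`` is an s3fs option that requests unauthenticated access.
kwargs = remote_zarr_kwargs(anon=True)

# With xarray, zarr and s3fs installed, and a real bucket in place of the
# placeholder below, the call would look like:
#     import xarray as xr
#     ds = xr.open_dataset("s3://example-bucket/data.zarr", **kwargs)
```

Keeping the options in a plain dictionary makes it easy to reuse the same settings with ``open_mfdataset``.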
@@ -715,7 +715,7 @@ key ``storage_options``, part of ``backend_kwargs``.
 This also works with ``open_mfdataset``, allowing you to pass a list of paths or
 a URL to be interpreted as a glob string.

-For older versions, and for writing, you must explicitly set up a ``MutableMapping``
+For writing, you must explicitly set up a ``MutableMapping``
 instance and pass this, as follows:

 .. code:: python
@@ -769,10 +769,10 @@ Consolidated Metadata
 ~~~~~~~~~~~~~~~~~~~~~

 Xarray needs to read all of the zarr metadata when it opens a dataset.
-In some storage mediums, such as with cloud object storage (e.g. amazon S3),
+In some storage mediums, such as with cloud object storage (e.g. `Amazon S3`_),
 this can introduce significant overhead, because two separate HTTP calls to the
 object store must be made for each variable in the dataset.
-As of xarray version 0.18, xarray by default uses a feature called
+By default Xarray uses a feature called
 *consolidated metadata*, storing all metadata for the entire dataset with a
 single key (by default called ``.zmetadata``). This typically drastically speeds
 up opening the store. (For more information on this feature, consult the
@@ -796,16 +796,20 @@ reads. Because this fall-back option is so much slower, xarray issues a

 .. _io.zarr.appending:

-Appending to existing Zarr stores
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Modifying existing Zarr stores
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

 Xarray supports several ways of incrementally writing variables to a Zarr
 store. These options are useful for scenarios when it is infeasible or
 undesirable to write your entire dataset at once.

+1. Use ``mode='a'`` to add or overwrite entire variables,
+2. Use ``append_dim`` to resize and append to existing variables, and
+3. Use ``region`` to write to limited regions of existing arrays.
+
 .. tip::

-    If you can load all of your data into a single ``Dataset`` using dask, a
+    For ``Dataset`` objects containing dask arrays, a
     single call to ``to_zarr()`` will write all of your data in parallel.

 .. warning::
@@ -876,8 +880,8 @@ and then calling ``to_zarr`` with ``compute=False`` to write only metadata
     ds.to_zarr(path, compute=False)

 Now, a Zarr store with the correct variable shapes and attributes exists that
-can be filled out by subsequent calls to ``to_zarr``. ``region`` can be
-specified as ``"auto"``, which opens the existing store and determines the
+can be filled out by subsequent calls to ``to_zarr``.
+Setting ``region="auto"`` will open the existing store and determine the
 correct alignment of the new data with the existing coordinates, or as an
 explicit mapping from dimension names to Python ``slice`` objects indicating
 where the data should be written (in index space, not label space), e.g.,

doc/whats-new.rst

Lines changed: 3 additions & 1 deletion
@@ -37,7 +37,7 @@ Breaking changes
 ~~~~~~~~~~~~~~~~
 - drop support for `cdms2 <https://github.com/CDAT/cdms>`_. Please use
   `xcdat <https://github.com/xCDAT/xcdat>`_ instead (:pull:`8441`).
-  By `Justus Magin <https://github.com/keewis`_.
+  By `Justus Magin <https://github.com/keewis>`_.

 - Following pandas, :py:meth:`infer_freq` will return ``"Y"``, ``"YS"``,
   ``"QE"``, ``"ME"``, ``"h"``, ``"min"``, ``"s"``, ``"ms"``, ``"us"``, or
@@ -94,6 +94,8 @@ Bug fixes

 Documentation
 ~~~~~~~~~~~~~
+- Small updates to documentation on distributed writes to Zarr: see :ref:`io.zarr.appending`.
+  By `Deepak Cherian <https://github.com/dcherian>`_.


 Internal Changes
