Commit 18f41d4

add a section showing the parse_data_type function (#3249)
1 parent 8405073 commit 18f41d4

3 files changed: +72 -1 lines changed

changes/3249.doc.rst

Lines changed: 2 additions & 0 deletions
@@ -0,0 +1,2 @@
+Expand the data type docs to include a demonstration of the ``parse_data_type`` function.
+Expand the docstring for the ``parse_data_type`` function.

docs/user-guide/data_types.rst

Lines changed: 44 additions & 1 deletion
@@ -409,4 +409,47 @@ We want to avoid a situation where the same native data type matches multiple Za
 a NumPy data type should *uniquely* specify a single Zarr data type. But data type resolution is
 dynamic, so it's not possible to statically guarantee this uniqueness constraint. Therefore, we
 attempt data type resolution against *every* data type class, and if, for some reason, a native data
-type matches multiple Zarr data types, we treat this as an error and raise an exception.
+type matches multiple Zarr data types, we treat this as an error and raise an exception.
+
+If you have a NumPy data type and you want to get the corresponding ``ZDType`` instance, you can use
+the ``parse_data_type`` function, which will use the dynamic resolution described above. ``parse_data_type``
+handles a range of input types:
+
+- NumPy data types:
+
+  .. code-block:: python
+
+    >>> import numpy as np
+    >>> from zarr.dtype import parse_data_type
+    >>> my_dtype = np.dtype('>M8[10s]')
+    >>> parse_data_type(my_dtype, zarr_format=2)
+    DateTime64(endianness='big', scale_factor=10, unit='s')
+
+
+- NumPy data type-compatible strings:
+
+  .. code-block:: python
+
+    >>> dtype_str = '>M8[10s]'
+    >>> parse_data_type(dtype_str, zarr_format=2)
+    DateTime64(endianness='big', scale_factor=10, unit='s')
+
+- ``ZDType`` instances:
+
+  .. code-block:: python
+
+    >>> from zarr.dtype import DateTime64
+    >>> zdt = DateTime64(endianness='big', scale_factor=10, unit='s')
+    >>> parse_data_type(zdt, zarr_format=2) # Use a ZDType (this is a no-op)
+    DateTime64(endianness='big', scale_factor=10, unit='s')
+
+- Python dictionaries (requires ``zarr_format=3``). These dictionaries must be consistent with the
+  ``JSON`` form of the data type:
+
+  .. code-block:: python
+
+    >>> dt_dict = {"name": "numpy.datetime64", "configuration": {"unit": "s", "scale_factor": 10}}
+    >>> parse_data_type(dt_dict, zarr_format=3)
+    DateTime64(endianness='little', scale_factor=10, unit='s')
+    >>> parse_data_type(dt_dict, zarr_format=3).to_json(zarr_format=3)
+    {'name': 'numpy.datetime64', 'configuration': {'unit': 's', 'scale_factor': 10}}
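
Reviewer note: the context lines of this hunk describe the resolution strategy (try *every* data type class, and raise if a native data type matches more than one). The following is a minimal, self-contained sketch of that idea, not the zarr implementation. The names ``NoMatch``, ``ToyInt32``, ``ToyFloat64``, ``ToyGreedy``, ``from_native`` and ``resolve`` are hypothetical, and plain strings stand in for native NumPy data types.

```python
from dataclasses import dataclass


class NoMatch(Exception):
    """Raised by a toy data type class that does not recognise the input."""


@dataclass(frozen=True)
class ToyInt32:
    # Hypothetical stand-in for a ZDType subclass; not part of zarr.
    @classmethod
    def from_native(cls, native: str) -> "ToyInt32":
        if native in {"int32", "<i4"}:
            return cls()
        raise NoMatch(native)


@dataclass(frozen=True)
class ToyFloat64:
    # Hypothetical stand-in for a ZDType subclass; not part of zarr.
    @classmethod
    def from_native(cls, native: str) -> "ToyFloat64":
        if native in {"float64", "<f8"}:
            return cls()
        raise NoMatch(native)


@dataclass(frozen=True)
class ToyGreedy:
    # Deliberately overlaps with ToyInt32 to trigger the ambiguity error.
    @classmethod
    def from_native(cls, native: str) -> "ToyGreedy":
        if native.endswith("i4"):
            return cls()
        raise NoMatch(native)


def resolve(native, candidates):
    """Try every candidate class and require exactly one match."""
    matches = []
    for candidate in candidates:
        try:
            matches.append(candidate.from_native(native))
        except NoMatch:
            continue  # this class does not apply; keep checking the rest
    if not matches:
        raise ValueError(f"no data type matches {native!r}")
    if len(matches) > 1:
        names = [type(m).__name__ for m in matches]
        raise ValueError(f"{native!r} matches multiple data types: {names}")
    return matches[0]
```

With ``[ToyInt32, ToyFloat64]`` as candidates, ``resolve("<i4", ...)`` returns a ``ToyInt32`` instance; adding ``ToyGreedy`` to the list makes the same call raise ``ValueError``, because uniqueness can only be checked at resolution time.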

src/zarr/core/dtype/__init__.py

Lines changed: 26 additions & 0 deletions
@@ -189,6 +189,32 @@ def parse_data_type(
 ) -> ZDType[TBaseDType, TBaseScalar]:
     """
     Interpret the input as a ZDType instance.
+
+    Parameters
+    ----------
+    dtype_spec : ZDTypeLike
+        The input to be interpreted as a ZDType instance. This could be a native data type
+        (e.g., a NumPy data type), a Python object that can be converted into a native data type,
+        a ZDType instance (in which case the input is returned unchanged), or a JSON object
+        representation of a data type.
+    zarr_format : ZarrFormat
+        The zarr format version.
+
+    Returns
+    -------
+    ZDType[TBaseDType, TBaseScalar]
+        The ZDType instance corresponding to the input.
+
+    Examples
+    --------
+    >>> from zarr.dtype import parse_data_type
+    >>> import numpy as np
+    >>> parse_data_type("int32", zarr_format=2)
+    Int32(endianness='little')
+    >>> parse_data_type(np.dtype('S10'), zarr_format=2)
+    NullTerminatedBytes(length=10)
+    >>> parse_data_type({"name": "numpy.datetime64", "configuration": {"unit": "s", "scale_factor": 10}}, zarr_format=3)
+    DateTime64(endianness='little', scale_factor=10, unit='s')
     """
     if isinstance(dtype_spec, ZDType):
         return dtype_spec
