ZEP9 (phase 1): add clarifications for extension naming #330

Merged
merged 67 commits into from
Apr 17, 2025
Changes from 66 commits
Commits
65bc69f
Merge Davis proposal with ZEP0009
joshmoore Feb 12, 2025
b109fb7
Start changelog
joshmoore Feb 12, 2025
4c0e494
Add definitions
joshmoore Feb 12, 2025
05b4fa4
Fix definitions
joshmoore Feb 12, 2025
f9508d4
slightly longer change log
joshmoore Feb 12, 2025
34ac282
New extensions section
joshmoore Feb 12, 2025
16e34ca
Update array metadata section
joshmoore Feb 12, 2025
d2f6f9d
Update group metadata section
joshmoore Feb 12, 2025
1d85e70
Clean the extension listing pages
joshmoore Feb 12, 2025
43e3862
Also list no datatypes as defined
joshmoore Feb 12, 2025
c1accfe
Link more terms to extensions
joshmoore Feb 12, 2025
454faaf
More crosslinks and identifier clarifications
joshmoore Feb 12, 2025
ef69ff1
add zarr-extensions repo
normanrz Feb 17, 2025
db7db15
Merge remote-tracking branch 'origin/main' into zep9-ext-naming
normanrz Feb 17, 2025
3d448c1
Remove TODOs with PR and repo link
joshmoore Feb 18, 2025
e6200c8
Move 'core data types' to a subpage
joshmoore Feb 18, 2025
0e0a03b
Clarify concept of 'core'
joshmoore Feb 18, 2025
1600ee9
Unify listing of all extensions on subpages
joshmoore Feb 18, 2025
429988a
Rename core/v3.0 to core/index
joshmoore Feb 18, 2025
c6fb150
Correct extensions table links
joshmoore Feb 18, 2025
46630f7
Add 'core' to each of the subpages
joshmoore Feb 18, 2025
20d6457
Simplify subpage headers
joshmoore Feb 19, 2025
a1d52b1
simplify reference to extensions in field definition
joshmoore Feb 19, 2025
2b04661
simplify reference to extensions in field definition
joshmoore Feb 19, 2025
1fa95e8
Implement suggestions from d-v-b and s/URI/URL/
joshmoore Feb 20, 2025
95eef76
Move all v3 subdocs to index.rst
joshmoore Feb 20, 2025
fa0bdf6
Minor correction to a ref
joshmoore Feb 20, 2025
3d86775
Add chunk key and grid subdocuments
joshmoore Feb 20, 2025
7cfa69b
Clarify ext pts vs exts
joshmoore Feb 20, 2025
9fdbd81
Clarify version policy applies independently to each page
joshmoore Feb 21, 2025
7eef2b3
Make terms in ext list nicer
joshmoore Feb 21, 2025
9e0f9d3
Drop versions from spec URIs
joshmoore Feb 21, 2025
4e1bec8
Fix plurality of chunk grids page
joshmoore Feb 21, 2025
f2c977d
Unify all index pages into subdirectories
joshmoore Feb 21, 2025
d8c88ec
Catch a few last references of URIs rather than URLs
joshmoore Feb 21, 2025
3844fc9
Improve changelog
joshmoore Feb 21, 2025
447ad8c
Add Norman
joshmoore Feb 21, 2025
6de00f1
Apply suggestions from code review
joshmoore Feb 24, 2025
a791ae5
fill_value
normanrz Feb 24, 2025
4ba9e3e
versioning
normanrz Feb 24, 2025
6e7da25
Make must_understand a section
joshmoore Mar 1, 2025
8fd39d2
start with a lower-case letter
joshmoore Mar 1, 2025
65e46f8
unify plurality of ext lists
joshmoore Mar 1, 2025
90243d5
Improve URL description
joshmoore Mar 1, 2025
73cce52
discourage top-level must_understand=false
joshmoore Mar 1, 2025
a0c3170
Add author guidance
joshmoore Mar 1, 2025
dbeb796
change extension example
normanrz Mar 1, 2025
2724648
use namespaced names
rabernat Mar 6, 2025
5c03a24
add URI as names section
rabernat Mar 13, 2025
1ba5e41
Update docs/v3/core/index.rst
joshmoore Mar 16, 2025
c4a7215
Merge pull request #1 from rabernat/tweak-zep9-ext-naming
joshmoore Mar 16, 2025
9eea54e
Minor updates to front matter
joshmoore Mar 16, 2025
40732c0
Move fill_value to data_type section
joshmoore Mar 16, 2025
7bb1cb1
Rename sections
joshmoore Mar 17, 2025
7b1baff
Merge multiple edits
joshmoore Mar 25, 2025
aa9d87e
cleanup
joshmoore Mar 26, 2025
fac16fb
Fix objection typo
joshmoore Mar 26, 2025
b24440e
Minor fix
joshmoore Mar 10, 2025
2861c84
Merge remote-tracking branch 'origin/main' into zep9-ext-naming
joshmoore Mar 26, 2025
2d684a4
Apply suggestions from code review
joshmoore Mar 29, 2025
bf12d64
Introduction of tag:
joshmoore Mar 29, 2025
97c483b
Update docs/v3/core/index.rst
joshmoore Mar 29, 2025
6c1a027
Reduce to minimal change
joshmoore Apr 2, 2025
8dc7796
Add feedback from ZSC
joshmoore Apr 2, 2025
4ac0126
Correct whitespace
joshmoore Apr 2, 2025
dd90a3f
Remove confusingly redundant must_understand block from groups
joshmoore Apr 2, 2025
7ee6d31
Make chunk-key-encoding info a warning
joshmoore Apr 17, 2025
14 changes: 14 additions & 0 deletions docs/conf.py
@@ -87,4 +87,18 @@

 redirects = {
 "index": "specs.html",
+"v3/core/v3.0.html": "./index.html",
+"v3/codecs/blosc/v1.0.rst": "./index.html",
+"v3/codecs/bytes/v1.0.rst": "./index.html",
+"v3/codecs/crc32c/v1.0.rst": "./index.html",
+"v3/codecs/gzip/v1.0.rst": "./index.html",
+"v3/codecs/sharding-indexed/v1.0.rst": "./index.html",
+"v3/codecs/transpose/v1.0.rst": "./index.html",
+"v3/stores/filesystem/v1.0.rst": "./index.html",
+"v3/chunk-grid.rst": "chunk-grids/index.rst",
+"v3/chunk-key-encoding.rst": "chunk-key-encodings/index.html",
+"v3/codecs.rst": "codecs/index.html",
+"v3/data-types.rst": "data-types/index.html",
+"v3/array-storage-transformers.rst": "storage-transformers/index.html",
+"v3/stores.rst": "stores/index.html",
 }
2 changes: 1 addition & 1 deletion docs/index.rst
@@ -2,7 +2,7 @@
Specs
=====

-A good starting point is the :ref:`zarr-core-specification-v3.0`.
+A good starting point is the :ref:`zarr-core-specification-v3`.

.. toctree::

12 changes: 7 additions & 5 deletions docs/specs.rst
@@ -8,11 +8,13 @@ Specifications
:maxdepth: 1
:caption: v3

-Core <v3/core/v3.0>
-v3/data-types
-v3/codecs
-v3/stores
-v3/array-storage-transformers
+Core <v3/core/index>
+v3/codecs/index
+v3/chunk-grids/index
+v3/chunk-key-encodings/index
+v3/data-types/index
+v3/stores/index
+v3/storage-transformers/index

.. toctree::
:maxdepth: 1
13 changes: 0 additions & 13 deletions docs/v3/array-storage-transformers.rst

This file was deleted.

22 changes: 22 additions & 0 deletions docs/v3/chunk-grids/index.rst
@@ -0,0 +1,22 @@
.. _chunk-grid-list:

===========
Chunk Grids
===========

The following documents specify chunk grids which SHOULD
be implemented by all implementations.

.. toctree::
:glob:
:maxdepth: 1
:titlesonly:
:caption: Contents:

*/*

Extensions
----------

Registered chunk grid extensions can be found under
`zarr-extensions::chunk-grids <https://github.com/zarr-developers/zarr-extensions/tree/main/chunk-grids>`_.
117 changes: 117 additions & 0 deletions docs/v3/chunk-grids/regular-grid/index.rst
@@ -0,0 +1,117 @@

.. _regular-chunkgrid:

==================
Regular chunk grid
==================

Version:
1.0
Specification URI:
https://zarr-specs.readthedocs.io/en/latest/v3/chunk-grids/regular-grid/
Corresponding ZEP:
`ZEP0001 — Zarr specification version 3 <https://zarr.dev/zeps/draft/ZEP0001.html>`_
Issue tracking:
`GitHub issues <https://github.com/zarr-developers/zarr-specs/labels/chunk-grid>`_
Suggest an edit for this spec:
`GitHub editor <https://github.com/zarr-developers/zarr-specs/blob/main/docs/v3/chunk-grids/regular-grid/index.rst>`_

Copyright 2020-Present Zarr core development team. This work
is licensed under a `Creative Commons Attribution 3.0 Unported License
<https://creativecommons.org/licenses/by/3.0/>`_.

----

Abstract
========

A regular grid is a type of grid where an array is divided into chunks
such that each chunk is a hyperrectangle of the same shape. The
dimensionality of the grid is the same as the dimensionality of the
array. Each chunk in the grid can be addressed by a tuple of non-negative
integers (`k`, `j`, `i`, ...) corresponding to the indices of the
chunk along each dimension.

Description
===========

The origin element of a chunk has coordinates in the array space (`k` *
`dz`, `j` * `dy`, `i` * `dx`, ...) where (`dz`, `dy`, `dx`, ...) are
the chunk sizes along each dimension.
Thus the origin element of the chunk at grid index (0, 0, 0,
...) is at coordinate (0, 0, 0, ...) in the array space, i.e., the
grid is aligned with the origin of the array. If the length of any
array dimension is not perfectly divisible by the chunk length along
the same dimension, then the grid will overhang the edge of the array
space.

The shape of the chunk grid will be (ceil(`z` / `dz`), ceil(`y` /
`dy`), ceil(`x` / `dx`), ...) where (`z`, `y`, `x`, ...) is the array
shape, "/" is the division operator and "ceil" is the ceiling
function. For example, if a 3 dimensional array has shape (10, 200,
3000), and has chunk shape (5, 20, 400), then the shape of the chunk
grid will be (2, 10, 8), meaning that there will be 2 chunks along the
first dimension, 10 along the second dimension, and 8 along the third
dimension.

.. list-table:: Regular Grid Example
:header-rows: 1

* - Array Shape
- Chunk Shape
- Chunk Grid Shape
- Notes
* - (10, 200, 3000)
- (5, 20, 400)
- (2, 10, 8)
- The grid overhangs the edge of the array along the third dimension (8 * 400 = 3200, which exceeds 3000).
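
As a minimal sketch of the grid-shape formula above, the following computes the ceiling division per dimension and the resulting overhang. The function names ``chunk_grid_shape`` and ``overhang`` are illustrative assumptions and are not defined by this specification.

```python
def chunk_grid_shape(array_shape, chunk_shape):
    """Return the number of chunks along each dimension of a regular grid."""
    if len(array_shape) != len(chunk_shape):
        raise ValueError("array_shape and chunk_shape must have the same length")
    # -(-n // d) is integer ceiling division, i.e. ceil(n / d) without floats.
    return tuple(-(-n // d) for n, d in zip(array_shape, chunk_shape))


def overhang(array_shape, chunk_shape):
    """Return how many elements the grid extends past the array per dimension."""
    grid = chunk_grid_shape(array_shape, chunk_shape)
    return tuple(g * d - n for g, d, n in zip(grid, chunk_shape, array_shape))


print(chunk_grid_shape((10, 200, 3000), (5, 20, 400)))  # (2, 10, 8)
print(overhang((10, 200, 3000), (5, 20, 400)))          # (0, 0, 200)
```

Integer ceiling division is used here instead of ``math.ceil`` on a float quotient so that very large shapes are not affected by floating-point rounding.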

An element of an array with coordinates (`c`, `b`, `a`, ...) will
occur within the chunk at grid index (`c` // `dz`, `b` // `dy`, `a` //
`dx`, ...), where "//" is the floor division operator. The element
will have coordinates (`c` % `dz`, `b` % `dy`, `a` % `dx`, ...) within
that chunk, where "%" is the modulo operator. For example, if a
3 dimensional array has shape (10, 200, 3000), and has chunk shape
(5, 20, 400), then the element of the array with coordinates (7, 150, 900)
is contained within the chunk at grid index (1, 7, 2) and has coordinates
(2, 10, 100) within that chunk.
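
The mapping from array coordinates to a grid index and within-chunk coordinates can be sketched with floor division and modulo as described above. The helper name ``locate_element`` is a hypothetical choice for illustration, and the sketch assumes non-negative coordinates.

```python
def locate_element(coords, chunk_shape):
    """Map array coordinates to (grid index, coordinates within that chunk)."""
    if len(coords) != len(chunk_shape):
        raise ValueError("coords and chunk_shape must have the same length")
    # divmod(c, d) returns (c // d, c % d) in one step.
    pairs = [divmod(c, d) for c, d in zip(coords, chunk_shape)]
    grid_index = tuple(q for q, _ in pairs)
    within_chunk = tuple(r for _, r in pairs)
    return grid_index, within_chunk


print(locate_element((7, 150, 900), (5, 20, 400)))
# ((1, 7, 2), (2, 10, 100))
```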

The store key corresponding to a given grid cell is determined based on the
:ref:`array-metadata-chunk-key-encoding` member of the :ref:`array-metadata`.

Note that this specification does not consider the case where the
chunk grid and the array space are not aligned at the origin vertices
of the array and the chunk at grid index (0, 0, 0, ...). However,
extensions may define variations on the regular grid type
such that the grid indices may include negative integers, and the
origin element of the array may occur at an arbitrary position within
any chunk, which is required to allow arrays to be extended by an
arbitrary length in a "negative" direction along any dimension.

.. note:: Chunks at the border of an array always have the full chunk size, even when
the array only covers part of the chunk. For example, for an array with ``"shape": [30, 30]`` and
``"chunk_shape": [16, 16]``, the chunk ``0,1`` would also contain unused values for the indices
``0-15, 30-31``. When writing such chunks it is recommended to use the current fill value
for elements outside the bounds of the array.
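
To illustrate the border-chunk note, the sketch below computes how many elements of a chunk fall inside the array along each dimension; the remaining positions are the ones a writer would be expected to fill with the fill value. The name ``valid_extent`` is an assumption made for this example only.

```python
def valid_extent(grid_index, array_shape, chunk_shape):
    """Return the number of in-bounds elements per dimension for one chunk."""
    return tuple(
        # The chunk starts at g * d; at most d elements fit, fewer at the border.
        max(0, min(d, n - g * d))
        for g, n, d in zip(grid_index, array_shape, chunk_shape)
    )


# Chunk (0, 1) of a 30 x 30 array with 16 x 16 chunks spans rows 0 to 15
# and columns 16 to 31; only columns 16 to 29 lie inside the array.
print(valid_extent((0, 1), (30, 30), (16, 16)))  # (16, 14)
```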



Status of this document
=======================

ZEP0001 was accepted on May 15th, 2023 via https://github.com/zarr-developers/zarr-specs/issues/227.


Document conventions
====================

Conformance requirements are expressed with a combination of
descriptive assertions and [RFC2119]_ terminology. The key words
"MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD",
"SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in the normative
parts of this document are to be interpreted as described in
[RFC2119]_. However, for readability, these words do not appear in all
uppercase letters in this specification.

All of the text of this specification is normative except sections
explicitly marked as non-normative, examples, and notes. Examples in
70 changes: 70 additions & 0 deletions docs/v3/chunk-key-encodings/default/index.rst
@@ -0,0 +1,70 @@
.. _default-chunkkeyencoding:

==========================
Default chunk key encoding
==========================

Version:
1.0
Specification URI:
https://zarr-specs.readthedocs.io/en/latest/v3/chunk-key-encodings/default/
Corresponding ZEP:
`ZEP0001 — Zarr specification version 3 <https://zarr.dev/zeps/draft/ZEP0001.html>`_
Issue tracking:
`GitHub issues <https://github.com/zarr-developers/zarr-specs/labels/chunk-grid>`_
Suggest an edit for this spec:
`GitHub editor <https://github.com/zarr-developers/zarr-specs/blob/main/docs/v3/chunk-key-encodings/default/index.rst>`_

Copyright 2020-Present Zarr core development team. This work
is licensed under a `Creative Commons Attribution 3.0 Unported License
<https://creativecommons.org/licenses/by/3.0/>`_.

----

Description
===========

The ``configuration`` object may contain one optional member,
``separator``, which must be either ``"/"`` or ``"."``. If not specified,
``separator`` defaults to ``"/"``.

The key for a chunk with grid index (``k``, ``j``, ``i``, ...) is
formed by taking the initial prefix ``c``, and appending for each dimension:

- the ``separator`` character, followed by,

- the ASCII decimal string representation of the chunk index within that dimension.

For example, in a 3 dimensional array, with a separator of ``/`` the identifier
for the chunk at grid index (1, 23, 45) is the string ``"c/1/23/45"``. With a
separator of ``.``, the identifier is the string ``"c.1.23.45"``. The initial prefix
``c`` ensures that metadata documents and chunks have separate prefixes.

.. note:: A main difference from spec v2 is that the default chunk separator
changed from ``.`` to ``/``, as in N5. This decreases the maximum number of
items per directory in hierarchical stores such as directory stores.

.. note:: Arrays may have 0 dimensions (when for example representing scalars),
in which case the coordinate of a chunk is the empty tuple, and the chunk key
will consist of the string ``c``.
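
A minimal sketch of the default encoding described above, including the zero-dimensional case. The function name ``encode_default_chunk_key`` is hypothetical and not part of the specification.

```python
def encode_default_chunk_key(grid_index, separator="/"):
    """Build a chunk key using the default chunk key encoding."""
    if separator not in ("/", "."):
        raise ValueError('separator must be "/" or "."')
    # Start from the prefix "c" and append separator + index for each dimension.
    return "c" + "".join(f"{separator}{i}" for i in grid_index)


print(encode_default_chunk_key((1, 23, 45)))       # c/1/23/45
print(encode_default_chunk_key((1, 23, 45), "."))  # c.1.23.45
print(encode_default_chunk_key(()))                # c
```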


Status of this document
=======================

ZEP0001 was accepted on May 15th, 2023 via https://github.com/zarr-developers/zarr-specs/issues/227.


Document conventions
====================

Conformance requirements are expressed with a combination of
descriptive assertions and [RFC2119]_ terminology. The key words
"MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD",
"SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in the normative
parts of this document are to be interpreted as described in
[RFC2119]_. However, for readability, these words do not appear in all
uppercase letters in this specification.

All of the text of this specification is normative except sections
explicitly marked as non-normative, examples, and notes. Examples in
22 changes: 22 additions & 0 deletions docs/v3/chunk-key-encodings/index.rst
@@ -0,0 +1,22 @@
.. _chunk-key-encoding-list:

===================
Chunk Key Encodings
===================

The following documents specify chunk key encodings which SHOULD
be implemented by all implementations.

.. toctree::
:glob:
:maxdepth: 1
:titlesonly:
:caption: Contents:

*/*

Extensions
----------

Registered chunk key encoding extensions can be found under
`zarr-extensions::chunk-key-encodings <https://github.com/zarr-developers/zarr-extensions/tree/main/chunk-key-encodings>`_.
71 changes: 71 additions & 0 deletions docs/v3/chunk-key-encodings/v2/index.rst
@@ -0,0 +1,71 @@
.. _v2-chunkkeyencoding:

=====================
v2 chunk key encoding
=====================

Version:
1.0
Specification URI:
https://zarr-specs.readthedocs.io/en/latest/v3/chunk-key-encodings/v2/
Corresponding ZEP:
`ZEP0001 — Zarr specification version 3 <https://zarr.dev/zeps/draft/ZEP0001.html>`_
Issue tracking:
`GitHub issues <https://github.com/zarr-developers/zarr-specs/labels/chunk-grid>`_
Suggest an edit for this spec:
`GitHub editor <https://github.com/zarr-developers/zarr-specs/blob/main/docs/v3/chunk-key-encodings/v2/index.rst>`_

Copyright 2020-Present Zarr core development team. This work
is licensed under a `Creative Commons Attribution 3.0 Unported License
<https://creativecommons.org/licenses/by/3.0/>`_.

----

Description
===========

The ``configuration`` object may contain one optional member,
``separator``, which must be either ``"/"`` or ``"."``. If not specified,
``separator`` defaults to ``"."``.

The identifier for a chunk with at least one dimension is formed by
concatenating for each dimension:

- the ASCII decimal string representation of the chunk index within that
dimension, followed by

- the ``separator`` character, except that it is omitted for the last
dimension.

For example, in a 3 dimensional array, with a separator of ``.`` the identifier
for the chunk at grid index (1, 23, 45) is the string ``"1.23.45"``. With a
separator of ``/``, the identifier is the string ``"1/23/45"``.

For chunk grids with 0 dimensions, the single chunk has the key ``"0"``.
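
The v2 encoding rules above can be sketched as follows. The function name ``encode_v2_chunk_key`` is an illustrative assumption.

```python
def encode_v2_chunk_key(grid_index, separator="."):
    """Build a chunk key using the v2 chunk key encoding."""
    if separator not in ("/", "."):
        raise ValueError('separator must be "/" or "."')
    if len(grid_index) == 0:
        # Zero-dimensional grids have a single chunk with the key "0".
        return "0"
    # Join the decimal indices; no separator follows the last dimension.
    return separator.join(str(i) for i in grid_index)


print(encode_v2_chunk_key((1, 23, 45)))       # 1.23.45
print(encode_v2_chunk_key((1, 23, 45), "/"))  # 1/23/45
print(encode_v2_chunk_key(()))                # 0
```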

.. note::

This encoding is intended only to allow existing v2 arrays to be
converted to v3 without having to rename chunks. It is not recommended
to be used when writing new arrays.


Status of this document
=======================

ZEP0001 was accepted on May 15th, 2023 via https://github.com/zarr-developers/zarr-specs/issues/227.


Document conventions
====================

Conformance requirements are expressed with a combination of
descriptive assertions and [RFC2119]_ terminology. The key words
"MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD",
"SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in the normative
parts of this document are to be interpreted as described in
[RFC2119]_. However, for readability, these words do not appear in all
uppercase letters in this specification.

All of the text of this specification is normative except sections
explicitly marked as non-normative, examples, and notes. Examples in
13 changes: 0 additions & 13 deletions docs/v3/codecs.rst

This file was deleted.
