Commit 0e43784

DOC: Add sharding documentation
1 parent 89508fd commit 0e43784

3 files changed: +55 -2 lines changed

README.md

Lines changed: 5 additions & 1 deletion
@@ -24,7 +24,8 @@ implementation.
 - Reads OME-Zarr v0.1 to v0.5 into simple Python data classes with Dask arrays
 - Optional OME-Zarr data model validation during reading
 - Writes OME-Zarr v0.4 to v0.5
-- Optional writing via [tensorstore](https://google.github.io/tensorstore/)
+- [Sharded Zarr] stores
+- Optional writing via [tensorstore]
 
 ## Documentation
 
@@ -46,3 +47,6 @@ how to contribute can be found in
 
 `ngff-zarr` is distributed under the terms of the
 [MIT](https://spdx.org/licenses/MIT.html) license.
+
+[Sharded Zarr]: https://zarr.dev/zeps/accepted/ZEP0002.html
+[tensorstore]: https://google.github.io/tensorstore/

docs/index.md

Lines changed: 5 additions & 1 deletion
@@ -22,7 +22,8 @@ A lean and kind
 - Reads OME-Zarr v0.1 to v0.5 into simple Python data classes with Dask arrays
 - Optional OME-Zarr data model validation during reading
 - Writes OME-Zarr v0.4 to v0.5
-- Optional writing via [tensorstore](https://google.github.io/tensorstore/)
+- [Sharded Zarr] stores
+- Optional writing via [tensorstore]
 
 ```{toctree}
 :maxdepth: 2
@@ -42,3 +43,6 @@ development.md
 
 apidocs/index.rst
 ```
+
+[Sharded Zarr]: https://zarr.dev/zeps/accepted/ZEP0002.html
+[tensorstore]: https://google.github.io/tensorstore/

docs/python.md

Lines changed: 45 additions & 0 deletions
@@ -201,6 +201,50 @@ also be used.
 
 The multiscales will be computed and written out-of-core, limiting memory usage.
 
+## Write a sharded OME-Zarr store
+
+[Sharded zarr] stores save multiple compressed chunks in a single file or blob.
+This can be useful for large datasets, as it can reduce the number of files in a
+directory.
+
+To generate a sharded OME-Zarr store, pass the `chunks_per_shard` kwarg to
+`to_ngff_zarr`. Sharding requires OME-Zarr version 0.5, which uses the Zarr
+Format Specification 3.
+
+This can be a single integer,
+
+```python
+version = '0.5'
+nz.to_ngff_zarr('lightsheet.ome.zarr',
+                multiscales,
+                chunks_per_shard=2,
+                version=version)
+```
+
+This will use 2 chunks per shard for all dimensions.
+
+Or, specify a tuple of integers for each dimension.
+
+```python
+nz.to_ngff_zarr('lightsheet.ome.zarr',
+                multiscales,
+                chunks_per_shard=(2, 2, 4),
+                version=version)
+```
+
+Or, specify a dictionary of integers for each dimension.
+
+```python
+nz.to_ngff_zarr('lightsheet.ome.zarr',
+                multiscales,
+                chunks_per_shard={'z':4, 'y':2, 'x':2},
+                version=version)
+```
+
+The resulting shard shape will be the product of the chunk shape and the
+`chunks_per_shard` shape. In this case the shard shape will be `(256, 128, 128)`
+for a chunk shape of `(64, 64, 64)`.
+
 ### Writing with Tensorstore
 
 To write with [tensorstore], which may provide better performance, use the
@@ -243,4 +287,5 @@ to_ngff_zarr('cthead1_zarr2.ome.zarr', multiscales, version='0.4')
 [`to_ngff_image`]: ./apidocs/ngff_zarr/ngff_zarr.to_ngff_image.md
 [`to_multiscales`]: ./apidocs/ngff_zarr/ngff_zarr.to_multiscales.md
 [`from_ngff_zarr`]: ./apidocs/ngff_zarr/ngff_zarr.from_ngff_zarr.md
+[Sharded Zarr]: https://zarr.dev/zeps/accepted/ZEP0002.html
 [tensorstore]: https://google.github.io/tensorstore/
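
The shard-shape rule stated in the new `docs/python.md` section (shard shape equals chunk shape multiplied elementwise by `chunks_per_shard`) can be checked with a small standalone sketch. The helper `shard_shape`, its `dims` parameter, the default `('z', 'y', 'x')` ordering, and the fallback of one chunk per shard for dimensions missing from a dictionary are all assumptions made for illustration; none of this is `ngff-zarr`'s own implementation.

```python
from collections.abc import Mapping


def shard_shape(chunk_shape, chunks_per_shard, dims=("z", "y", "x")):
    """Hypothetical helper: multiply each chunk extent by its chunks-per-shard count."""
    if len(chunk_shape) != len(dims):
        raise ValueError("chunk_shape and dims must have the same length")

    if isinstance(chunks_per_shard, int):
        # A single integer applies to every dimension.
        counts = [chunks_per_shard] * len(dims)
    elif isinstance(chunks_per_shard, Mapping):
        # Assumption: a dimension missing from the mapping gets one chunk per shard.
        counts = [chunks_per_shard.get(dim, 1) for dim in dims]
    else:
        # A tuple or list is matched to dims by position.
        counts = list(chunks_per_shard)
        if len(counts) != len(dims):
            raise ValueError("chunks_per_shard must have one entry per dimension")

    return tuple(chunk * count for chunk, count in zip(chunk_shape, counts))


chunk = (64, 64, 64)
print(shard_shape(chunk, 2))                         # (128, 128, 128)
print(shard_shape(chunk, (2, 2, 4)))                 # (128, 128, 256)
print(shard_shape(chunk, {"z": 4, "y": 2, "x": 2}))  # (256, 128, 128)
```

The dictionary case reproduces the `(256, 128, 128)` shard shape quoted in the added documentation. Under the assumed `('z', 'y', 'x')` ordering, the tuple example `(2, 2, 4)` gives a different shard shape, `(128, 128, 256)`, so the three documented calls are not interchangeable.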
