| 1 | +```{eval-rst} |
| 2 | +:tocdepth: 3 |
| 3 | +``` |
| 4 | + |
| 5 | +```{currentModule} mdio.schemas.chunk_grid |
| 6 | + |
| 7 | +``` |
| 8 | + |
| 9 | +# Chunk Grid Models |
| 10 | + |
| 11 | +```{article-info} |
| 12 | +:author: Altay Sansal |
| 13 | +:date: "{sub-ref}`today`" |
| 14 | +:read-time: "{sub-ref}`wordcount-minutes` min read" |
| 15 | +:class-container: sd-p-0 sd-outline-muted sd-rounded-3 sd-font-weight-light |
| 16 | +``` |
| 17 | + |
| 18 | +The variables in the MDIO data model can represent different types of chunk grids. |
| 19 | +These grids are essential for managing multi-dimensional data arrays efficiently. |
| 20 | +In this breakdown, we will explore four distinct data models within the MDIO schema, |
| 21 | +each serving a specific purpose in data handling and organization. |
| 22 | + |
| 23 | +MDIO implements data models following the guidelines of the Zarr v3 spec and ZEPs: |
| 24 | + |
| 25 | +- [Zarr core specification (version 3)](https://zarr-specs.readthedocs.io/en/latest/v3/core/v3.0.html) |
| 26 | +- [ZEP 1 — Zarr specification version 3](https://zarr.dev/zeps/accepted/ZEP0001.html) |
| 27 | +- [ZEP 3 — Variable chunking](https://zarr.dev/zeps/draft/ZEP0003.html) |
| 28 | + |
| 29 | +## Regular Grid |
| 30 | + |
| 31 | +The regular grid models are designed to represent a rectangular and regularly |
| 32 | +spaced chunk grid. |
| 33 | + |
| 34 | +```{eval-rst} |
| 35 | +.. autosummary:: |
| 36 | + RegularChunkGrid |
| 37 | + RegularChunkShape |
| 38 | +``` |
| 39 | + |
| 40 | +For a 1D array with `size = 31`{l=python}, we can divide it into 5 chunks with a nominal |
| 41 | +size of 7. Note that the last chunk will be truncated to 3 elements to match the size of the array. |
| 42 | + |
| 43 | +`{ "name": "regular", "configuration": { "chunkShape": [7] } }`{l=json} |
| 44 | + |
| 45 | +Using the above schema, the resulting array chunks will look like this: |
| 46 | + |
| 47 | +```bash |
| 48 | + ←─ 7 ─→ ←─ 7 ─→ ←─ 7 ─→ ←─ 7 ─→ ↔ 3 |
| 49 | +┌───────┬───────┬───────┬───────┬───┐ |
| 50 | +└───────┴───────┴───────┴───────┴───┘ |
| 51 | +``` |
| 52 | + |
| 53 | +For a 2D array with shape `rows, cols = (7, 17)`{l=python}, we can divide it into 9 |
| 54 | +chunks with a nominal shape of `(3, 7)`{l=python}; chunks on the last row and last column are truncated. |
| 55 | + |
| 56 | +`{ "name": "regular", "configuration": { "chunkShape": [3, 7] } }`{l=json} |
| 57 | + |
| 58 | +Using the above schema, the resulting 2D array chunks will look like below. |
| 59 | +Note that the rows and columns are conceptual and visually not to scale. |
| 60 | + |
| 61 | +```bash |
| 62 | + ←─ 7 ─→ ←─ 7 ─→ ↔ 3 |
| 63 | +┌───────┬───────┬───┐ |
| 64 | +│ ╎ ╎ │ ↑ |
| 65 | +│ ╎ ╎ │ 3 |
| 66 | +│ ╎ ╎ │ ↓ |
| 67 | +├╶╶╶╶╶╶╶┼╶╶╶╶╶╶╶┼╶╶╶┤ |
| 68 | +│ ╎ ╎ │ ↑ |
| 69 | +│ ╎ ╎ │ 3 |
| 70 | +│ ╎ ╎ │ ↓ |
| 71 | +├╶╶╶╶╶╶╶┼╶╶╶╶╶╶╶┼╶╶╶┤ |
| 72 | +│ ╎ ╎ │ ↕ 1 |
| 73 | +└───────┴───────┴───┘ |
| 74 | +``` |
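
The regular grid examples above can be reproduced with a few lines of plain Python. This is a minimal sketch, not part of the MDIO API: `regular_chunk_sizes` is a hypothetical helper that assumes every chunk has the nominal size except a possibly truncated last chunk per dimension.

```python
import math


def regular_chunk_sizes(array_shape, chunk_shape):
    """Return the chunk sizes along each dimension of a regular grid.

    Hypothetical helper for illustration only; it is not an MDIO function.
    """
    sizes = []
    for dim_size, chunk_size in zip(array_shape, chunk_shape):
        n_full, remainder = divmod(dim_size, chunk_size)
        dim_sizes = [chunk_size] * n_full
        if remainder:
            dim_sizes.append(remainder)  # truncated last chunk
        sizes.append(dim_sizes)
    return sizes


sizes_1d = regular_chunk_sizes((31,), (7,))
sizes_2d = regular_chunk_sizes((7, 17), (3, 7))
num_chunks_2d = math.prod(len(dim_sizes) for dim_sizes in sizes_2d)

print(sizes_1d)       # [[7, 7, 7, 7, 3]]
print(sizes_2d)       # [[3, 3, 1], [7, 7, 3]]
print(num_chunks_2d)  # 9
```

The per-dimension lists match the labels on the diagrams: four full chunks plus a remainder of 3 in the 1D case, and a 3 by 3 grid of chunks in the 2D case.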
| 75 | + |
| 76 | +## Rectilinear Grid |
| 77 | + |
| 78 | +The [RectilinearChunkGrid](RectilinearChunkGrid) model extends |
| 79 | +the concept of chunk grids to accommodate rectangular and irregularly spaced chunks. |
| 80 | +This model is useful in data structures where non-uniform chunk sizes are necessary. |
| 81 | +[RectilinearChunkShape](RectilinearChunkShape) specifies the chunk sizes for each |
| 82 | +dimension as a list, allowing for irregular intervals. |
| 83 | + |
| 84 | +```{eval-rst} |
| 85 | +.. autosummary:: |
| 86 | + RectilinearChunkGrid |
| 87 | + RectilinearChunkShape |
| 88 | +``` |
| 89 | + |
| 90 | +:::{note} |
| 91 | +It's important to ensure that the sum of the irregular spacings specified |
| 92 | +in the `chunkShape` matches the size of the respective array dimension. |
| 93 | +::: |
| 94 | + |
| 95 | +For a 1D array with `size = 39`{l=python}, we can divide it into 5 irregularly sized |
| 96 | +chunks. |
| 97 | + |
| 98 | +`{ "name": "rectilinear", "configuration": { "chunkShape": [[10, 7, 5, 7, 10]] } }`{l=json} |
| 99 | + |
| 100 | +Using the above schema, the resulting array chunks will look like this: |
| 101 | + |
| 102 | +```bash |
| 103 | + ←── 10 ──→ ←─ 7 ─→ ← 5 → ←─ 7 ─→ ←── 10 ──→ |
| 104 | +┌──────────┬───────┬─────┬───────┬──────────┐ |
| 105 | +└──────────┴───────┴─────┴───────┴──────────┘ |
| 106 | +``` |
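
The requirement in the note above (chunk sizes must sum to the dimension size) is easy to check before writing a schema. The following is a minimal sketch under stated assumptions: `validate_rectilinear` is a hypothetical helper, not an MDIO function, and the MDIO models may or may not perform a similar check themselves.

```python
def validate_rectilinear(array_shape, chunk_shape):
    """Raise ValueError if the chunk sizes do not tile the array exactly.

    Hypothetical helper for illustration only; it is not an MDIO function.
    """
    if len(array_shape) != len(chunk_shape):
        raise ValueError("chunkShape needs one list of sizes per array dimension")
    for axis, (dim_size, dim_chunks) in enumerate(zip(array_shape, chunk_shape)):
        total = sum(dim_chunks)
        if total != dim_size:
            raise ValueError(
                f"axis {axis}: chunk sizes sum to {total}, expected {dim_size}"
            )


# The 1D example: 10 + 7 + 5 + 7 + 10 == 39, so this passes silently.
validate_rectilinear((39,), [[10, 7, 5, 7, 10]])

# A mismatched size is rejected.
try:
    validate_rectilinear((40,), [[10, 7, 5, 7, 10]])
    mismatch_detected = False
except ValueError:
    mismatch_detected = True
```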
| 107 | + |
| 108 | +For a 2D array with shape `rows, cols = (7, 25)`{l=python}, we can divide it into 12 |
| 109 | +rectilinear (rectangular but irregular) chunks. Note that the rows and columns are |
| 110 | +conceptual and visually not to scale. |
| 111 | + |
| 112 | +`{ "name": "rectilinear", "configuration": { "chunkShape": [[3, 1, 3], [10, 5, 7, 3]] } }`{l=json} |
| 113 | + |
| 114 | +```bash |
| 115 | + ←── 10 ──→ ← 5 → ←─ 7 ─→ ↔ 3 |
| 116 | +┌──────────┬─────┬───────┬───┐ |
| 117 | +│ ╎ ╎ ╎ │ ↑ |
| 118 | +│ ╎ ╎ ╎ │ 3 |
| 119 | +│ ╎ ╎ ╎ │ ↓ |
| 120 | +├╶╶╶╶╶╶╶╶╶╶┼╶╶╶╶╶┼╶╶╶╶╶╶╶┼╶╶╶┤ |
| 121 | +│ ╎ ╎ ╎ │ ↕ 1 |
| 122 | +├╶╶╶╶╶╶╶╶╶╶┼╶╶╶╶╶┼╶╶╶╶╶╶╶┼╶╶╶┤ |
| 123 | +│ ╎ ╎ ╎ │ ↑ |
| 124 | +│ ╎ ╎ ╎ │ 3 |
| 125 | +│ ╎ ╎ ╎ │ ↓ |
| 126 | +└──────────┴─────┴───────┴───┘ |
| 127 | +``` |
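
To see where each rectilinear chunk sits in the array, the per-dimension sizes can be converted into index ranges with a running sum. This is a sketch using only the standard library; `chunk_slices` is a hypothetical helper, not an MDIO function.

```python
from itertools import accumulate, product


def chunk_slices(dim_chunks):
    """Convert the chunk sizes of one dimension into index slices.

    Hypothetical helper for illustration only; it is not an MDIO function.
    """
    stops = list(accumulate(dim_chunks))  # running sum gives each chunk's end
    starts = [0] + stops[:-1]             # each chunk starts where the previous ended
    return [slice(start, stop) for start, stop in zip(starts, stops)]


chunk_shape = [[3, 1, 3], [10, 5, 7, 3]]  # from the 2D example above
row_slices = chunk_slices(chunk_shape[0])
col_slices = chunk_slices(chunk_shape[1])

# Every (row slice, column slice) pair addresses one chunk of the grid.
grid = list(product(row_slices, col_slices))

print(len(grid))  # 12
print(grid[0])    # (slice(0, 3, None), slice(0, 10, None))
print(grid[-1])   # (slice(4, 7, None), slice(22, 25, None))
```

Each pair in `grid` could be used to index a NumPy-style array of shape `(7, 25)` and would select exactly one of the 12 chunks drawn in the diagram.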
| 128 | + |
| 129 | +## Model Reference |
| 130 | + |
| 131 | +:::{dropdown} RegularChunkGrid |
| 132 | +:animate: fade-in-slide-down |
| 133 | + |
| 134 | +```{eval-rst} |
| 135 | +.. autopydantic_model:: RegularChunkGrid |
| 136 | + |
| 137 | +---------- |
| 138 | + |
| 139 | +.. autopydantic_model:: RegularChunkShape |
| 140 | +``` |
| 141 | + |
| 142 | +::: |
| 143 | +:::{dropdown} RectilinearChunkGrid |
| 144 | +:animate: fade-in-slide-down |
| 145 | + |
| 146 | +```{eval-rst} |
| 147 | +.. autopydantic_model:: RectilinearChunkGrid |
| 148 | + |
| 149 | +---------- |
| 150 | + |
| 151 | +.. autopydantic_model:: RectilinearChunkShape |
| 152 | +``` |
| 153 | + |
| 154 | +::: |