add support for top-level custom zarr extensions using json schema references #1
ignoring the question of whether this is permitted by the spec, as that seems like the source of some disagreement, what is the advantage of putting the extensions at the top-level metadata document instead of in a dedicated |
Only because I didn't want to make a breaking change and figured that the Zarr spec was already using top-level keys for extensions. If that's not the case because "extensions" like chunk_type are more "core-like" than "extension-like" then I agree it doesn't really matter where in the node they are placed. The placement of extensions in the node is much less important than linking to explicit JSON Schemas that reference what extensions are included in the node and give readers a way to validate the node against those extension types. |
This is currently not happening. that might have been a source of confusion. |
that text was added recently, and I agree that it's not very clear. sorry about that.
assuming your extension would be associated with an array, does your extension change anything about the process of reading / writing chunks for that array, or does the extension just provide information about the stuff inside the array? In the latter case, you can use the |
It changes how a library like […] I currently have it stored in […] At the end of the day I only care about following Zarr's extension mechanisms as closely as possible so whatever I build is interoperable with the rest of the ecosystem. |
this makes total sense, and I'm sorry that the spec right now isn't clear about how things work. If you do have the time I would find it useful if you could open an issue in the zarr specs repo to express your issues with the way the spec is currently written. I think the spec should be as clear as possible about things like this, so if we are failing there, that's something to fix. |
Big thanks for this work, @geospatial-jeff. 👍 I'll tend towards commenting on the ZEP, partially because I won't get or at least notice notifications from this PR. |
The goal of this PR is to provide an example of how `zarr-python` could be extended to support custom extensions. This PR is not complete; the intent is to inform discussion around ZEP10. There are several high-level goals:

- […] (`chunk_grid`, `data_type`).
- […] `zarr-python` approach of modeling each node as a class.

This PR proposes the addition of the `extension_schemas` key to the Zarr v3 spec. This is a physical key that may be present on any node type and contains an array of JSON Schemas indicating which extensions are present in the node, similar to `stac_extensions` in the STAC spec.

It also proposes the addition of a logical `extensions` key to `zarr-python` which contains all extensions implemented by the node, allowing `zarr-python` to provide consistent access patterns to both custom and official Zarr extensions.

I have not yet updated `ArrayMetadata` to support this; however, I think it's fairly clear how that would work. The only difference is that the array node type currently implements several official Zarr extensions, which would be rolled into the logical `extensions` member for consistent access alongside any other custom extensions.

Importantly, this PR doesn't update `zarr-python` to do anything with these custom extensions besides parsing / validating them. Whether or not a reader such as `zarr-python` does anything with an extension depends on extension maturity, target audience, etc.
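To make the physical `extension_schemas` key and the logical `extensions` member more concrete, here is a rough, dependency-free sketch. It is not the code in this PR and not the `zarr-python` API. The `NodeMetadata` class, the `proj` payload key, the example.com schema URL, and the in-memory `SCHEMA_REGISTRY` are all hypothetical. The sketch also assumes each entry in `extension_schemas` is a URL reference to a schema, and that the logical view is keyed by schema title. The validator checks only `required` and top-level property types; a real reader would presumably use a full JSON Schema validator.

```python
from dataclasses import dataclass, field
from typing import Any

# Hypothetical schema URL and payload key ("proj"); neither comes from the PR.
PROJ_SCHEMA_URL = "https://example.com/zarr-extensions/proj/v1/schema.json"

# Stand-in for resolving schema references; a real reader would fetch or cache them.
SCHEMA_REGISTRY: dict[str, dict[str, Any]] = {
    PROJ_SCHEMA_URL: {
        "title": "example-projection",
        "type": "object",
        "required": ["proj"],
        "properties": {"proj": {"type": "object"}},
    },
}

# The only JSON types this sketch knows how to check.
_JSON_TYPES: dict[str, type] = {"object": dict, "array": list, "string": str}


def validate_subset(document: dict[str, Any], schema: dict[str, Any]) -> None:
    """Check `required` and top-level `properties.<key>.type` only.

    This is NOT full JSON Schema validation; it keeps the sketch dependency-free.
    """
    for key in schema.get("required", []):
        if key not in document:
            raise ValueError(f"missing required key {key!r}")
    for key, prop in schema.get("properties", {}).items():
        expected = _JSON_TYPES.get(prop.get("type", ""))
        if key in document and expected is not None and not isinstance(document[key], expected):
            raise ValueError(f"key {key!r} is not of type {prop['type']!r}")


@dataclass
class NodeMetadata:
    """Hypothetical stand-in for a node metadata class; not the zarr-python API."""

    raw: dict[str, Any]
    # Logical view: one entry per extension declared in `extension_schemas`.
    extensions: dict[str, dict[str, Any]] = field(default_factory=dict)

    @classmethod
    def from_dict(cls, raw: dict[str, Any]) -> "NodeMetadata":
        extensions: dict[str, dict[str, Any]] = {}
        for url in raw.get("extension_schemas", []):
            schema = SCHEMA_REGISTRY[url]  # raises KeyError for unknown references
            validate_subset(raw, schema)
            # Assumption: key the logical view by schema title and collect
            # the top-level keys that the schema declares.
            extensions[schema["title"]] = {
                key: raw[key] for key in schema.get("properties", {}) if key in raw
            }
        return cls(raw=raw, extensions=extensions)


group = NodeMetadata.from_dict(
    {
        "zarr_format": 3,
        "node_type": "group",
        "extension_schemas": [PROJ_SCHEMA_URL],
        "proj": {"crs": "EPSG:4326"},
    }
)
print(group.extensions)
```

In this sketch the node is only parsed and validated; nothing acts on the extension payload, which matches the scope described above. A node that declares the schema but omits the `proj` key fails with a `ValueError` at parse time.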