I've now started developing the inference-vs-persistence plots. I thought I would share my initial ideas for the design of the package, so that we are aligned before I code too much. Please provide any comments/thoughts/feedback! :)
- First of all, I thought this could be the start of a verification package for the mllam community, hence the repo name "mllam-verification".
- I used the repo structure of mllam-dataprep as a starting point.
- I've defined an initial config file example with the following layout (indentation reconstructed; the exact nesting of `variables` and `coord_ranges` under `inputs` is my reading of the flattened original):

```yaml
schema_version: v0.1.0
inputs:
  datasets:
    initial:
      path: /path/to/initial.zarr
    target:
      path: /path/to/target.zarr
    prediction:
      path: /path/to/prediction.zarr
  variables:
    - 2t
    - 10u
  coord_ranges:
    time:
      start: 1990-09-03T00:00
      end: 1990-09-09T00:00
      step: PT3H
methods:
  - global_persistence
  - gridpoint_persistence
output:
  path: /path/to/output/directory
```
- Since I like pydantic, I propose using it for config validation.
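As a rough illustration of what that validation could look like, here is a minimal pydantic sketch of the config above. All class and field names are assumptions on my part, not a proposed final API:

```python
from pydantic import BaseModel


class Dataset(BaseModel):
    path: str


class TimeRange(BaseModel):
    start: str
    end: str
    step: str  # ISO 8601 duration, e.g. "PT3H"


class Inputs(BaseModel):
    datasets: dict[str, Dataset]
    variables: list[str]
    coord_ranges: dict[str, TimeRange] = {}


class Output(BaseModel):
    path: str


class Config(BaseModel):
    schema_version: str
    inputs: Inputs
    methods: list[str]
    output: Output


# Construct a config programmatically; in practice this would come from
# parsing the YAML file (e.g. via yaml.safe_load) and passing the dict in.
config = Config(
    schema_version="v0.1.0",
    inputs=Inputs(
        datasets={"target": Dataset(path="/path/to/target.zarr")},
        variables=["2t", "10u"],
    ),
    methods=["global_persistence"],
    output=Output(path="/path/to/output/directory"),
)
```

Invalid or missing fields would then raise a `ValidationError` at parse time rather than failing somewhere mid-computation.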
Some thoughts related to this structure:
- I was about to call the "target" dataset "truth", but since we might also want to use the package to, e.g., verify one model's inference against another model's inference, I went with "target". So the "target" dataset is what we verify against, and the "prediction" dataset is what we verify.
- I added "coord_ranges" to make it possible to verify only a subset, e.g. in time or space.
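Applying such a range is essentially a coordinate slice. A small sketch with xarray, using made-up data and the time window from the example config:

```python
import numpy as np
import pandas as pd
import xarray as xr

# Toy dataset: 3-hourly values from 1990-09-01 00:00 to 1990-09-09 00:00.
times = pd.date_range("1990-09-01", periods=65, freq="3h")
ds = xr.Dataset(
    {"2t": ("time", np.arange(65.0))},
    coords={"time": times},
)

# Restrict verification to the configured coord_ranges time window.
subset = ds.sel(time=slice("1990-09-03T00:00", "1990-09-09T00:00"))
```

Spatial subsetting would work the same way via additional entries in `coord_ranges` mapped to further `.sel` arguments.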
- I propose that we use the same setup as we agreed upon for the statistics calculation in mllam-dataprep (see mllam-data-prep#42, "Add support for writing more composite statistics (e.g. grid-point based mean of time-step differences)"). That is, we define which verification methods we want to calculate in the "methods" section. When parsing the config, we verify that those methods can be imported from within the package; if they cannot, the script fails.
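The import check at parse time could be as simple as the following sketch; the module path `mllam_verification.methods` is a placeholder for wherever the methods end up living:

```python
import importlib


def validate_methods(methods, module_name="mllam_verification.methods"):
    """Fail early if any configured method is not defined in the package.

    `module_name` is a hypothetical location; adjust to the real module.
    """
    module = importlib.import_module(module_name)
    missing = [m for m in methods if not hasattr(module, m)]
    if missing:
        raise ValueError(f"Unknown verification methods: {missing}")
```

This keeps the config purely declarative while still catching typos like `gridpoint_persistance` before any data is loaded.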
- I thought we would be interested not only in saving plots to disk, but also the datasets with the verification metrics. For now I've just added a "path" parameter to the "output" section, so plots and verification datasets are saved to the same path. We can elaborate on this if needed, e.g. which variables to save.
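For concreteness, here is a sketch of what one of the "methods" entries could compute: a `global_persistence`-style RMSE, where the persistence forecast holds the initial state fixed and the error is averaged over all grid points per lead time. The function name matches the config entry, but the signature and internals are assumptions:

```python
import numpy as np
import xarray as xr


def global_persistence(ds_initial, ds_target, variable):
    """RMSE of the persistence forecast against the target, per lead time.

    Hypothetical signature: datasets as in the config's "initial"/"target"
    entries, plus one variable name from the "variables" list.
    """
    # Persistence forecast: repeat the initial field at every target time.
    persistence = ds_initial[variable].broadcast_like(ds_target[variable])
    err = persistence - ds_target[variable]
    # Average over everything except time -> one RMSE value per lead time.
    spatial_dims = [d for d in err.dims if d != "time"]
    return np.sqrt((err ** 2).mean(dim=spatial_dims))
```

A `gridpoint_persistence` variant would simply skip the spatial mean and return a field per lead time, which maps naturally onto the two plot types.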