Testing integration of daops as backend for cmip6_preprocessing #55

@jbusecke

Hi everyone,

First, I want to say thank you for all the effort that has already gone into this framework. I would love to contribute to daops and integrate it more into my workflow.

I maintain cmip6_preprocessing and am very interested in migrating some of the things I fix there (in a quite ad hoc fashion for now) to a more general form over here.

My primary goal for cmip6_preprocessing is to use it with Python and the scientific Pangeo stack, but I like the idea of documenting the actual problems (the ones needing 'fixes') in a general and language-agnostic way over here. I was very impressed by the demo @agstephens gave a while ago during the CMIP6 cloud meeting and am now thinking of finally getting to work on this.

I am still really unsure how to actually contribute fixes to this repo, though. What I propose is to work my way through the process using some quite simple fixes that are relatively easy to apply and are already documented in errata.

Specifically, I am currently testing this Python code, which changes some of the metadata needed to determine the point in time at which a dataset was branched off from the parent model run.

def fix_metadata_issues(ds):
    # https://errata.es-doc.org/static/view.html?uid=2f6b5963-f87e-b2df-a5b0-2f12b6b68d32
    if ds.attrs["source_id"] == "GFDL-CM4" and ds.attrs["experiment_id"] in [
        "1pctCO2",
        "abrupt-4xCO2",
        "historical",
    ]:
        ds.attrs["branch_time_in_parent"] = 91250
    # https://errata.es-doc.org/static/view.html?uid=61fb170e-91bb-4c64-8f1d-6f5e342ee421
    if ds.attrs["source_id"] == "GFDL-CM4" and ds.attrs["experiment_id"] in [
        "ssp245",
        "ssp585",
    ]:
        ds.attrs["branch_time_in_child"] = 60225
    return ds

This function ingests an xarray.Dataset, checks certain conditions on its attributes, and then overwrites attributes accordingly. I could easily parse these out into dataset-specific 'fixes'.
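
To make the idea of parsing these out more concrete, here is a minimal sketch of the same two errata written as declarative records plus a small generic function that applies them. The record layout (`errata`, `match`, `set`) and the names `ATTR_FIXES` and `apply_attr_fixes` are hypothetical and made up for illustration; this is not the daops fix format, which is exactly what I am asking about. The function works on a plain attribute dictionary so that it runs without xarray; with a dataset one would pass `ds.attrs`.

```python
import json

# Hypothetical, JSON-serializable fix records (NOT the daops schema).
# Each record lists the attribute values a dataset must match and the
# attributes to overwrite when it does.
ATTR_FIXES = [
    {
        "errata": "https://errata.es-doc.org/static/view.html?uid=2f6b5963-f87e-b2df-a5b0-2f12b6b68d32",
        "match": {
            "source_id": ["GFDL-CM4"],
            "experiment_id": ["1pctCO2", "abrupt-4xCO2", "historical"],
        },
        "set": {"branch_time_in_parent": 91250},
    },
    {
        "errata": "https://errata.es-doc.org/static/view.html?uid=61fb170e-91bb-4c64-8f1d-6f5e342ee421",
        "match": {
            "source_id": ["GFDL-CM4"],
            "experiment_id": ["ssp245", "ssp585"],
        },
        "set": {"branch_time_in_child": 60225},
    },
]


def apply_attr_fixes(attrs, fixes=ATTR_FIXES):
    """Return a copy of `attrs` with every matching fix applied.

    Matching is done against the original attributes, so one fix
    cannot trigger another. Missing attributes simply do not match.
    """
    fixed = dict(attrs)
    for fix in fixes:
        matches = all(
            attrs.get(key) in allowed for key, allowed in fix["match"].items()
        )
        if matches:
            fixed.update(fix["set"])
    return fixed


# The records survive a JSON round trip, so they could be stored and
# read outside of Python.
assert json.loads(json.dumps(ATTR_FIXES)) == ATTR_FIXES
```

With an xarray dataset this could be used as `ds.attrs = apply_attr_fixes(ds.attrs)`. Keeping the conditions and values as data rather than code is what would let the same fix be documented once and applied from other tools.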

Where exactly would I translate this into a fix within the daops framework? I am very happy to start a PR (and then test the implementation from cmip6_preprocessing), but I am afraid I am still a bit unsure about the daops internals. Any pointers would be greatly appreciated.
