Update catalogs for xscen #68

@aulemahal

Original issue:
intake-esm uses cat.esmcat.aggregation_control.variable_column_name to identify the column that lists the variables contained in each entry. When it is set, a search such as cat.search(variable=['tas']).to_dataset_dict() returns only tas, and not the other variables stored in the same files. It also makes it possible to use the DerivedVariableRegistry, with which we can convert data on the fly (e.g. dtr from tasmin and tasmax).
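
For illustration, here is a minimal sketch of what a catalog descriptor with this field set could look like, written as a Python dict and dumped to JSON. The layout follows my understanding of the ESM collection specification; the column names (variable, dataset_id, path), the file name and the id are placeholders, not the actual PAVICS values.

```python
import json

# Hypothetical descriptor: column names, id and file name are placeholders.
# Only aggregation_control.variable_column_name comes from the issue itself.
descriptor = {
    "esmcat_version": "0.1.0",
    "id": "example-catalog",
    "description": "Illustrative catalog descriptor",
    "catalog_file": "example-catalog.csv",
    "attributes": [
        {"column_name": "variable"},
        {"column_name": "dataset_id"},
    ],
    # Column holding the asset location and the format of the assets.
    "assets": {"column_name": "path", "format": "netcdf"},
    "aggregation_control": {
        # Column listing the variables available in each entry.
        "variable_column_name": "variable",
        # Entries sharing these column values end up in the same dataset.
        "groupby_attrs": ["dataset_id"],
        # Variables from the same group are merged into one dataset.
        "aggregations": [
            {"type": "union", "attribute_name": "variable"},
        ],
    },
}

print(json.dumps(descriptor, indent=2))
```

With a descriptor along these lines, the variable column is what a search on variable would filter against.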

The current intake-esm (even on master) breaks if aggregation_control is not given, and a fix is also needed to support OpenDAP links. But even if those are fixed (see my PR on intake-esm), we are excluding the PAVICS catalog from useful features by not setting this field.

EDIT: Two PRs on intake-esm have been made:

  • Aggregation control is now optional
  • format == "opendap" is supported

However, the current catalogs have a few other caveats.

Dataset "ID"

intake-esm won't build the dataset "keys" from columns that contain both NaNs and values. When aggregation_control is not given, keys are built by concatenating all columns. Thus, to work with intake-esm, the PAVICS catalogs must have a value for every entry in every column. This isn't true for the "biasadjusted" catalog, for example, where driving_institution is empty for some datasets. We therefore need either to fill the columns or to add another column acting as a complete dataset id (as xscen does). The current dataset_id does not contain the driving_institution information, so if it is used as the key, intake will receive multiple assets without knowing how to merge them.
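
To make the problem concrete, here is a small pure-Python sketch. It is a simplified illustration, not intake-esm's actual key-building code: the rows, the institution column, the "." separator and the "NA" fill value are all made up, and only driving_institution and dataset_id are column names taken from the catalogs.

```python
from collections import defaultdict

# Hypothetical catalog rows; None stands for an empty (NaN) cell.
rows = [
    {"dataset_id": "ds-A", "institution": "inst1",
     "driving_institution": "drvX", "path": "a1.ncml"},
    {"dataset_id": "ds-A", "institution": "inst1",
     "driving_institution": "drvY", "path": "a2.ncml"},
    {"dataset_id": "ds-B", "institution": "inst2",
     "driving_institution": None, "path": "b.ncml"},
]


def partially_filled_columns(rows):
    """Return the columns that hold both missing and non-missing values."""
    columns = sorted({name for row in rows for name in row})
    return [
        name
        for name in columns
        if any(row.get(name) is None for row in rows)
        and any(row.get(name) is not None for row in rows)
    ]


def group_by_key(rows, key_columns, fill="NA"):
    """Group asset paths under a key made by joining the given columns."""
    groups = defaultdict(list)
    for row in rows:
        parts = [fill if row[c] is None else str(row[c]) for c in key_columns]
        groups[".".join(parts)].append(row["path"])
    return dict(groups)


# Columns that would need filling before they can be part of a key.
mixed = partially_filled_columns(rows)

# Keying on dataset_id alone puts two distinct assets under "ds-A".
by_id = group_by_key(rows, ["dataset_id"])

# A filled driving_institution in the key separates them again.
by_full = group_by_key(rows, ["dataset_id", "driving_institution"])

print(mixed)
print(by_id)
print(by_full)
```

In this toy example, keying on dataset_id alone groups a1.ncml and a2.ncml together, which is the situation where intake receives several assets for one key. Either filling driving_institution or adding a complete id column avoids it.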

xscen

Overall, the catalogs are not easy to work with in xscen. AFAIU, the current catalogs are not used by anyone? If so, I suggest we adopt the xscen vocabulary (column names), which would allow easier interaction without losing human-readable information. This might require some complex attribute parsing though, as the NcML files on PAVICS might not carry those attributes as-is.

Labels: enhancement, question
