
Validate granules using UMM_JSON metadata from CMR #2737

@jtherrmann


Jira: https://asfdaac.atlassian.net/browse/TOOL-3662

Note: The above link is accessible only to members of ASF.


The _get_cmr_metadata function in hyp3_api.validation currently queries https://cmr.earthdata.nasa.gov/search/granules.json, as defined by CMR_URL, and then filters the fields included for each granule before returning the metadata.

Recently #2739 motivated us to consider querying https://cmr.earthdata.nasa.gov/search/granules.umm_json instead, to get the UMM_JSON metadata, which includes more information than the JSON metadata.

@asjohnston-asf also points out that using umm_json would simplify some of our other validators as well:

having access to the additional attributes simplifies check_same_relative_orbits via PATH_NUMBER and check_single_burst_pair via POLARIZATION, and the new check_rtc_static_coverage via BURST_ID_FULL
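As a rough illustration of how a validator might read those attributes, here is a minimal sketch. It assumes each umm_json item exposes an AdditionalAttributes list of Name/Values pairs, which is my understanding of the UMM-G schema but should be verified against a real CMR response. The helper name get_additional_attribute and the sample values are hypothetical, not part of hyp3_api.

```python
from typing import Any


def get_additional_attribute(umm: dict[str, Any], name: str) -> list[str]:
    """Return the values of the named additional attribute, or [] if absent.

    Hypothetical helper; assumes the UMM-G layout
    {'AdditionalAttributes': [{'Name': ..., 'Values': [...]}, ...]}.
    """
    for attribute in umm.get('AdditionalAttributes', []):
        if attribute.get('Name') == name:
            return list(attribute.get('Values', []))
    return []


# Hand-written sample for illustration only, not real CMR output.
sample_umm = {
    'GranuleUR': 'EXAMPLE_GRANULE',
    'AdditionalAttributes': [
        {'Name': 'PATH_NUMBER', 'Values': ['87']},
        {'Name': 'POLARIZATION', 'Values': ['VV']},
    ],
}

path_number = get_additional_attribute(sample_umm, 'PATH_NUMBER')
burst_id = get_additional_attribute(sample_umm, 'BURST_ID_FULL')  # not present in the sample
```

Returning an empty list for a missing attribute lets each validator decide whether absence is an error, rather than raising inside the lookup.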

The conversion shouldn't be too difficult, but unfortunately the fields available in umm_json are not a superset of the fields available in json, so at a minimum _get_cmr_metadata will need to filter for different fields. For example, this is what that function currently returns (as of v10.4.1):

    return [
        {
            'name': entry.get('producer_granule_id', entry.get('title')),
            'polygon': Polygon(_format_points(entry['polygons'][0][0])),
        }
        for entry in response.json()['feed']['entry']
    ]

So we'll have to find the equivalents of producer_granule_id, title, and polygons in the umm_json. That said, I'd lean toward removing the custom field filtering logic from _get_cmr_metadata and instead passing the full umm_json CMR item for each granule to the validators, so that _get_cmr_metadata doesn't have to decide, for example, between producer_granule_id and title for the granule name.
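To make the mapping concrete, here is a sketch of where those three fields might live in a umm_json item. It assumes the UMM-G layout as I understand it: GranuleUR as the counterpart of title, a DataGranule.Identifiers entry with IdentifierType 'ProducerGranuleId' as the counterpart of producer_granule_id, and SpatialExtent.HorizontalSpatialDomain.Geometry.GPolygons as the counterpart of polygons. These paths should be confirmed against real CMR responses before relying on them. The function names and sample item are hypothetical.

```python
from typing import Any


def get_granule_name(umm: dict[str, Any]) -> str:
    """Prefer the producer granule ID, falling back to GranuleUR.

    Mirrors entry.get('producer_granule_id', entry.get('title')),
    under the assumed UMM-G field mapping.
    """
    identifiers = umm.get('DataGranule', {}).get('Identifiers', [])
    for identifier in identifiers:
        if identifier.get('IdentifierType') == 'ProducerGranuleId':
            return identifier['Identifier']
    return umm['GranuleUR']


def get_polygon_points(umm: dict[str, Any]) -> list[tuple[float, float]]:
    """Return the first polygon boundary as (longitude, latitude) pairs."""
    geometry = umm['SpatialExtent']['HorizontalSpatialDomain']['Geometry']
    points = geometry['GPolygons'][0]['Boundary']['Points']
    return [(point['Longitude'], point['Latitude']) for point in points]


# Hand-written sample item for illustration only, not real CMR output.
sample_item = {
    'meta': {'concept-id': 'EXAMPLE-CONCEPT-ID'},
    'umm': {
        'GranuleUR': 'EXAMPLE_UR',
        'DataGranule': {
            'Identifiers': [
                {'Identifier': 'EXAMPLE_PRODUCER_ID', 'IdentifierType': 'ProducerGranuleId'},
            ],
        },
        'SpatialExtent': {
            'HorizontalSpatialDomain': {
                'Geometry': {
                    'GPolygons': [
                        {
                            'Boundary': {
                                'Points': [
                                    {'Longitude': -120.0, 'Latitude': 35.0},
                                    {'Longitude': -119.0, 'Latitude': 35.0},
                                    {'Longitude': -119.0, 'Latitude': 36.0},
                                    {'Longitude': -120.0, 'Latitude': 36.0},
                                    {'Longitude': -120.0, 'Latitude': 35.0},
                                ],
                            },
                        },
                    ],
                },
            },
        },
    },
}

name = get_granule_name(sample_item['umm'])
points = get_polygon_points(sample_item['umm'])
```

If the full umm_json item is passed to the validators as proposed, helpers like these could live next to the validators that need them, and the (longitude, latitude) pairs could be handed to shapely's Polygon wherever a geometry is required.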

Even though it's probably a fairly straightforward refactor, I'm not confident in our ability to do this without introducing regression bugs for our existing job validators, especially since we don't have comprehensive integration tests that actually query CMR. So I would consider this issue blocked by #2740.
