Skip to content

Conversation

@sudha-murthy
Copy link
Contributor

@sudha-murthy sudha-murthy commented Oct 17, 2025

Description

This corrects the edge cases where the spatially subsetted outputs in LAEA projected grids may produce outputs beyond the granule spatial extents.

Jira Issue ID

DAS-2424

Local Test Steps

  • Fetch the local branch and run
    bin/build-image & bin/build-test and bin/run-test
    All the unit tests should pass.
  1. Compare the behavior of the following test case in UAT versus in harmony in a box
    https://harmony.uat.earthdata.nasa.gov/C1268399578-EEDTEST/ogc-api-coverages/1.0.0/collections/parameter_vars/coverage/rangeset?forceAsync=true&granuleId=G1268399648-EEDTEST&subset=lat(-78.65179%3A-58.78991)&subset=lon(-89.55597%3A-40.16022)&variable=Soil_Moisture_Retrieval_Data_AM%2Fsurface_flag&variable=Soil_Moisture_Retrieval_Data_PM%2Fsurface_flag_pm&variable=Soil_Moisture_Retrieval_Data_Polar_AM%2Fsurface_flag&variable=Soil_Moisture_Retrieval_Data_Polar_PM%2Fsurface_flag_pm&skipPreview=true
    The above test case would show valid spatial outputs for the global grids and incorrect spatial outputs for the polar grids
ValidSpatialOutputGlobalGrid IncorrectSpatialOutput
  1. Re-run the request with harmony in a box with just hoss and without mask fill or the annotator.
    http://localhost:3000/C1268399578-EEDTEST/ogc-api-coverages/1.0.0/collections/parameter_vars/coverage/rangeset?forceAsync=true&granuleId=G1268399648-EEDTEST&subset=lat(-78.65179%3A-58.78991)&subset=lon(-89.55597%3A-40.16022)&variable=Soil_Moisture_Retrieval_Data_AM%2Fsurface_flag&variable=Soil_Moisture_Retrieval_Data_PM%2Fsurface_flag_pm&variable=Soil_Moisture_Retrieval_Data_Polar_AM%2Fsurface_flag&variable=Soil_Moisture_Retrieval_Data_Polar_PM%2Fsurface_flag_pm&skipPreview=true
    The outputs in Panoply should not show any values in the area outside the requested spatial request or outside the granule extent.
    The polar grid plots in panoply should not have the invalid spatial output show above in UAT. There is no real image displayed since the values are all fill.
    If you check the hoss logs. The index ranges for the polar grid will be close to edge of the grid.
    variables_with_ranges:
    /Soil_Moisture_Retrieval_Data_Polar_AM/longitude[1927:1999][0:72], /Soil_Moisture_Retrieval_Data_Polar_AM/surface_flag[1927:1999][0:72], /Soil_Moisture_Retrieval_Data_Polar_PM/surface_flag_pm[1927:1999][0:72], /Soil_Moisture_Retrieval_Data_Polar_PM/latitude_pm[1927:1999][0:72], /Soil_Moisture_Retrieval_Data_Polar_AM/latitude[1927:1999][0:72], /Soil_Moisture_Retrieval_Data_Polar_PM/longitude_pm[1927:1999][0:72],

  2. Re-run the request with harmony in a box with mask fill and annotator.
    http://localhost:3000/C1268399578-EEDTEST/ogc-api-coverages/1.0.0/collections/parameter_vars/coverage/rangeset?forceAsync=true&granuleId=G1268399648-EEDTEST&subset=lat(-78.65179%3A-58.78991)&subset=lon(-89.55597%3A-40.16022)&variable=Soil_Moisture_Retrieval_Data_AM%2Fsurface_flag&variable=Soil_Moisture_Retrieval_Data_PM%2Fsurface_flag_pm&variable=Soil_Moisture_Retrieval_Data_Polar_AM%2Fsurface_flag&variable=Soil_Moisture_Retrieval_Data_Polar_PM%2Fsurface_flag_pm&skipPreview=true

The request will succeed in hoss, but will fail in mask fill due to lat/lon being all fill values to be addressed in DAS-2399
WorkItem failed: sds/maskfill-harmony:latest: MaskFill failed with error: /Soil_Moisture_Retrieval_Data_Polar_AM/longitude or /Soil_Moisture_Retrieval_Data_Polar_AM/latitude have no valid data.

PR Acceptance Checklist

  • [ X] Jira ticket acceptance criteria met.
  • [ X] CHANGELOG.md updated to include high level summary of PR changes.
  • [ X] docker/service_version.txt updated if publishing a release.
  • [X ] Tests added/updated and passing.
  • Documentation updated (if needed).

@sudha-murthy sudha-murthy requested review from a team and D-Auty October 17, 2025 18:25
- Updates determination of projected min-max in the bounding box calculations to correct
edge cases where the area outside the granules gets included for lambert-azimuthal
grid projections. An exception is thrown for invalid spatial extent ranges.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • Updates evaluation of bbox or polygon constraint to exclude areas outside the target grid (projected). A warning exception occurs if spatial constraint is entirely outside grid - No data found.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The warning exception change has not been implemented in this ticket. It is a future merge.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An error exception... is more correct for now then. Thanks.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated - 7cab64e

'y_min': np.min(y_values),
'y_max': np.max(y_values),
}
# Remove any points that are outside the grid and are invalid before
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could the min/max deletion logic be reused in another method in the future? If so, I’d suggest moving it into its own helper function

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure. I think we could move it when we have another use case that would re-use it?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, It should be a separate function, because the test are hard to understand. If this was a separate function you could write clear tests that describe exactly what's going on in this part of the code below and we could verify that it works without having to trust

This all looks very inefficient. You're calling np.min() np.max() again and again on the same values in here.

Also np.delete creates a new array each time it's called. so you're doing that 8 times in here. could this be simplified somehow?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The repeated calling of np.min() and np.max(). - will fix
np.delete..- let me see if I can create a mask and delete

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think if you have good tests around that part first, it will be easier to refactor the code to be more efficient.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated - 93198a6

Copy link
Contributor

@joeyschultz joeyschultz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good. The unit tests all passed. I compared the output of the UAT request vs the localhost request (without maskfill) and see the difference in the subset area. I requested that Sudha update the test instructions to help me confirm the subset area is correct.

Once the comments below are addressed and I can confirm the subset area of the new output, I'll be able to mark this as approved.

Copy link
Member

@flamingbear flamingbear left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like to see the logic portion separated into a new function and tests so that it's easier to understand.

'y_min': np.min(y_values),
'y_max': np.max(y_values),
}
# Remove any points that are outside the grid and are invalid before
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, It should be a separate function, because the test are hard to understand. If this was a separate function you could write clear tests that describe exactly what's going on in this part of the code below and we could verify that it works without having to trust

This all looks very inefficient. You're calling np.min() np.max() again and again on the same values in here.

Also np.delete creates a new array each time it's called. so you're doing that 8 times in here. could this be simplified somehow?

Comment on lines 400 to 405
expected_output = {
'x_min': -12702459.818865139,
'x_max': 12702459.818865139,
'y_min': -12702440.623773243,
'y_max': 12702440.710450241,
'x_min': -8982000,
'x_max': 8982000,
'y_min': -8982000,
'y_max': 8982000,
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why did these change?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is the fix. The ranges > 900000 are outside the grid.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So what happens with values that are outside the grid?

what should happen with?

     x_values = np.linspace(-9200000, 9200000, 500)
     y_values = np.linspace( 9200000, -9200000, 500)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would remove points outside that range +/- 9200000

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

did you try it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't check if they are valid. If the return from the transform is infinity or NaNs , I am assuming they are not valid for that crs.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But I can add a check

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+/- 9000000/2000 is the master geotransform for this specific granule, If we use different values, not sure what we accomplish in the test - other than that we passed in incorrect dimensions that don't match the granule?
But I can add a check for NaNs and infinity and raise a proper exception if no values are valid.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, I think you should just raise an error if any of those grid_lats grid_lons are Nan and be done.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated - af686f7

Comment on lines 1195 to 1196
x_values = np.linspace(-9000000, 9000000, 2000)
y_values = np.linspace(9000000, -9000000, 2000)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why did you choose these values? are they important to the test?
I can change the values and the tests still pass.

x_values = np.linspace(-9000000, 9802000, 2000)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

those 2 parameters were added to the function. Not sure how it would build without those values...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+/-9000000 is the extent of the SPL3SMAP_E Polar EASE Grid

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why did you choose these values? are they important to the test? I can change the values and the tests still pass.

x_values = np.linspace(-9000000, 9802000, 2000)

It is probably passing because the code only checks for min.max for the extents. If you passed in a smaller extent, the test will fail.

Comment on lines +443 to +448
expected_output = {
'x_min': -8993061.78423412,
'x_max': -8350580.505440015,
'y_min': -8997181.591145469,
'y_max': -8354987.361637551,
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

again this is the problem with mixing the logic of projections with clipping in the code. I can't guess why these numbers are here. Can you separate out the logic into a new function and write tests around that?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do need to check the extents which is the returned from this function. Let me revisit it after I break down function..

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Those are the extents that would be returned if the given bbox is converted to LAEA projection and all the points outside the granule extent (from the dimension scales) is removed.

Copy link
Contributor

@joeyschultz joeyschultz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The updates look good. I have a few suggestions below that I think could improve the readability.

Copy link
Member

@flamingbear flamingbear left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is much better, I like the functions that do one thing, and it makes the tests much easier for me to understand. Thanks.

I think you should fix the return from get_x_y_extents_from_geographic_perimeter though.

And I think there are some better names you could choose.

Comment on lines 400 to 405
expected_output = {
'x_min': -12702459.818865139,
'x_max': 12702459.818865139,
'y_min': -12702440.623773243,
'y_max': 12702440.710450241,
'x_min': -8982000,
'x_max': 8982000,
'y_min': -8982000,
'y_max': 8982000,
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So what happens with values that are outside the grid?

what should happen with?

     x_values = np.linspace(-9200000, 9200000, 500)
     y_values = np.linspace( 9200000, -9200000, 500)

'y_min': np.min(y_values),
'y_max': np.max(y_values),
}
# Remove any points that are outside the grid and are invalid before
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think if you have good tests around that part first, it will be easier to refactor the code to be more efficient.

Copy link
Contributor

@joeyschultz joeyschultz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the updates Sudha. I re-ran the unit tests and they all passed:

====================== 156 passed, 353 warnings in 4.56s =======================

I also ran the request in HIAB, confirmed the expected index ranges in the HOSS logs, and that the outputs in panoply no longer have the invalid values.

Comment on lines 462 to 463
print(f'outputs={outputs}')
print(f'expected_output={expected_output}')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can delete these print statments?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done - da880a3

with self.assertRaises(InvalidGranuleDimensions):
x_values1 = np.linspace(-9200000, 9200000, 500)
y_values1 = np.linspace(9200000, -9200000, 500)
get_projected_x_y_extents(x_values1, y_values1, crs, bounding_box=bbox)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice. 👍

Copy link
Member

@flamingbear flamingbear left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that does it. Thanks for hanging in there with all the changes. I think this is in better shape.

@sudha-murthy sudha-murthy merged commit c6e82c0 into main Oct 21, 2025
4 checks passed
@sudha-murthy sudha-murthy deleted the DAS-2424 branch October 21, 2025 21:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants