Skip to content

Conversation

@pjhartzell
Copy link
Collaborator

@pjhartzell pjhartzell commented Jan 7, 2025

Related Issue(s):

Proposed Changes:

Minimum implementation of the filter extension.

  • Basic CQL2
  • CQL2 JSON (no CQL Text)
  • Available to the /search (GET and POST), /collections/{collectionId}/items (GET), and /aggregate (GET) endpoints

PR Checklist:

  • I have added my changes to the CHANGELOG or a CHANGELOG entry is not required.

@pjhartzell pjhartzell marked this pull request as ready for review January 14, 2025 14:49
@philvarner philvarner self-requested a review January 14, 2025 15:45
@rkm-ursa
Copy link

The current filter extension behavior requires using the properties prefix. I think we should omit this because it leads to a divergence between the advertised queryables as used by the search vs the filter extensions (e.g. search would use myField and filter would require properties.Myfield).

I think we should provide special handling for the subset of non-property fields that are available for querying on items (id, geometry and collection) and all other properties should be considered to reside under the item properties implicitly and thus not require a prefix when using the filter extension.

This maintains consistency between the two methods for searching and matches the examples given by the filter extension spec.

@philvarner
Copy link
Collaborator

I think this comment/suggestion addresses @rkm-ursa 's concern also. I agree that users shouldn't be required to prefix all of their terms with properties..

I think the piece that's missing here is including in the queryables specification a mapping from the queryable terms to the document fields, e.g., mapping geometry to geometry; mapping cloud_cover to `properties.eo:cloud_cover. As it is now, it requires that users configure their queryables to be the same as the document fields (e.g., the STAC Item fields). The intention of Queryables in CQL2 is to decouple the actual documents / Items from the terms used to query them.

For example, in the implementation in this PR, a properly-specified queryables schema that supports filtering on cloud cover would look like:

{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "$id": "https://example.com/v1/collections/foo/queryables",
  "type": "object",
  "title": "Queryables for Collection foo",
  "properties": {
    "properties.eo:cloud_cover": {
      "$ref": "https://stac-extensions.github.io/eo/v1.0.0/schema.json#/definitions/fields/properties/eo:cloud_cover"
    }
  },
  "additionalProperties": true
}

instead of the (better) property definition of:

    "cloud_cover": {
      "$ref": "https://stac-extensions.github.io/eo/v1.0.0/schema.json#/definitions/fields/properties/eo:cloud_cover"
    }

One possibility is to specify this in the schema like:

    "cloud_cover": {
      "$ref": "https://stac-extensions.github.io/eo/v1.0.0/schema.json#/definitions/fields/properties/eo:cloud_cover",
      "mapping": "properties.eo:cloud_cover"
    }

and remove it before serving the /queryables endpoint so that it's valid JSON Schema (since JSON Schema doesn't allow arbitrary extra fields OOTB).

Or, there could just be a separate mapping document attached to the Collection like there is for the queryables that has only the mappings in it, e.g.,:

{
  "type": "Collection",
  "id": "sentinel-2-c1-l2a",
  "queryables": { ... },
  "queryable_term_mappings": {
    "geometry": "geometry"
    "cloud_cover": "properties.eo:cloud_cover"
  }
...
}

I think there also needs to be checking of the terms if the queryables JSON Schema has "additionalProperties": true, so that only the terms defined are used in the expression.

From OGC API Part 3: Filtering:

The additionalProperties member with a value of true or false is used to state the behavior with respect to properties that are not explicitly declared in the queryables schema.

If additionalProperties is missing or has the default value true, any property name is valid in a filter expression and the property reference SHALL evaluate to null, if the property does not exist for a resource.

If additionalProperties is set to false, property references that are not explicitly declared in the queryables schema SHALL result in a 400 response.

@pjhartzell
Copy link
Collaborator Author

pjhartzell commented Jan 15, 2025

@philvarner In response to your comment:

I think this comment/suggestion addresses @rkm-ursa 's concern also. I agree that users shouldn't be required to prefix all of their terms with properties..

Understood. I had initially implemented it this way and have no issue with reverting.

I think the piece that's missing here is including in the queryables specification a mapping from the queryable terms to the document fields...

Yes. I'm going to go with your first suggestion, which is to include the mappings directly in the queryable schema for each property and remove them before serving the queryable endpoint.

I think there also needs to be checking of the terms if the queryables JSON Schema has "additionalProperties": true, so that only the terms defined are used in the expression.

Yes. I'll implement that.

UPDATE on 1/16/24: Striking the conversation about queryable mappings and property validation if additionalProperties is false. As discussed with @philvarner offline, we will defer queryable mapping and property validation until the filter extension is more fully implemented in subsequent PRs.

@pjhartzell
Copy link
Collaborator Author

pjhartzell commented Jan 15, 2025

@philvarner Each collection defines its own queryables (and mappings). What is the intended source for the collections from which to pull the queryables (and mappings)?
1. The collections array from item search?
2. Any collection fields specified in the cql2 filter?
3. The combination of items 1 and 2 above?

Regardless of the choice, do we use a union or intersection of queryables when multiple collections are specified?

UPDATE on 1/16/24: Striking this comment about how to gather collections for the purpose of building up the queryable mappings. As discussed with @philvarner offline, we will defer queryable mapping and property validation until the filter extension is more fully implemented in subsequent PRs.

@pjhartzell pjhartzell closed this Jan 15, 2025
@pjhartzell pjhartzell reopened this Jan 16, 2025
@pjhartzell
Copy link
Collaborator Author

pjhartzell commented Jan 16, 2025

@philvarner Another question. While I don't mind implementing some "advanced" queryable functionality if we are able to come to a reasonable answer to my prior question, are we getting ahead of ourselves? The filter extension recommends starting with:

It recommended that implementers start with fully implementing only a subset of functionality. A good place to start is implementing only the Basic CQL2 conformance class of logical and comparison operators, defining a static Queryables schema with no queryables advertised and the additionalProperties field set to true, and only implementing CQL2 JSON.

...which is what this PR currently provides (less the forthcoming change to remove the requirement for fully qualified property names, i.e., remove the requirement for the properties prefix).

My intent was to follow up with another PR with the next recommended action:

Following from that can be support for S_INTERSECTS, defining a static Queryables schema with only the basic Item properties, and implementing CQL2 Text.

And then an additional PR with Advanced Comparison Operators, which follows the next recommended action:

From there, other comparison operators can be implemented...

And then, finally, look at:

...and a more dynamic Queryables schema


UPDATE on 1/16/24: As discussed with @philvarner offline, we will follow the sequence above (which is taken directly from the filter extension README) in fleshing out the implementation of the filter extension. This PR will be limited to "...only the Basic CQL2 conformance class of logical and comparison operators, defining a static Queryables schema with no queryables advertised and the additionalProperties field set to true, and only implementing CQL2 JSON"

@pjhartzell pjhartzell requested a review from philvarner January 19, 2025 14:52
@pjhartzell pjhartzell requested a review from philvarner January 23, 2025 18:29
@philvarner philvarner added this pull request to the merge queue Jan 24, 2025
Merged via the queue into stac-utils:main with commit c692a0a Jan 24, 2025
1 check passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants