Skip to content

removing advanced search #13

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed

Conversation

rhysrevans3
Copy link
Contributor

Advanced and simple free text search are not compatible. I think advanced search is closer to Filter Extension so should be moved to either the filter extension or it's own extension of the filter extension.

#10

@vincentsarago
Copy link

can we merge this ? @m-mohr

@m-mohr
Copy link
Collaborator

m-mohr commented Jul 7, 2025

Has this been moved somewhere already? Removing without replacement or deprecation seems not like the right approach.

@vincentsarago
Copy link

Well my understanding is that advanced is not well described anyway (or at least I can't find where it is)

I'll let @rhysrevans3 jump in for this one

@rhysrevans3
Copy link
Contributor Author

I haven't moved this yet as I'm still unsure of the best place for it. The two options I came up with were:

  1. A separate "advanced search" extension. This could use a different field from q so both could be implemented in the same API.
  2. Adding it to the filter extension as a extra query language. Because it adds more complex querying like field comparison I thought it might be closer to the filter extension? But I wasn't sure if this would be added to the current filter extension or a separate extension that extends the filter extension (not sure if that's possible).

I agree with @vincentsarago advanced search could be more well described (I'd like to follow the Lucene query syntax but that might be my elasticsearch bias showing). I'm happy to do the move if there's a consensus on where it should be moved to.

@m-mohr
Copy link
Collaborator

m-mohr commented Jul 9, 2025

Filter is not a good idea as that's meant to be aligned with OGC APIs.

Just to be clear this is pretty much two extensions in one already. There are different conformance classes so you could literally just copy the repository and delete everything for basic in the new repository and remove everything for advaned from this repository. That would be the simplest way for now to just separate them. What to do with advanced afterwards, I'm not sure.

@rhysrevans3
Copy link
Contributor Author

I don't think adding an extra filter language would stop the filter extension from being aligned with OGC APIs. They actually explicitly allow extra filtering languages.

Servers that support other filtering languages can extend this list of values as necessary although the meanings of any additional values are not described in this Standard.

from OGC API - Features - Part 3: Filtering - 8.3 Parameter filter-lang

@m-mohr
Copy link
Collaborator

m-mohr commented Jul 9, 2025

Yeah, but then instead of a slightly convoluted free-text search extension we'd have a even more convoluted filter extension. I don't see why this would be desirable. Given the status of the advanced free text, I'd noestly just create a separate extension for it, if it is deemed useful to split them from the basic free text.

@rhysrevans3
Copy link
Contributor Author

rhysrevans3 commented Jul 10, 2025

Sorry I got the wrong end of the stick for your last message. You're right that was the thought behind "an extension to the filter extension" (a separate extension that extends the filter extension) so it would leave the core filter extension alone.

That was also part of the reason for suggesting this move to allow free text search to also be fully aligned with the OGC API's free text search. But as you said these are practically separate extensions already so probably not as big a concern as I thought.

The other reason was to allow APIs the option to do both basic and advanced (as long as a different parameter was used). So they could stay in line with the OGC and have the extra functionality of advanced. But maybe that's more confusing than it's worth.

Reading your's and @fmigneault (stac-utils/stac-fastapi#849) opinions on the move I'm happy to leave advanced search in this extension and close this pull if that's the consensus.

Possibly a separate issue but would people be happy to have advanced free text search follow Lucene query syntax just to make it more well defined? A note could be added to the readme.

@m-mohr
Copy link
Collaborator

m-mohr commented Jul 10, 2025

I think we should separate the various different discussions that are going on. For me the roadmap would be:

  1. Split Basic Free Text Search and Advanced Free Text search into two repositories (no functional change as techhically they are already separate due to the different conformance classes). This way we can more easily evolve and version them. (I think this makes sense to do, but this discussion was very convoluted with many issues so we probably misunderstood each other.)
  2. Release Basic 1.0.0, no changes are to be expected as Records is released now.
  3. Discuss how to proceed with Advanced Free Text Search (ideally to be discussed not here, but in issues in the new repo).
    • Something to be discussed would be whether two filter languages can be used at the same time. Currently, you can use the q parameter in addition to the Filter extension, which I think is useful. Using two free-text flavours at the same time seems not so useful for me.
    • Do we keep the advanced extension as is because there seem to be implementations for it?
    • Do we change it? If yes, how? And would that be a new version or a new extension? For example, if the evolution would be Lucene query syntax, I think it would feel more natural to create a "Lucene Free Text Search" extension (although not a big fan of binding to a specific implementation syntax) and potentially deprecate Advanced Free Text.

@fmigneault
Copy link

I think having separate extensions makes sense if the Advanced Free Text Search can be complementary/extends the Basic one.

Right now, they are conflicting by using the same q parameter and distinct list[str]/str variants in HTTP POST. Ideally, one could have both extensions enabled simultaneously and the server would handle either representation transparently. Since Basic uses , to separate terms, while Advanced uses explicit AND/OR and provides " to disambiguate (eg: "explicit,comma" OR something-else), I think the STAC API could support both seamlessly.

The Lucene syntax IMO is closer to a filter-lang variant because it can target specific fields, contrary to q Free Text Search that looks everywhere at the same time.

@m-mohr
Copy link
Collaborator

m-mohr commented Jul 10, 2025

I think having separate extensions makes sense if the Advanced Free Text Search can be complementary/extends the Basic one.

Shouldn't it be the opposite? If advanced extends the basic one, they should be together. If they cinflict, they should be separate. 🤔

@fmigneault
Copy link

Feels more like a extension structure/administration concern. Extensibility and conflicts are both hard to track if too disjoint, so proper references must be given either way (similar to POST /collections for Transactions vs Collection Search for example).

Another way to refer them could be that the conformance class of Advanced inherits from Basic, so it is a requirement that both are referenced and handled if Advanced is implemented.

Copy link

@jsignell jsignell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with @m-mohr that just deleting content from the readme isn't very user-friendly. There should be some kind of deprecation cycle and a clear path of what people should be migrating to.

@fmigneault
Copy link

I am looking forward to implementing and support it once the issue I encountered was fixed (stac-utils/stac-fastapi#849). Preferably, it is not unnecessarily deprecated unless this deprecation comes with a direct replacement. I see little/no advantage in separating/moving them though, since each basic/advanced variant can already be identified by their distinct conformance URIs even if they are maintained under a common repository. Leaving them together avoids repeating concepts that are shared between the two.

@m-mohr
Copy link
Collaborator

m-mohr commented Jul 14, 2025

Versioning would be simpler if they are split, including a potential deprecation of Advanced.

@fmigneault
Copy link

fmigneault commented Jul 14, 2025

If split and Advanced moves to a new separate version, then I suggest we take that versioning opportunity to integrate my proposal during the community meeting, that is, that Advanced is adjusted to work on top of Basic rather than conflicting with it.

The only adjustments would be:

  • allow , as alias of OR and (space)
  • allow HTTP POST "q":["A","word"] as equivalent to "q": "A word" to match the OR aliases of either Basic/Advanced
  • make it clear that explicit "" are needed to disambiguate cases were ,, - or (space) are used as part of the word
    (note that Basic search technically does not handle explicit search with commas, eg.: X-Dataset,"Surname, Name")

This way would allow APIs to support both variants simultaneously, and users would get natural responses with whichever typical "anyOf" text representation they submit.

@fmigneault
Copy link

@m-mohr

  • Do we keep the advanced extension as is because there seem to be implementations for it?

I did some extra digging into the code, and I found that it is implemented in stac-utils/pgstac#295.
Using the appropriate extensions in stac-fastapi+stac-fastapi-pgstac, I am able to perform the corresponding queries for the advanced extension.

However, q it is not "completely" supported for all variants of free-text. Some endpoints are still missing the parameter. This PR (stac-utils/stac-fastapi-pgstac#267) adds them and I can validate they are working on my end when everything is configured properly.

@m-mohr
Copy link
Collaborator

m-mohr commented Jul 17, 2025

I'm closing here, this is something for an issue, not a PR.
I think we've concluded already that just removing it is not the right approach, it needs at least a deprecation period.
Alternative could be a split or other changes, but that should be discussed separately in an issue.
=> #15

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants