Description
I am working on a project that uses full-text search as part of a larger query language, and recently, I started to consider migrating from Lucene to Tantivy for the text search component.
Since full-text search is only part of our query language, we often encounter the scenario where users write a generic keyword query (say *a*), but we already know from other (non-text) operators in the query that eventually, only a small subset of documents is relevant.
Therefore, my question is: Can I run queries on only a subset of documents, selected by their document IDs, in Tantivy?
Based on my initial documentation search, I was wondering if the FilterCollector would be relevant here. However, I could not find much information on what I can pass as a TPredicate to the filter.
As a follow-up question: Would using a FilterCollector in combination with a subsequent TopDocs collector for the documents that are not filtered out have a performance benefit?
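To make the follow-up question concrete, here is a minimal sketch of the behaviour I have in mind, written in plain Rust with only the standard library. It does not use Tantivy's API: `Hit`, `filtered_top_k`, and the integer scores are hypothetical names and simplifications made up for illustration. The predicate plays the role I assume a TPredicate would play, and the bounded heap stands in for a top-k collector.

```rust
use std::cmp::Reverse;
use std::collections::BinaryHeap;

/// Hypothetical stand-in for a scored search hit (not a Tantivy type).
/// Scores are integers here so that `Ord` can be derived.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
struct Hit {
    score: u32,
    doc_id: u64,
}

/// Keeps the `k` best hits among those whose document ID passes `predicate`.
/// The predicate is checked per matched hit, before the hit reaches the heap.
fn filtered_top_k<P>(hits: impl IntoIterator<Item = Hit>, k: usize, predicate: P) -> Vec<Hit>
where
    P: Fn(u64) -> bool,
{
    // Min-heap over hits: the weakest retained hit sits at the top.
    let mut heap: BinaryHeap<Reverse<Hit>> = BinaryHeap::with_capacity(k + 1);
    for hit in hits {
        if !predicate(hit.doc_id) {
            continue; // filtered out, never scored against the heap
        }
        heap.push(Reverse(hit));
        if heap.len() > k {
            heap.pop(); // drop the weakest hit to stay at k entries
        }
    }
    // Return the survivors ordered from best to worst.
    let mut out: Vec<Hit> = heap.into_iter().map(|Reverse(h)| h).collect();
    out.sort_by(|a, b| b.cmp(a));
    out
}
```

In this sketch, the filter only runs on hits the text query has already matched, so it saves work in the top-k step but not in the matching itself. If FilterCollector behaves the same way (which is my assumption, not something I found documented), I would expect the benefit for a broad query like *a* to be limited, which is really what I am asking about.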