Skip to content

fix: ensure exact:true matches entire property value, not tokens #942

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: main
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
38 changes: 35 additions & 3 deletions packages/orama/src/methods/search-fulltext.ts
Original file line number Diff line number Diff line change
@@ -1,7 +1,7 @@
import { getFacets } from '../components/facets.js'
import { getGroups } from '../components/groups.js'
import { runAfterSearch, runBeforeSearch } from '../components/hooks.js'
import { getInternalDocumentId } from '../components/internal-document-id-store.js'
import { getInternalDocumentId, InternalDocumentID } from '../components/internal-document-id-store.js'
import { Language } from '../components/tokenizer/languages.js'
import { createError } from '../errors.js'
import type {
Expand All @@ -14,7 +14,7 @@ import type {
TokenScore,
TypedDocument
} from '../types.js'
import { getNanosecondsTime, removeVectorsFromHits, sortTokenScorePredicate } from '../utils.js'
import { getNanosecondsTime, removeVectorsFromHits, sortTokenScorePredicate, getNested } from '../utils.js'
import { count } from './docs.js'
import { fetchDocuments, fetchDocumentsWithDistinct } from './search.js'

Expand Down Expand Up @@ -66,7 +66,39 @@ export function innerFullTextSearch<T extends AnyOrama>(
// in this case, we need to return all the documents that contains at least one of the given properties
const threshold = params.threshold !== undefined && params.threshold !== null ? params.threshold : 1

if (term || properties) {
/**
* Property-value exactness:
* If `params.exact` is true and a search term is provided, iterate all documents and check if any of the specified properties
* (or all string properties if none specified) match the search term exactly (case-insensitive).
* This is different from token-based exactness, which is handled by `exactToken`.
*
* Example:
* - If a document property value is "First Note.md" and the search term is "first" with `exact: true`,
* it will NOT match (property-value is not exactly "first").
* - If a document property value is "first" and the search term is "first" with `exact: true`,
* it WILL match.
*
* For token-based exactness (matching individual words/tokens), use `exactToken` instead.
*/
if (params.exact && term) {
const docs = orama.documentsStore.getAll(orama.data.docs) as Record<InternalDocumentID, TypedDocument<T>>
const normalizeTerm= term.toLowerCase()

uniqueDocsIDs = Object.entries(docs)
.filter(([, doc]) => {
return propertiesToSearch.some((prop) => {
const value = getNested(doc, prop)
if (typeof value === 'string') {
return value.toLowerCase() === normalizeTerm
}
if (Array.isArray(value)) {
return value.some((v) => typeof v === 'string' && v.toLowerCase() === normalizeTerm)
}
return false
})
})
.map(([id,]) => [+id, 0] as TokenScore)
} else if (term || properties) {
const docsCount = count(orama)
uniqueDocsIDs = orama.index.search(
index,
Expand Down
19 changes: 18 additions & 1 deletion packages/orama/src/types.ts
Original file line number Diff line number Diff line change
Expand Up @@ -314,10 +314,27 @@ export interface SearchParamsFullText<T extends AnyOrama, ResultDocument = Typed
sortBy?: SortByParams<T, ResultDocument>

/**
* Whether to match the term exactly.
* Whether to match the term exactly against the entire property value (case-insensitive).
*
* If true, only documents where the specified property (or all string properties if none specified)
* matches the search term exactly (as a whole string, not tokenized) will be returned.
*
* Example:
* - If a property value is "First Note.md" and the search term is "first", this will NOT match.
* - If a property value is "first" and the search term is "first", this WILL match.
*
* For token-level exactness (matching individual words/tokens), use `exactToken` instead.
*/
exact?: boolean

/**
* Whether each token should be matched exactly (token-level exactness).
*
* If true, only documents where individual tokens (words) in the property match the search term exactly will be returned.
* This is different from `exact`, which matches the entire property value.
*/
exactToken?: boolean
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this props used?


/**
* The maximum [levenshtein distance](https://en.wikipedia.org/wiki/Levenshtein_distance)
* between the term and the searchable property.
Expand Down