Case-Insensitive Similarity Search in ChromaDB for RAG Pipeline #31015
Adarsh-AMT
announced in
Ideas
Feature request
Hi everyone,
We're currently working on a RAG (Retrieval-Augmented Generation) pipeline using ChromaDB and GPT-4o, and we’ve run into a case sensitivity issue during similarity search.
We're using the following code to perform the search:
vector_similarity_search = chroma_initializer.max_marginal_relevance_search(
    query=question,
    k=k,
    fetch_k=2 * k,  # Ensuring fetch_k isn't too large
    lambda_mult=0.6,  # Adjusting lambda for better relevance balance
    where_document=where_document_filter,
)
The issue is that a query containing a lowercase word such as "ahmad" does not match the capitalized form "Ahmad" in the documents. We want the search to be case-insensitive so that matches aren't missed because of casing.
Is there any built-in parameter in ChromaDB that supports case-insensitive search without affecting performance?
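For context, one possible workaround is to normalize case on both sides. This is a minimal sketch, and it assumes the documents were indexed in case-folded form (with the original text kept elsewhere, e.g. in metadata) and that `where_document_filter` only uses the `$contains`, `$not_contains`, `$and`, and `$or` operators. The helper names `normalize_text` and `normalize_where_document` are hypothetical and not part of ChromaDB or LangChain.

```python
from typing import Any, Optional


def normalize_text(text: str) -> str:
    """Case-fold text so that 'Ahmad' and 'ahmad' compare equal."""
    return text.casefold()


def normalize_where_document(
    where_document: Optional[dict[str, Any]],
) -> Optional[dict[str, Any]]:
    """Return a copy of the filter with its string operands case-folded.

    Assumption: only $contains / $not_contains / $and / $or are handled;
    any other key is passed through unchanged.
    """
    if where_document is None:
        return None
    normalized: dict[str, Any] = {}
    for key, value in where_document.items():
        if key in ("$contains", "$not_contains") and isinstance(value, str):
            normalized[key] = normalize_text(value)
        elif key in ("$and", "$or") and isinstance(value, list):
            normalized[key] = [normalize_where_document(v) for v in value]
        else:
            normalized[key] = value
    return normalized


example_filter = {"$or": [{"$contains": "Ahmad"}, {"$contains": "RAG"}]}
print(normalize_where_document(example_filter))
# → {'$or': [{'$contains': 'ahmad'}, {'$contains': 'rag'}]}
```

With this in place, the call above would pass `query=normalize_text(question)` and `where_document=normalize_where_document(where_document_filter)`. It only helps if the stored document text was case-folded the same way at indexing time, which costs a re-index but adds no work per query beyond the string normalization.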