🤖 Hey @iPablo! Great to see you diving into another challenge. Hope you're doing well! Based on the context provided, it seems like you're experiencing a performance issue when filtering documents marked as "QA" in your Chroma database. Here are a few strategies you might consider to improve the efficiency of your filter:
Regarding the similarity_search_with_score method, here is how it is implemented:

    def similarity_search_with_score(
        self,
        query: str,
        k: int = DEFAULT_K,
        filter: Optional[Dict[str, str]] = None,
        where_document: Optional[Dict[str, str]] = None,
        **kwargs: Any,
    ) -> List[Tuple[Document, float]]:
        """Run similarity search with Chroma with distance.

        Args:
            query (str): Query text to search for.
            k (int): Number of results to return. Defaults to 4.
            filter (Optional[Dict[str, str]]): Filter by metadata. Defaults to None.

        Returns:
            List[Tuple[Document, float]]: List of documents most similar to
            the query text and cosine distance in float for each.
            Lower score represents more similarity.
        """
        if self._embedding_function is None:
            # No embedding function configured: send the raw query text.
            results = self.__query_collection(
                query_texts=[query],
                n_results=k,
                where=filter,
                where_document=where_document,
            )
        else:
            # Embed the query first, then search by embedding.
            query_embedding = self._embedding_function.embed_query(query)
            results = self.__query_collection(
                query_embeddings=[query_embedding],
                n_results=k,
                where=filter,
                where_document=where_document,
            )

        return _results_to_docs_and_scores(results)

In this function, the filter argument is passed to the collection query as where, alongside where_document.

Please note that these are general approaches and their effectiveness can vary based on the specifics of your application and data. Always test any changes thoroughly to confirm they improve performance as expected.
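To make the filter handling in the method above easier to follow, here is a minimal sketch that mirrors only its branching, without Chroma or LangChain installed. RecordingCollection and build_query_kwargs are hypothetical stand-ins written for this illustration, the metadata key "type" is an assumption about how the QA documents are tagged, and the lambda is a toy embedding, not a real model.

```python
from typing import Any, Callable, Dict, List, Optional


class RecordingCollection:
    """Hypothetical stand-in for a collection; it only records query arguments."""

    def __init__(self) -> None:
        self.calls: List[Dict[str, Any]] = []

    def query(self, **kwargs: Any) -> Dict[str, Any]:
        self.calls.append(kwargs)
        # Empty result in a nested-list shape; nothing is actually searched.
        return {"documents": [[]], "metadatas": [[]], "distances": [[]]}


def build_query_kwargs(
    query: str,
    k: int,
    filter: Optional[Dict[str, str]] = None,
    where_document: Optional[Dict[str, str]] = None,
    embed_query: Optional[Callable[[str], List[float]]] = None,
) -> Dict[str, Any]:
    """Mirror the quoted branching: the metadata filter is forwarded as `where`."""
    kwargs: Dict[str, Any] = {
        "n_results": k,
        "where": filter,
        "where_document": where_document,
    }
    if embed_query is None:
        # No embedding function: pass the raw text.
        kwargs["query_texts"] = [query]
    else:
        # Embedding function available: pass the embedded query instead.
        kwargs["query_embeddings"] = [embed_query(query)]
    return kwargs


collection = RecordingCollection()
kwargs = build_query_kwargs(
    "How do I reset my password?",
    k=4,
    filter={"type": "QA"},  # assumed metadata key and value
    embed_query=lambda text: [float(len(text))],  # toy embedding for the sketch
)
collection.query(**kwargs)
print(collection.calls[0]["where"])  # prints {'type': 'QA'}
```

The point of the sketch is only that the filter dictionary reaches the underlying query unchanged under the name where, so any cost of filtering is incurred inside the collection query and not in this wrapper.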
Checked other resources
Commit to Help
Example Code
Description
I am building an API with FastAPI and LangChain to answer user questions. During performance testing I found a bottleneck in the access to Chroma. At one point in the application I use a filter to query only the documents marked as QA.
At that point the requests start to queue and the response time increases drastically. Is there any way to make this filtering more efficient? Querying all the documents without the filter is faster.
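For anyone trying to reproduce the comparison, a small timing harness along these lines can confirm how much slower the filtered query is than the unfiltered one. This is only a sketch: time_calls and summarize are hypothetical helpers, and the two search functions are placeholders that sleep instead of touching Chroma. In a real test they would be replaced with calls such as similarity_search_with_score with and without the filter argument.

```python
import statistics
import time
from typing import Callable, List


def time_calls(fn: Callable[[], object], repeats: int = 20) -> List[float]:
    """Call fn repeatedly and return each call's wall-clock duration in seconds."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        durations.append(time.perf_counter() - start)
    return durations


def summarize(label: str, durations: List[float]) -> str:
    """Format the median and worst-case latency in milliseconds."""
    median_ms = statistics.median(durations) * 1000
    worst_ms = max(durations) * 1000
    return (
        f"{label}: median {median_ms:.2f} ms, "
        f"max {worst_ms:.2f} ms over {len(durations)} calls"
    )


# Placeholders: swap in the real unfiltered and filtered vector store calls.
def unfiltered_search() -> None:
    time.sleep(0.001)


def filtered_search() -> None:
    time.sleep(0.002)


unfiltered = time_calls(unfiltered_search, repeats=5)
filtered = time_calls(filtered_search, repeats=5)
print(summarize("unfiltered", unfiltered))
print(summarize("filtered", filtered))
```

Measuring the vector store call in isolation like this, outside FastAPI, helps separate the cost of the filter itself from queuing effects in the web layer.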
System Info
System Information
Package Information
Packages not installed (Not Necessarily a Problem)
The following packages were not found: