How to get top-k retrievals for ParentDocumentRetriever using Chroma? #17582
-
🤖 Hey @nithinreddyyyyyy! Great to see you diving into another intriguing aspect of LangChain. How's everything going on your end?
Based on the code you've provided, it seems like you're using the ParentDocumentRetriever and getting only one document back. To get the top k results, you can modify your code to query the Chroma vector store directly:
from langchain.vectorstores import Chroma
from langchain.embeddings.openai import OpenAIEmbeddings
# Initialize the embeddings and vectorstore
embeddings = OpenAIEmbeddings()
vectorstore = Chroma("full_documents", embeddings)
# Run a similarity search with a query
query = "data related to cricket"
k = 5  # Number of documents to return
documents = vectorstore.similarity_search(query, k=k)
# Or run a similarity search with an embedding vector
# (it must have the same dimensionality as the vectors stored in the collection)
embedding = embeddings.embed_query(query)
documents = vectorstore.similarity_search_by_vector(embedding, k=k)
In this example, similarity_search and similarity_search_by_vector each return the k documents most similar to the query. Please note that this approach returns the top k documents based on similarity to the query or embedding vector, not based on the parent-child document structure used by the ParentDocumentRetriever.
I hope this helps! If you have any further questions, feel free to ask.
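To make concrete what k controls in a similarity search, here is a minimal pure-Python sketch. It is a toy stand-in, not Chroma's implementation: the chunk IDs, the 2-dimensional vectors, the helper names `squared_l2` and `top_k`, and the choice of squared Euclidean distance are all assumptions for illustration.

```python
import heapq


def squared_l2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def top_k(query_vec, chunks, k):
    """Return the IDs of the k chunks closest to query_vec, nearest first.

    `chunks` maps a chunk ID to its embedding vector (toy data, not Chroma).
    """
    return heapq.nsmallest(
        k, chunks, key=lambda cid: squared_l2(query_vec, chunks[cid])
    )


# Hypothetical 2-dimensional "embeddings" for four chunks.
chunks = {
    "c1": [0.9, 0.1],
    "c2": [0.8, 0.2],
    "c3": [0.1, 0.9],
    "c4": [0.5, 0.5],
}

nearest = top_k([1.0, 0.0], chunks, k=2)
print(nearest)
```

Asking for more results than there are chunks simply returns every chunk, ordered by distance.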
-
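I agree, it would be nice if the ParentDocumentRetriever supported this directly. In the meantime, you can do it in two steps:
# First search the vector store for the top-k child chunks:
query = "data related to cricket"
k = 5
sub_docs = vectorstore.similarity_search(query, k=k)
# Then retrieve the parent documents. Several chunks can share one parent,
# so de-duplicate the IDs while keeping rank order:
keys = list(dict.fromkeys(doc.metadata["doc_id"] for doc in sub_docs))
documents = [doc for doc in store.mget(keys) if doc is not None]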
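One thing to watch with the two-step lookup: several of the top-k chunks can point at the same parent, so k chunks may yield fewer than k parent documents. Here is a small self-contained sketch of that effect. Plain dicts stand in for LangChain's Document objects and for the docstore, and `parents_for_chunks` is a hypothetical helper, not a LangChain function.

```python
def parents_for_chunks(ranked_chunks, docstore, id_key="doc_id"):
    """Map ranked child chunks to their parent documents.

    ranked_chunks: list of dicts with a "metadata" dict (stand-in for Document).
    docstore: plain dict from parent ID to parent text (stand-in for the store).
    """
    # dict.fromkeys de-duplicates while preserving first-seen (rank) order.
    parent_ids = list(dict.fromkeys(c["metadata"][id_key] for c in ranked_chunks))
    # Skip IDs that are missing from the store instead of returning None.
    return [docstore[pid] for pid in parent_ids if pid in docstore]


# Toy data: five ranked chunks that belong to only two parents.
docstore = {"p1": "Full cricket article", "p2": "Full football article"}
ranked_chunks = [
    {"text": "cricket chunk A", "metadata": {"doc_id": "p1"}},
    {"text": "cricket chunk B", "metadata": {"doc_id": "p1"}},
    {"text": "football chunk A", "metadata": {"doc_id": "p2"}},
    {"text": "cricket chunk C", "metadata": {"doc_id": "p1"}},
    {"text": "football chunk B", "metadata": {"doc_id": "p2"}},
]

parents = parents_for_chunks(ranked_chunks, docstore)
print(len(ranked_chunks), "chunks ->", len(parents), "parents")
```

So if only one parent comes back, it may simply be that all of the matching chunks were split from the same document.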
-
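retriever = ParentDocumentRetriever(
    vectorstore=vectorstore,
    docstore=store,
    child_splitter=child_splitter,
    parent_splitter=parent_splitter,
    # Search the top 5 child chunks; their parent documents are returned,
    # so you may get fewer than 5 if several chunks share a parent
    search_kwargs={"k": 5},
)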
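As far as I can tell, the k in search_kwargs bounds the number of child chunks searched, not the number of parents returned. If you need k distinct parents, one option is to over-fetch chunks until enough parents show up. Below is a rough sketch of that idea in plain Python. The `search` callable, `top_k_parents`, and `max_fetch` are hypothetical names for illustration, not part of LangChain; in practice `search` would wrap something like a vector store query.

```python
def top_k_parents(search, k, max_fetch=100, id_key="doc_id"):
    """Grow the number of child chunks fetched until k distinct parents appear.

    search: callable taking an int n and returning the n best child chunks,
            best first (hypothetical stand-in for a vector-store query).
    Returns up to k parent IDs in rank order.
    """
    fetch = k
    while True:
        chunks = search(fetch)
        # De-duplicate parent IDs while keeping rank order.
        parent_ids = list(dict.fromkeys(c[id_key] for c in chunks))
        # Stop when there are enough parents, the index is exhausted,
        # or the fetch cap has been reached.
        if len(parent_ids) >= k or len(chunks) < fetch or fetch >= max_fetch:
            return parent_ids[:k]
        fetch = min(fetch * 2, max_fetch)


# Toy ranked index: six chunks spread over three parents.
ranked = [
    {"doc_id": "p1"},
    {"doc_id": "p1"},
    {"doc_id": "p2"},
    {"doc_id": "p1"},
    {"doc_id": "p3"},
    {"doc_id": "p2"},
]


def search(n):
    """Return the n best chunks from the toy index."""
    return ranked[:n]


result = top_k_parents(search, k=3)
print(result)
```

The doubling step keeps the number of queries small, and the cap avoids scanning the whole collection when there are simply fewer than k parents.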
-
Example Code
Description
In the code above, I invoke a ParentDocumentRetriever, but it returns only 1 document. How can I get the top k documents?
System Info
Google Colab