Retrieval latency from Pinecone: Retrieval vs SelfQueryRetrieval #12282
GiacomoFonseca started this conversation in General
I'm testing the average time LangChain needs to retrieve the relevant chunks from a Pinecone index. I have chunked, embedded, and loaded around 300 PDF files. Each chunk has a few metadata fields, such as the document name, number, and year.
For the basic retrieval case:
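Something like this (the index name, `k`, and embedding setup are placeholders for my actual code):

```python
import time

import pinecone
from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.vectorstores import Pinecone

# Connect to the existing index that already holds the ~300 chunked PDFs.
pinecone.init(api_key="...", environment="...")
embeddings = OpenAIEmbeddings()
vectorstore = Pinecone.from_existing_index("my-index", embeddings)
retriever = vectorstore.as_retriever(search_kwargs={"k": 4})

start = time.perf_counter()
docs = retriever.get_relevant_documents("What does the annual report say about X?")
print(f"retrieval took {time.perf_counter() - start:.2f} s")
```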
the reply is quite fast and I always measure t < 0.5 seconds to get the chunks back.
When using the SelfQueryRetriever instead (with text-davinci-003 as the LLM here):
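Roughly this setup, with the metadata field names and descriptions simplified:

```python
from langchain.chains.query_constructor.base import AttributeInfo
from langchain.llms import OpenAI
from langchain.retrievers.self_query.base import SelfQueryRetriever

# Describe the metadata so the LLM can translate the user query into filters.
metadata_field_info = [
    AttributeInfo(name="document_name", description="Name of the source document", type="string"),
    AttributeInfo(name="number", description="Document number", type="integer"),
    AttributeInfo(name="year", description="Year the document was issued", type="integer"),
]

llm = OpenAI(model_name="text-davinci-003", temperature=0)
self_query_retriever = SelfQueryRetriever.from_llm(
    llm,
    vectorstore,  # same Pinecone vectorstore as above
    document_contents="Chunks of text extracted from PDF documents",
    metadata_field_info=metadata_field_info,
)

start = time.perf_counter()
docs = self_query_retriever.get_relevant_documents("What do the documents from 2021 say about X?")
print(f"self-query retrieval took {time.perf_counter() - start:.2f} s")
```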
and asking questions that trigger one or more metadata filters, I always get t > 3 seconds!
Do you find it normal that the time is almost an order of magnitude higher for the self-query retriever? Is it because the latter makes an actual OpenAI call to run the LLM, and that takes more time?
Imagine you want to build a chatbot on top of these documents and to filter them based on the user query, answering questions only from the filtered subset (expecting it to be faster that way, not slower...). What would you suggest to speed up the process? Other indexing methods, maybe? LlamaIndex, Elasticsearch...?
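For comparison, when the filter is already known, it can be passed straight to Pinecone without any LLM call (a sketch, reusing the `year` field assumed above):

```python
# Hypothetical pre-computed filter: restrict retrieval to chunks from 2021.
filtered_retriever = vectorstore.as_retriever(
    search_kwargs={"k": 4, "filter": {"year": {"$eq": 2021}}}
)
docs = filtered_retriever.get_relevant_documents("What changed in 2021?")
```

That avoids the query-construction step entirely, which is why I suspect the LLM call is the expensive part.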