# How to Enable LLM to Utilize Multiple Documents in a Single Query #18498
Hello @danieltan-ifx! I'm Dosu, your friendly bot here to help you solve bugs, answer questions, and become a contributor while you wait for a human maintainer. Let's tackle your issue together!

To enable the LLM to utilize multiple documents in a single query with a `VectorStoreIndex`, you can build a filtered retriever per document and combine them through a `RecursiveRetriever`:

```python
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader
from llama_index.core.retrievers import RecursiveRetriever
from llama_index.core.query_engine import RetrieverQueryEngine
from llama_index.core.vector_stores import MetadataFilters, MetadataFilter, FilterOperator
from llama_index.llms.openai import OpenAI

documents = SimpleDirectoryReader("data").load_data()
index = VectorStoreIndex.from_documents(documents)

# Set up a per-document retriever. This assumes each document carries an
# "index_id" entry in its metadata that identifies which document it is.
retriever_dict = {}
for doc in documents:
    index_id = doc.metadata["index_id"]
    filters = MetadataFilters(
        filters=[
            MetadataFilter(
                key="index_id", operator=FilterOperator.EQ, value=index_id
            ),
        ]
    )
    retriever = index.as_retriever(filters=filters)
    retriever_dict[index_id] = retriever

# Set up the recursive retriever: "vector" is the root retriever, and the
# per-document retrievers are available for recursive lookups.
recursive_retriever = RecursiveRetriever(
    "vector",
    retriever_dict={"vector": index.as_retriever(similarity_top_k=5), **retriever_dict},
    verbose=True,
)

# Plug into a query engine
query_engine = RetrieverQueryEngine.from_args(
    recursive_retriever, llm=OpenAI(model="gpt-3.5-turbo")
)
```
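As a minimal usage sketch of the engine built above (the query string is just a placeholder):

```python
# Run a single query through the recursive retriever pipeline; nodes drawn
# from several documents can contribute to one answer.
response = query_engine.query("Summarize the key points across all documents.")
print(response)
```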
To get multiple response texts, you can use lower-level APIs to perform retrieval once and then generate multiple responses from the same nodes:

```python
from llama_index.core import get_response_synthesizer

retriever = index.as_retriever(similarity_top_k=5, verbose=True)
synth = get_response_synthesizer(response_mode="compact")

# Retrieve once, then synthesize as many responses as needed from the same nodes
nodes = retriever.retrieve("What did the author do growing up?")
response_1 = synth.synthesize("What did the author do growing up?", nodes)
response_2 = synth.synthesize("What did the author do growing up?", nodes)

print(response_1)
print(response_2)
```

This configuration allows the query engine to retrieve more context by considering multiple documents in the query process, utilizing both the top-level vector retriever and the per-document retrievers.
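To confirm that the retrieved nodes actually span more than one source document, here is a quick sketch of a sanity check (assuming the `nodes` list from the snippet above):

```python
# Each retrieved item is a NodeWithScore; ref_doc_id points back to the
# source document the node was parsed from.
doc_ids = {n.node.ref_doc_id for n in nodes}
print(f"Nodes drawn from {len(doc_ids)} distinct documents: {doc_ids}")
```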
The chat engine only uses a single document from the retrieved index to answer the query.

```python
index_2 = VectorStoreIndex(rerank_nodes, service_context=rag.service_context, show_progress=False)
custom_llm = rag._init_llm_model(system_prompt, query_wrapper_prompt, endpoint=config['SERVICE_HOST_LLM'], model_name='llama3.1-70b')
query_engine = index_2.as_chat_engine(llm=custom_llm, chat_mode="context", system_prompt=system_prompt)
response = query_engine.chat(modified_question)
```

If I ask a query that requires information from different documents, only one part gets answered; the chatbot replies that the other information is not available. After checking the contents of `rerank_nodes`, both pieces of information are present. How can I make the LLM use all of the documents provided?
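(A hedged sketch of one possible adjustment, reusing the `index_2`, `custom_llm`, `system_prompt`, and `modified_question` from above: build the context chat engine with an explicit retriever and a larger `similarity_top_k`, so that nodes from several documents reach the LLM's context window. `ContextChatEngine` is the class behind `chat_mode="context"`; whether this resolves the issue depends on the setup.)

```python
from llama_index.core.chat_engine import ContextChatEngine

# Widen retrieval so nodes from more than one document can land in the context.
retriever = index_2.as_retriever(similarity_top_k=10)

chat_engine = ContextChatEngine.from_defaults(
    retriever=retriever,
    llm=custom_llm,
    system_prompt=system_prompt,
)
response = chat_engine.chat(modified_question)
```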