How to Enable LLM to Utilize Multiple Documents in a Single Query #18498
Hello @danieltan-ifx! I'm Dosu, your friendly bot here to help you solve bugs, answer questions, and become a contributor while you wait for a human maintainer. Let's tackle your issue together!

To enable the LLM to utilize multiple documents in a single query, you can build one retriever per document (each restricted by a metadata filter) and combine them with a top-level vector retriever under a `RecursiveRetriever`:

```python
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader
from llama_index.core.query_engine import RetrieverQueryEngine
from llama_index.core.retrievers import RecursiveRetriever
from llama_index.core.vector_stores import (
    FilterOperator,
    MetadataFilter,
    MetadataFilters,
)
from llama_index.llms.openai import OpenAI

documents = SimpleDirectoryReader("data").load_data()
index = VectorStoreIndex.from_documents(documents)

# Set up a per-document retriever: each one filters the index down
# to the nodes belonging to a single document via its "index_id".
retriever_dict = {}
for doc in documents:
    index_id = doc.metadata["index_id"]
    filters = MetadataFilters(
        filters=[
            MetadataFilter(
                key="index_id", operator=FilterOperator.EQ, value=index_id
            ),
        ]
    )
    retriever = index.as_retriever(filters=filters)
    retriever_dict[index_id] = retriever

# Set up the recursive retriever: the root "vector" retriever runs
# first, and any retrieved IndexNode whose index_id matches a key in
# retriever_dict is expanded by that document's retriever.
recursive_retriever = RecursiveRetriever(
    "vector",
    retriever_dict={
        "vector": index.as_retriever(similarity_top_k=5),
        **retriever_dict,
    },
    verbose=True,
)

# Plug into a query engine
query_engine = RetrieverQueryEngine.from_args(
    recursive_retriever, llm=OpenAI(model="gpt-3.5-turbo")
)
```
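You can then query across documents with the engine above. A minimal usage sketch (the question text is just a placeholder):

```python
# Hypothetical query that spans several documents.
response = query_engine.query("Compare how each document handles authentication.")
print(response)

# The retrieved context can mix nodes from multiple documents.
for node in response.source_nodes:
    print(node.node.metadata.get("index_id"), node.score)
```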
To get multiple response texts from a single retrieval pass, you can use the lower-level APIs to retrieve once and synthesize several responses:

```python
from llama_index.core import get_response_synthesizer

retriever = index.as_retriever(similarity_top_k=5, verbose=True)
synth = get_response_synthesizer(response_mode="compact")

# Retrieve once, then synthesize as many responses as needed
# over the same set of nodes.
nodes = retriever.retrieve("What did the author do growing up?")
response_1 = synth.synthesize("What did the author do growing up?", nodes)
response_2 = synth.synthesize("What did the author do growing up?", nodes)

print(response_1)
print(response_2)
```

This configuration allows the query engine to retrieve more context by considering multiple documents in the query process, utilizing both the top-level vector retriever and the per-document retrievers.
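Note that the example above assumes each document already carries an `index_id` entry in its metadata. If yours do not, here is a minimal sketch of one way to attach it at load time (the filename-based scheme is just an assumption):

```python
from llama_index.core import SimpleDirectoryReader

# Hypothetical scheme: tag each file's nodes with its path as index_id.
documents = SimpleDirectoryReader(
    "data",
    file_metadata=lambda path: {"index_id": path},
).load_data()
```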
---
The chat engine only ever uses a single document from the retrieved index to answer the query.

```python
index_2 = VectorStoreIndex(
    rerank_nodes, service_context=rag.service_context, show_progress=False
)
custom_llm = rag._init_llm_model(
    system_prompt,
    query_wrapper_prompt,
    endpoint=config['SERVICE_HOST_LLM'],
    model_name='llama3.1-70b',
)
query_engine = index_2.as_chat_engine(
    llm=custom_llm, chat_mode="context", system_prompt=system_prompt
)
response = query_engine.chat(modified_question)
```

Say I ask a question that requires information from two different documents: only one part gets answered, and the chatbot responds that the other information is not available. After checking the contents of `rerank_nodes`, both pieces of information are present. How can I make the LLM use all of the documents provided?
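For reference, a small diagnostic sketch (assuming the objects above; the `similarity_top_k` value is an assumption): `response.source_nodes` shows which nodes actually reached the LLM's context, and raising `similarity_top_k` lets nodes from more documents through.

```python
# Check which nodes were actually placed into the chat context.
response = query_engine.chat(modified_question)
for node in response.source_nodes:
    print(node.node.metadata.get("file_name"), node.score)

# Allow more nodes (and hence more documents) into the context window.
query_engine = index_2.as_chat_engine(
    llm=custom_llm,
    chat_mode="context",
    system_prompt=system_prompt,
    similarity_top_k=8,  # assumption: the default top-k may be too small
)
```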