Retrieval from Azure Index (RAG Implementation) #25281
To retrieve the top k chunks to use as context knowledge in your RAG implementation with an Azure AI Search index, you can load the document, split it into semantic chunks, embed and index the chunks, and then run a similarity search against the index. Here is the relevant code:

```python
import os

from langchain_community.document_loaders import AzureAIDocumentIntelligenceLoader
from langchain_community.vectorstores.azuresearch import AzureSearch
from langchain_openai import AzureOpenAIEmbeddings
from langchain_text_splitters import MarkdownHeaderTextSplitter

# Azure AI Document Intelligence credentials
doc_intelligence_endpoint = "<your Document Intelligence endpoint>"
doc_intelligence_key = "<your Document Intelligence key>"

# Load a document and split it into semantic chunks
loader = AzureAIDocumentIntelligenceLoader(
    file_path="<path to your file>",
    api_key=doc_intelligence_key,
    api_endpoint=doc_intelligence_endpoint,
    api_model="prebuilt-layout",
)
docs = loader.load()

# Split the document into chunks based on markdown headers
headers_to_split_on = [
    ("#", "Header 1"),
    ("##", "Header 2"),
    ("###", "Header 3"),
]
text_splitter = MarkdownHeaderTextSplitter(headers_to_split_on=headers_to_split_on)
docs_string = docs[0].page_content
splits = text_splitter.split_text(docs_string)
print("Length of splits: " + str(len(splits)))

# Embed the split documents and insert them into the Azure AI Search vector store.
# Assumes AZURE_OPENAI_ENDPOINT and AZURE_OPENAI_API_KEY are set in the environment.
aoai_embeddings = AzureOpenAIEmbeddings(
    azure_deployment="<Azure OpenAI embeddings model>",
    openai_api_version="<Azure OpenAI API version>",  # e.g., "2023-07-01-preview"
)
vector_store_address: str = os.getenv("AZURE_SEARCH_ENDPOINT")
vector_store_password: str = os.getenv("AZURE_SEARCH_ADMIN_KEY")
index_name: str = "<your index name>"
vector_store: AzureSearch = AzureSearch(
    azure_search_endpoint=vector_store_address,
    azure_search_key=vector_store_password,
    index_name=index_name,
    embedding_function=aoai_embeddings.embed_query,
)
vector_store.add_documents(documents=splits)

# Retrieve relevant chunks based on the question
retriever = vector_store.as_retriever(search_type="similarity", search_kwargs={"k": 3})
retrieved_docs = retriever.invoke("<your question>")
print(retrieved_docs[0].page_content)
```

This code demonstrates how to load a document, split it into chunks, embed and index those chunks, and finally retrieve the top k relevant chunks based on a user query [1].
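To then pass the retrieved chunks to a model as context, a minimal sketch could look like the following. This is not from the original answer: the chat deployment name and the prompt wording are illustrative placeholders, and it assumes an Azure OpenAI chat deployment plus the `retriever` defined above.

```python
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnablePassthrough
from langchain_openai import AzureChatOpenAI

# Hypothetical chat deployment; replace with your own Azure OpenAI deployment name
llm = AzureChatOpenAI(
    azure_deployment="<your chat model deployment>",
    openai_api_version="<Azure OpenAI API version>",
)

rag_prompt = ChatPromptTemplate.from_messages(
    [
        ("system", "Answer the question using only the provided context.\n\nContext:\n{context}"),
        ("human", "{question}"),
    ]
)

def format_docs(docs):
    # Concatenate the retrieved chunks into a single context string
    return "\n\n".join(doc.page_content for doc in docs)

rag_chain = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | rag_prompt
    | llm
    | StrOutputParser()
)

answer = rag_chain.invoke("<your question>")
print(answer)
```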
Additionally, you can enhance your retrieval process by incorporating query analysis. This involves defining a query schema and using a function-calling model to convert a user question into a structured query, which can improve the relevance of the retrieved chunks. For example:

```python
from typing import List, Optional

from langchain_core.documents import Document
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.pydantic_v1 import BaseModel, Field
from langchain_core.runnables import RunnablePassthrough
from langchain_openai import ChatOpenAI


# Schema that the function-calling model fills in from the user's question
class Search(BaseModel):
    query: str = Field(..., description="Similarity search query applied to video transcripts.")
    publish_year: Optional[int] = Field(None, description="Year video was published")


system = """You are an expert at converting user questions into database queries. \
You have access to a database of tutorial videos about a software library for building LLM-powered applications. \
Given a question, return a list of database queries optimized to retrieve the most relevant results.

If there are acronyms or words you are not familiar with, do not try to rephrase them."""
prompt = ChatPromptTemplate.from_messages(
    [
        ("system", system),
        ("human", "{question}"),
    ]
)
llm = ChatOpenAI(model="gpt-3.5-turbo-0125", temperature=0)
structured_llm = llm.with_structured_output(Search)
query_analyzer = {"question": RunnablePassthrough()} | prompt | structured_llm


def retrieval(search: Search) -> List[Document]:
    # `vectorstore` is the store from the query-analysis tutorial; swap in your own
    # vector store and adapt the filter syntax to whatever it expects
    if search.publish_year is not None:
        _filter = {"publish_year": {"$eq": search.publish_year}}
    else:
        _filter = None
    return vectorstore.similarity_search(search.query, filter=_filter)


retrieval_chain = query_analyzer | retrieval
results = retrieval_chain.invoke("RAG tutorial published in 2023")
```

This approach makes your retrieval process more robust and lets it handle specific query requirements, such as metadata constraints, effectively [2]. By combining these methods, you can manage retrieval efficiently and ensure that the most relevant chunks are used as context knowledge for your RAG implementation.
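If your chunks are indexed with metadata fields in Azure AI Search, you can combine the structured query above with the store's native filtering. The sketch below is not from the thread: it assumes a filterable `publish_year` field exists in your index and that your langchain-community version's `AzureSearch.similarity_search` accepts an OData filter string via the `filters` keyword argument.

```python
def azure_retrieval(search: Search) -> List[Document]:
    # Build an OData filter for Azure AI Search; "publish_year" is a
    # hypothetical filterable field used only for illustration
    filters = (
        f"publish_year eq {search.publish_year}"
        if search.publish_year is not None
        else None
    )
    return vector_store.similarity_search(search.query, k=3, filters=filters)


azure_retrieval_chain = query_analyzer | azure_retrieval
results = azure_retrieval_chain.invoke("RAG tutorial published in 2023")
```

The same pattern extends to other metadata fields, which is how multi-field constraints are typically expressed against an Azure AI Search index.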
Description
Hi All,
I have been working on integrating Azure with OpenAI, specifically focusing on using Azure Index as a large-scale vector database.
My Use Case:
Now, I have stored each chunk with its corresponding embedding in the index.
In the retrieval step, I use the `vector_store` retriever from LangChain. While it provides decent results for single-field queries, the results for multi-field queries are unsatisfactory. The large size of the documents also prevents me from directly uploading them with their embeddings, since that exceeds the context limit.
My Questions:
Thank you!
System Info
Windows
langchain=0.1.20
langchain-community=0.0.38
langchain-core=0.1.52
langchain-experimental=0.0.58
langchain-openai=0.1.6
langchain-text-splitters=0.0.2
azure-identity=1.17.1
azure-search-documents=11.4.0
openai