This template helps you build Retrieval-Augmented Generation (RAG) AI applications using Azure OpenAI and Azure AI Search. Follow the steps below to understand RAG, set up Azure, and run the code or API.
RAG combines information retrieval and AI generation. When a user asks a question, the system:
- Retrieves relevant documents from a large data source (like PDFs or images in Azure Blob Storage) using Azure Cognitive Search and vector embeddings.
- Uses Azure OpenAI (e.g., GPT-4) to generate a response based on both the user’s question and the retrieved documents.
This approach gives more accurate, context-aware answers by grounding AI responses in your own data.
How it works:
- User / Application: The user sends a message (text or image) to the application.
- OpenAI gpt-4o-mini: The message is processed by the GPT-4o-mini model, which can use context from the vector data (retrieved documents) via AI Search.
- AI Search with Vector Index: The system searches for relevant content using a vector index built from your documents. This index is created using embeddings from the OpenAI text-embedding-ada-002 model.
- OpenAI text-embedding-ada-002: This model generates vector embeddings for your documents, making them searchable by semantic meaning.
- Blob Storage: All your source documents (PDFs, images, Excel files, etc.) are stored here. The embedding model processes these files to create the vector index.
The flow: User → GPT-4o-mini → (AI Search) → Vector Index (built from Blob Storage via embeddings) → Response to User
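To make the retrieval step concrete: vector search ranks documents by how close their embeddings are to the query embedding, typically using cosine similarity. Below is a toy sketch of that ranking idea (made-up 3-dimensional vectors, not the actual AI Search internals; real `text-embedding-ada-002` vectors have 1536 dimensions):

```csharp
using System;

class CosineDemo
{
    // Cosine similarity: measures how semantically close two embedding vectors are.
    public static double Cosine(double[] a, double[] b)
    {
        double dot = 0, na = 0, nb = 0;
        for (int i = 0; i < a.Length; i++)
        {
            dot += a[i] * b[i];
            na += a[i] * a[i];
            nb += b[i] * b[i];
        }
        return dot / (Math.Sqrt(na) * Math.Sqrt(nb));
    }

    static void Main()
    {
        // Toy "embeddings" for a query and two documents.
        double[] query = { 0.9, 0.1, 0.0 };
        double[] docA  = { 0.8, 0.2, 0.1 };  // semantically close to the query
        double[] docB  = { 0.0, 0.1, 0.9 };  // unrelated content

        Console.WriteLine($"docA: {Cosine(query, docA):F3}");
        Console.WriteLine($"docB: {Cosine(query, docB):F3}");
        // docA scores higher, so a vector index would return it first.
    }
}
```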
You’ll need:
- An Azure subscription
- Azure OpenAI resource (with GPT-4o-mini and text-embedding-ada-002 models)
- Azure Blob Storage (for your documents)
- Azure Cognitive Search (for vector search)
Step-by-step setup:
1. Create Azure OpenAI resource
   - In the Azure portal, create an OpenAI resource in the East US region.
   - Deploy the `gpt-4o-mini` model (supports text and image input).
   - Deploy the `text-embedding-ada-002` model for vector embeddings.
2. Create Storage Account
   - Create a Storage Account (Standard, LRS) in the same region.
   - In the storage account, create a container and upload your data (e.g., brochures.zip from this link).
3. Create Azure Cognitive Search resource
   - Create a Cognitive Search resource (Free tier is fine) in the same region.
   - In the Cognitive Search portal, use “Import and vectorize data” to connect to your Blob Storage and set up a vector index using the `text-embedding-ada-002` model.
4. Configure keys and endpoints
   - In the Azure portal, copy the following values:
     - OpenAI: `KEY1`, `Endpoint`, and deployment name
     - Cognitive Search: `Url`, `Primary admin key`, and index name
   - Paste these into the `appsettings.json` file in both `api/` and `simple-code/` folders: `AzureOAIKey`, `AzureOAIEndpoint`, `AzureOAIDeploymentName`, `AzureSearchEndpoint`, `AzureSearchKey`, `AzureSearchIndex`.
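For reference, a filled-in `appsettings.json` might look like the sketch below. The placeholder values are illustrative only; substitute your own keys, endpoints, and index name:

```json
{
  "AzureOAIKey": "<your-KEY1>",
  "AzureOAIEndpoint": "https://<your-openai-resource>.openai.azure.com/",
  "AzureOAIDeploymentName": "gpt-4o-mini",
  "AzureSearchEndpoint": "https://<your-search-resource>.search.windows.net",
  "AzureSearchKey": "<your-primary-admin-key>",
  "AzureSearchIndex": "<your-index-name>"
}
```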
The `simple-code/` folder contains a minimal example (`RAGAI.cs`) showing how to use Azure OpenAI and Cognitive Search together.
Steps:
- Open a terminal and navigate to the folder: `cd simple-code`
- Restore dependencies: `dotnet restore`
- Build the project: `dotnet build`
- Run the example: `dotnet run`
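The example itself is not reproduced here, but the core idea of combining Azure OpenAI with the search index can be sketched as building a “chat with your data” request body for the Azure OpenAI REST API. The `data_sources`/`azure_search` field names below follow Microsoft's “On Your Data” feature for recent API versions and are an assumption; `RAGAI.cs` may use an SDK instead:

```csharp
using System;
using System.Text.Json;

class RagRequestSketch
{
    // Builds a chat request body that tells Azure OpenAI to ground its answer
    // in an Azure AI Search index (the "On Your Data" request schema; field
    // names are an assumption and may differ across API versions).
    public static string BuildBody(string question, string searchEndpoint,
                                   string searchKey, string searchIndex)
    {
        var body = new
        {
            messages = new[] { new { role = "user", content = question } },
            data_sources = new object[]
            {
                new
                {
                    type = "azure_search",
                    parameters = new
                    {
                        endpoint = searchEndpoint,
                        index_name = searchIndex,
                        authentication = new { type = "api_key", key = searchKey }
                    }
                }
            }
        };
        return JsonSerializer.Serialize(body);
    }

    static void Main()
    {
        string json = BuildBody("What tours do the brochures describe?",
            "https://<your-search-resource>.search.windows.net",
            "<your-primary-admin-key>", "<your-index-name>");
        Console.WriteLine(json);
        // POST this body to:
        //   {AzureOAIEndpoint}/openai/deployments/{AzureOAIDeploymentName}/chat/completions?api-version=...
        // with an "api-key: {AzureOAIKey}" header.
    }
}
```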
The `api/` folder contains a REST API with endpoints for chat and RAG operations.
Steps:
- Navigate to the API folder: `cd api`
- Restore dependencies: `dotnet restore`
- Build the API: `dotnet build`
- Run the API: `dotnet run`
- Access the API documentation via the Swagger UI once the app is running.
Main Endpoints:
- `POST /api/v1/AzureOpenAI/chat`: Text-based RAG chat
- `POST /api/v1/AzureOpenAI/chat-with-image`: Image-based RAG chat
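Once the API is running, a call to the text endpoint might look like the following. The port and request-body schema here are assumptions; check the Swagger UI for the actual contract:

```shell
# Hypothetical request to the text-based RAG endpoint.
# Port number and JSON field names are assumptions; confirm them in Swagger.
curl -X POST "http://localhost:5000/api/v1/AzureOpenAI/chat" \
  -H "Content-Type: application/json" \
  -d '{"message": "What does the brochure say about guided tours?"}'
```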
Unit and integration tests are in the `api.Tests/` folder.
Steps:
- Navigate to the test folder: `cd api.Tests`
- Run the tests: `dotnet test`
- SOLID architecture for maintainability
- Rate limiting and validation
- In-memory mode for local development
- Swagger UI for API exploration
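As a sketch of how the rate-limiting feature might be wired up in an ASP.NET Core app like this one (the policy name and limits below are illustrative assumptions, not the project's actual configuration):

```csharp
using System;
using Microsoft.AspNetCore.Builder;
using Microsoft.AspNetCore.RateLimiting;
using Microsoft.Extensions.DependencyInjection;

var builder = WebApplication.CreateBuilder(args);

// Fixed-window limiter: at most 10 requests per 60-second window.
// Policy name and limits are illustrative, not the project's real settings.
builder.Services.AddRateLimiter(options =>
    options.AddFixedWindowLimiter("fixed", o =>
    {
        o.PermitLimit = 10;
        o.Window = TimeSpan.FromSeconds(60);
    }));

var app = builder.Build();
app.UseRateLimiter();
app.MapGet("/ping", () => "pong").RequireRateLimiting("fixed");
app.Run();
```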
For more details, see the code comments and the architecture diagram in `image/azure-rag-ai-diagram.png`.