AI agent system that automates internal document search and deep research for enterprises
Deep Research Agents is a next-generation MultiAgent system built on Semantic Kernel. Through MagenticOrchestration, multiple specialized AI agents dynamically collaborate to automatically generate high-quality research reports from enterprise internal documents. From internal document search via Azure AI Search, Semantic Kernel Memory, to comprehensive reliability assessment, it intelligently automates the entire enterprise research process.
- π€ Magentic Multi-Agent Orchestration: Latest orchestration technology from Semantic Kernel
- π Advanced Internal Document Search: Azure AI Search + Semantic Kernel Memory integration
- π Web Search Integration: Enhanced research capabilities with external web search fallback
- π§ Contextual Memory Management: Persistent research context and knowledge integration via Semantic Kernel Memory
- π‘οΈ AI Reliability Assessment: Multi-layered Confidence evaluation and source quality management
- π Structured Report Generation: Evidence-based reports with citation management
- π Multilingual Intelligence: Professional terminology translation system
- β‘ Dynamic Quality Management: Real-time quality assessment and self-improvement loops
The Deep Research Agent features an intuitive web interface that provides real-time research capabilities with professional styling and comprehensive result formatting.
Interface Features:
- Clean, Professional Design: Modern gradient interface with glass-morphism effects
- Real-time Progress Tracking: Live logs and status updates during research execution
- Structured Results Display: Well-formatted research reports with proper headings and citations
- Tabbed Interface: Separate views for results and live execution logs
- Responsive Layout: Works seamlessly across desktop and mobile devices
- Interactive Research: Simple query input with instant research initiation
Deep Research Agents is a next-generation MultiAgent system centered on Microsoft Semantic Kernel and MagenticOrchestration. It fully automates internal document search, analysis, and report generation specialized for enterprise R&D.
βββββββββββββββββββββββββββββββββββ
β MagenticOrchestration β
β (StandardMagenticManager) β
β + R&D Logic Engine β
ββββββββββββ¬βββββββββββββββββββββββ
β Dynamic Coordination
βΌ
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
β Specialized Agent System β
β (Semantic Kernel Agents) β
β β
β βββββββββββββββββββ βββββββββββββββββββ βββββββββββββββββββ β
β β LeadResearcher β β Credibility β β Summarizer β β
β β Agent β β Critic Agent β β Agent β β
β β βββββββββββββββ β β (Source Quality β β (Knowledge β β
β β βRESEARCHER1 β β β Assessment) β β Synthesis) β β
β β βRESEARCHER2 β β β β β β β
β β βRESEARCHER3+ β β β β β β β
β β βββββββββββββββ β β β β β β
β βββββββββββββββββββ βββββββββββββββββββ βββββββββββββββββββ β
β β
β βββββββββββββββββββ βββββββββββββββββββ βββββββββββββββββββ β
β β ReportWriter β β Reflection β β Translator β β
β β Agent β β Critic Agent β β Agent β β
β β (Confidence + β β (Quality β β (Professional β β
β β Citations) β β Validation) β β Translation) β β
β βββββββββββββββββββ βββββββββββββββββββ βββββββββββββββββββ β
β β
β βββββββββββββββββββ β
β β Citation β β
β β Agent β β
β β (Reference β β
β β Management) β β
β βββββββββββββββββββ β
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
β
βΌ
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
β Plugin & Infrastructure Layer β
β β
β βββββββββββββββββββ βββββββββββββββββββ β
β β Memory β β ModularSearch β β
β β Plugin β β Plugin β β
β β (Research β β (Azure AI β β
β β Context) β β Search) β β
β βββββββββββββββββββ βββββββββββββββββββ β
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
ModularSearchPlugin provides comprehensive search capabilities:
An Azure AI Search integration system dynamically configured based on project settings (config/project_config.yaml). It provides unified interface search functionality across multiple indexes defined in configuration files.
The search system flexibly adapts to enterprise-specific document structures and index configurations, achieving high-precision information retrieval through a combination of Vector search, Semantic search, and Full-text search.
Enhanced Research Capabilities: The system now includes web search functionality as an additional information source and fallback mechanism:
- Primary Search: Internal document search via Azure AI Search
- Web Search Fallback: When internal sources are insufficient, the system automatically falls back to web search
- Configurable Integration: Web search can be enabled/disabled and configured via
project_config.yaml - Quality Assurance: Web search results undergo the same reliability assessment as internal documents
- Role: Manager and coordinator of multiple internal sub-ResearchAgents
- Architecture: Contains and orchestrates 3+ sub-ResearchAgents (RESEARCHER1, RESEARCHER2, RESEARCHER3...)
- Special Capability: Parallel orchestration and concurrent execution of multiple research queries
- Implementation: Internal agent management via
ConcurrentOrchestrationandParallelResearchPlugin - Functions:
- Distributes research queries across sub-ResearchAgents
- Aggregates and synthesizes results from multiple agents
- Quality management and result integration
- Dynamic agent scaling based on workload
- Memory: Context continuation through Semantic Kernel Memory integration shared across all sub-agents
- Role: Scientific evaluation of internal source reliability and coverage
- Evaluation Criteria: Source quality, information consistency, evidence strength
- Functions: Supplementation through additional searches, reliability score calculation
- Output: Structured reliability reports + improvement recommendations
- Role: Structured summarization of large volumes of internal documents
- Specialization: Classification by enterprise themes, prioritization
- Technology: Hierarchical summarization, keyword extraction, relevance analysis
- Output: Structured summaries + key point extraction
- Role: Final report creation and Confidence score assignment
- Technology: Structured document generation, citation management, evidence demonstration
- Evaluation: Multi-axis Confidence evaluation (source quality, consistency, comprehensiveness)
- Output: Decision support reports + reliability indicators
- Role: Validation of report quality and Confidence evaluation validity
- Technology: Meta-cognitive evaluation, logical consistency checks, improvement recommendations
- Standards: Compliance with enterprise R&D quality standards
- Output: Quality evaluation reports + improvement guidance
- Role: High-precision translation with specialized terminology support
- Specialization: Technical document format preservation, specialized terminology dictionary
- Functions: Bidirectional Japanese-English translation, context-aware translation
- Quality: Translation quality evaluation, terminology standardization
- Role: Internal document citation and reference management
- Technology: Automated citation generation, source traceability
- Verification: Citation accuracy, source existence confirmation
- Output: Structured citation lists + metadata
Before using the Deep Research Agents, ensure you have the following:
- Python 3.12+ (Recommended: 3.12.10 or later)
- Azure OpenAI account with access to:
- GPT-4.1, GPT-4.1-mini, o3 or equivalent models
- Text embedding models (text-embedding-3-small, text-embedding-3-large, etc.)
- Azure AI Search service with:
- Semantic search configuration enabled
- Vector search capabilities
- Existing search indexes with your enterprise documents
- Web Search API (optional, for web search functionality):
- Tavily API key
- Currently, this repo only supports Tavily, please implement search providers if you want to use other search engines
Follow these step-by-step instructions to set up the Deep Research Agents:
git clone <repository-url>
cd <directory-name># Create virtual environment
python -m venv deepresearchagent
# Activate virtual environment
.\deepresearchagent\Scripts\Activate.ps1
# Verify activation (should show the virtual environment path)
where pythonpip install -r requirements.txt4.1 Create Environment Variables File
# Copy template and create your .env file
Copy-Item .env.example .envPlease update based on your settings
4.2 Create Project Configuration File
# Copy template and create your project configuration
Copy-Item config\project_config_templates.yaml config\project_config.yaml
Please update project configuration with your specific settings
Required configurations in config/project_config.yaml:
- Company information (system.company)
- Azure AI Search index configurations (data_sources.document_types)
- Web search settings (data_sources.web_search)
- Agent behavior parameters (agents)
# Start the research agent
python main.py --query "Could you summarize the latest update on Azure OpenAI in 2025?"The system uses template files that you need to copy and customize:
Deep-Research-Agents/
βββ config/
β βββ project_config_templates.yaml # Template for project configuration
β βββ project_config.yaml # Your customized configuration (create this)
βββ .env.example # Template for environment variables
βββ .env # Your environment variables (create this)
Before deploying to Azure, ensure you have:
- Azure CLI installed and configured (
az --version) - Azure Developer CLI installed (
azd version) - Active Azure subscription with appropriate permissions
- Python 3.12 runtime support in Azure App Service
# Install Azure CLI (if not already installed)
# Download from: https://docs.microsoft.com/en-us/cli/azure/install-azure-cli
# Install Azure Developer CLI
# Download from: https://docs.microsoft.com/en-us/azure/developer/azure-developer-cli/install-azd
# Verify installations
az --version
azd version# Login to Azure CLI
az login
# Login to Azure Developer CLI
azd auth login
# Set your subscription (if you have multiple)
az account set --subscription "YOUR-SUBSCRIPTION-ID"Ensure your .env file contains all required Azure service endpoints:
AZURE_OPENAI_ENDPOINT=https://your-openai.openai.azure.com/
AZURE_OPENAI_API_KEY=your-api-key
AZURE_OPENAI_DEPLOYMENT_NAME=gpt-4o
AZURE_OPENAI_API_VERSION=2024-02-15-preview
TAVILY_API_KEY=your-tavily-key
AZURE_SEARCH_ENDPOINT=https://your-search.search.windows.net
AZURE_SEARCH_KEY=your-search-key# Initialize the Azure project (first time only)
azd init
# Follow the prompts to:
# - Select Python as the language
# - Choose App Service as the hosting option
# - Set environment name (e.g., 'production', 'dev')
# - Select Azure region (e.g., 'East US', 'West Europe')# Deploy infrastructure and application
azd up
# This command will:
# 1. Provision Azure resources (App Service, Resource Group)
# 2. Build the application
# 3. Deploy to Azure App Service
# 4. Configure environment variablesFor quick application-only deployments (recommended for code changes):
# Deploy only application code (fastest option)
azd deploy --no-prompt
# This command will:
# 1. Package your application code
# 2. Upload to Azure App Service
# 3. Restart the application
# 4. Preserve existing infrastructure and settingsFor full infrastructure and application deployment:
# Full deployment with infrastructure updates
azd up --no-prompt
# Use this when you've changed:
# - Azure resource configurations
# - Environment variables
# - Infrastructure settingsAfter deployment, test your application:
# Get the deployment URL
azd show
# Test the health endpoint
curl "https://[your-app-name].azurewebsites.net/health"
# Test the main interface
# Open https://[your-app-name].azurewebsites.net/ in your browserThe repository includes a GitHub Actions workflow (.github/workflows/azure-dev.yml) for automated deployments. This is optional but recommended for:
- Automatic deployments when code is pushed to main branch
- Team collaboration with consistent deployment process
- Continuous integration with automated testing
If you want automated deployments, configure these GitHub repository secrets:
-
Repository Variables (Settings β Secrets and variables β Actions β Variables):
AZURE_CLIENT_ID: [your-service-principal-client-id] AZURE_TENANT_ID: [your-azure-tenant-id] AZURE_SUBSCRIPTION_ID: [your-azure-subscription-id] AZURE_ENV_NAME: [your-azd-environment-name] AZURE_LOCATION: [your-azure-region] -
Repository Secrets (Settings β Secrets and variables β Actions β Secrets):
AZURE_OPENAI_API_KEY: [your-azure-openai-key] AZURE_SEARCH_API_KEY: [your-azure-search-key] TAVILY_API_KEY: [your-tavily-api-key]
Manual Deployment (Current Method):
azd deploy --no-prompt # Fast, immediate controlAutomated Deployment (GitHub Actions):
- Triggers automatically on git push to main branch
- Includes automated health checks
- Consistent deployment process
- Requires initial setup of secrets/variables
Recommendation: Start with manual deployment, add GitHub Actions later for team environments.
If you prefer manual deployment only, you can remove the workflow:
# Remove the GitHub Actions workflow
Remove-Item -Path ".github\workflows\azure-dev.yml" -Force
# Or keep it for future use (recommended)
# The workflow won't run unless you configure the required secrets1. Deployment Conflicts (409 Error)
# Stop all conflicting processes
Get-Process | Where-Object {$_.ProcessName -like "*azd*" -or $_.ProcessName -like "*azure*"} | Stop-Process -Force -ErrorAction SilentlyContinue
# Restart the web app
az webapp restart --name [YOUR-APP-NAME] --resource-group [YOUR-RG-NAME]
# Retry deployment
azd deploy --no-prompt2. Python Runtime Issues
- Ensure
startup.txtcontains:gunicorn app:app -w 1 -k uvicorn.workers.UvicornWorker --bind=0.0.0.0:$PORT --timeout 300 - Verify Python 3.12 is specified in Azure configuration
- Check that all dependencies are in
requirements.txt
3. Environment Variable Issues
# Set environment variables manually if needed
az webapp config appsettings set --resource-group [RG-NAME] --name [APP-NAME] --settings AZURE_OPENAI_ENDPOINT="https://your-endpoint.openai.azure.com/"
# Check current settings
az webapp config appsettings list --resource-group [RG-NAME] --name [APP-NAME]4. Application Logs
# Stream live application logs
az webapp log tail --name [APP-NAME] --resource-group [RG-NAME]
# Download log files
az webapp log download --resource-group [RG-NAME] --name [APP-NAME]- All environment variables configured
- Azure OpenAI endpoints accessible
- Azure AI Search indexes populated
- Web search API keys valid (if using web search)
- Application successfully starts
- Health check endpoint responds
- Research functionality tested end-to-end
After successful deployment, your application will be available at:
https://[your-app-name].azurewebsites.net/
The deployment process will display the URL in the terminal output.
