Optimize LLM responses using search algorithms - A production-ready FastAPI backend and MCP server that leverages Monte Carlo Tree Search (MCTS) to intelligently explore and evaluate conversation paths for enhanced response quality.
- Quick Start (Docker Compose)
- Overview
- Architecture
- Service Modes & Usage
- Claude Desktop Integration
- Manual Setup
- Development
- Testing
- Contributing
- Roadmap
- License
Recommended Installation - Automatically provisions Redis, PostgreSQL/PGVector, and Prometheus:
- Docker ≥ 24.0
- Docker Compose v2
git clone https://github.com/yourusername/ConversationalAnalysisEngine
cd ConversationalAnalysisEngine
docker compose up --build
- API Backend: http://localhost:8000 (health checks, conditional metrics)
- MCP Server: http://localhost:8001/mcp/v1 (conversation analysis)
- Redis Cache: localhost:6379
- PostgreSQL/PGVector: localhost:5432
docker compose --profile monitoring up --build
- Prometheus Metrics: http://localhost:9090
- Grafana Dashboard: http://localhost:3000 (admin/admin)
docker compose down -v
The Conversational Analysis Engine (CAE) enhances LLM response optimization by applying advanced search algorithms to conversation paths. Instead of generating single responses, CAE uses Monte Carlo Tree Search (MCTS) to:
- Generate multiple response branches for any conversation context
- Simulate conversation continuations to predict outcomes
- Score paths based on goal-specific metrics (emotional intelligence, persuasiveness, helpfulness)
- Select the optimal response through intelligent exploration
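To make the loop above concrete, here is a minimal, illustrative sketch of a UCB1-driven MCTS pass. It is not the code in app/services/mcts/; Node, expand, simulate, and score are placeholders standing in for the response generator, conversation simulator, and scorer described below.
import math
import random

class Node:
    def __init__(self, response, parent=None):
        self.response = response
        self.parent = parent
        self.children = []
        self.visits = 0
        self.value = 0.0

def ucb1(node, exploration_constant=1.4):
    # Unvisited nodes are selected first; otherwise balance average reward
    # (exploitation) against the visit-count bonus (exploration).
    if node.visits == 0:
        return float("inf")
    return node.value / node.visits + exploration_constant * math.sqrt(
        math.log(node.parent.visits) / node.visits
    )

def mcts(root, iterations, expand, simulate, score):
    for _ in range(iterations):
        node = root
        while node.children:                           # 1. selection via UCB1
            node = max(node.children, key=ucb1)
        for branch in expand(node.response):           # 2. expansion
            node.children.append(Node(branch, parent=node))
        leaf = random.choice(node.children) if node.children else node
        reward = score(simulate(leaf.response))        # 3. simulation + scoring
        while leaf is not None:                        # 4. backpropagation
            leaf.visits += 1
            leaf.value += reward
            leaf = leaf.parent
    return max(root.children, key=lambda n: n.visits)  # most-visited branch wins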
Dual Architecture:
- FastAPI Backend: Production-ready API with health checks, metrics, and monitoring
- MCP Server: Monte Carlo Tree Search optimization via Model Context Protocol
[Claude Desktop/Code] → [MCP Server :8001] → [MCTS Algorithm] → [Redis Cache]
[MCP Clients]                  ↓                     ↓
                       [Response Generator]
[Health Checks] → [API Backend :8000] → metrics → [Prometheus :9090]
[Monitoring]             ↓
             [PostgreSQL/PGVector :5432]
Service Architecture:
- MCP Server (Port 8001): Conversation analysis via MCTS algorithm
- API Backend (Port 8000): Health checks, conditional metrics endpoints
- Redis (Port 6379): Required - Conversation storage and caching
- PostgreSQL/PGVector (Port 5432): Required - Conversation storage
- Prometheus (Port 9090): Optional - Metrics collection (monitoring profile)
- Grafana (Port 3000): Optional - Metrics dashboard (monitoring profile)
- MCTS Algorithm (app/services/mcts/): Monte Carlo Tree Search implementation with UCB1 exploration
- Response Generator: Creates diverse response branches using LLM variations
- Conversation Simulator: Predicts user reactions and conversation continuations
- Conversation Scorer: Evaluates path quality based on customizable metrics
- Semantic Cache: Redis-based caching with embedding similarity for performance optimization
- Metrics Collection: Prometheus metrics for production monitoring
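As an illustration of the Semantic Cache idea, the sketch below shows an embedding-similarity lookup against Redis. It is not the project's implementation; the key prefix, JSON layout, and similarity threshold are assumptions made for the example.
import json
import numpy as np
import redis

r = redis.Redis(host="localhost", port=6379, decode_responses=True)

def cosine_similarity(a, b):
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def semantic_lookup(query_embedding, threshold=0.92):
    # Return a cached analysis whose stored embedding is close enough to the
    # query embedding, or None on a cache miss.
    for key in r.scan_iter("cae:cache:*"):
        entry = json.loads(r.get(key))
        if cosine_similarity(query_embedding, entry["embedding"]) >= threshold:
            return entry["result"]
    return None

def semantic_store(key_suffix, embedding, result):
    # Store the embedding alongside the cached result for later comparison.
    r.set(f"cae:cache:{key_suffix}", json.dumps({"embedding": list(embedding), "result": result}))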
Docker (Recommended):
docker compose up mcp
Local Development:
poetry run python servers/mcp/mcts_analysis_server.py --transport http --port 8001
Features:
- ✅ MCTS-Powered Conversation Analysis: Multi-branch exploration with intelligent search
- ✅ Goal-Oriented Optimization: Customize for empathy, persuasion, problem-solving
- ✅ Configurable Parameters: Branch count, simulation depth, exploration constants
- ✅ Real-time Processing: Efficient async processing with resource management
Docker:
docker compose up api
Local Development:
poetry run python -m app.main
Features:
- ✅ Health checks at GET /health
- ✅ Conditional Prometheus metrics at GET /metrics (when enabled)
- ✅ Service monitoring and logging
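For a quick smoke test of a running backend, the endpoints listed above can be exercised directly:
# Check service health
curl http://localhost:8000/health
# Scrape Prometheus metrics (only available when metrics are enabled)
curl http://localhost:8000/metrics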
WARNING: The POST /api/v1/analyze endpoint is deprecated and returns HTTP 410.
Migration Path: Use the MCP server for all conversation analysis:
# ❌ Deprecated - DO NOT USE
response = httpx.post("http://localhost:8000/api/v1/analyze", ...)
# ✅ Use the MCP server instead (e.g., via the FastMCP client)
from fastmcp import Client

async with Client("http://localhost:8001/mcp/v1") as client:
    result = await client.call_tool("analyze_conversation", ...)
The server exposes the analyze_conversation tool with the following signature:
// TypeScript/JavaScript MCP Client Example (connecting over the HTTP transport)
import { Client } from '@modelcontextprotocol/sdk/client/index.js';
import { StreamableHTTPClientTransport } from '@modelcontextprotocol/sdk/client/streamableHttp.js';

const client = new Client({
  name: "cae-client",
  version: "1.0.0"
});
await client.connect(new StreamableHTTPClientTransport(new URL("http://localhost:8001/mcp/v1")));

const result = await client.callTool({
  name: "analyze_conversation",
  arguments: {
    conversation_goal: "help user feel better about their situation",
    messages: [
      {role: "user", content: "I failed my exam and feel terrible"},
      {role: "assistant", content: "I'm sorry to hear about your exam."}
    ],
    num_branches: 3,
    simulation_depth: 2,
    mcts_iterations: 10
  }
});

console.log("Optimized response:", result.selected_response);
console.log("Analysis:", result.analysis);
To use CAE with Claude Desktop, add the MCP server to your configuration:
- Open the Claude Desktop configuration file:
  - macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
  - Windows: %APPDATA%/Claude/claude_desktop_config.json
- Add the CAE MCP server configuration:
{
"mcpServers": {
"conversational-analysis-engine": {
"command": "docker",
"args": [
"compose", "-f", "/path/to/ConversationalAnalysisEngine/docker-compose.yml",
"up", "mcp", "--build"
],
"env": {
"LLM_API_KEY": "your_openai_api_key"
}
}
}
}
- Restart Claude Desktop to load the MCP server
- Use in conversations:
I need to respond to a difficult customer complaint. Can you use the MCTS analysis to help me find the best response?
Goal: Maintain customer relationship while addressing concerns
Current conversation: [customer complaint details]
For Claude Code users, configure the MCP server in your settings:
{
"mcp": {
"servers": {
"cae": {
"command": "docker",
"args": [
"compose", "-f", "/path/to/ConversationalAnalysisEngine/docker-compose.yml",
"up", "mcp", "--build"
],
"env": {
"LLM_API_KEY": "your_openai_api_key"
}
}
}
}
}
For advanced users who prefer manual installation:
- Python 3.12+
- Poetry (package manager)
- Redis (required for caching)
- PostgreSQL with PGVector (required for conversation storage)
# Clone the repository
git clone https://github.com/yourusername/ConversationalAnalysisEngine
cd ConversationalAnalysisEngine
# Install dependencies with Poetry
poetry install
# Set up environment variables
cp .env.example .env
# Edit .env with your configuration
Minimal Setup - Only one environment variable required:
# REQUIRED: LLM Configuration
LLM_API_KEY=your_openai_api_key
Full Setup - All optional configuration with smart defaults:
# REQUIRED: LLM Configuration
LLM_API_KEY=your_openai_api_key
# LLM Configuration (optional - smart defaults)
LLM_API_BASE_URL=https://api.openai.com/v1 # Default
LLM_MODEL_NAME=o3-mini # Default
# OPTIONAL: Embedding Configuration (enables semantic caching when present)
EMBEDDING_MODEL_API_KEY=your_openai_api_key # Optional
EMBEDDING_MODEL_BASE_URL=https://api.openai.com/v1 # Default
EMBEDDING_MODEL_NAME=text-embedding-3-large # Default
# Feature Toggles (optional)
DISABLE_PROMETHEUS_METRICS=false # Default: metrics enabled
# Database Configuration (Docker Compose defaults)
DB_HOST=postgres # Default for Docker
DB_PORT=5432 # Default
DB_NAME=conversation_analysis # Default
DB_USER=cae_user # Default
DB_SECRET=cae_password # Default
# Redis Configuration (Docker Compose defaults)
REDIS_HOST=redis # Default for Docker
REDIS_PORT=6379 # Default
# Application Settings (optional)
LOG_LEVEL=INFO # Default
LLM_TIMEOUT_SECONDS=600 # Default
Alternative Providers (e.g., OpenRouter, Groq):
# OpenRouter Example
LLM_API_KEY=your_openrouter_api_key
LLM_API_BASE_URL=https://openrouter.ai/api/v1
LLM_MODEL_NAME=anthropic/claude-3-sonnet
# Groq Example
LLM_API_KEY=your_groq_api_key
LLM_API_BASE_URL=https://api.groq.com/openai/v1
LLM_MODEL_NAME=llama-3.1-8b-instant
# For semantic caching with different embedding provider
EMBEDDING_MODEL_API_KEY=your_embedding_provider_key
EMBEDDING_MODEL_BASE_URL=https://api.your-provider.com/v1
EMBEDDING_MODEL_NAME=your-embedding-model
# Start only infrastructure with Docker Compose
docker compose -f compose.infrastructure.yml up
# Or start services manually:
redis-server
# Configure PostgreSQL with PGVector extension
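# For example (a sketch using the Docker Compose defaults from the environment
# section above; adjust database name and credentials to your setup):
createdb conversation_analysis
psql -d conversation_analysis -c "CREATE EXTENSION IF NOT EXISTS vector;"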
Customize search behavior through configuration:
# High-quality, slower analysis
config = {
"num_branches": 8, # More initial branches
"mcts_iterations": 20, # More iterations
"simulation_depth": 4, # Deeper simulations
"exploration_constant": 1.0 # Balanced exploration
}
# Fast, real-time analysis
config = {
"num_branches": 3,
"mcts_iterations": 5,
"simulation_depth": 2,
"exploration_constant": 2.0 # More exploration
}
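These values map onto the analyze_conversation tool arguments, so a preset can be forwarded through the MCP client. The sketch below reuses the earlier example conversation; whether exploration_constant is accepted as a tool argument is an assumption based on these configuration keys, not a documented part of the tool signature.
import asyncio
from fastmcp import Client

async def analyze_with_preset(config: dict):
    # Forward the tuning preset alongside the conversation payload.
    async with Client("http://localhost:8001/mcp/v1") as client:
        return await client.call_tool(
            "analyze_conversation",
            {
                "conversation_goal": "help user feel better about their situation",
                "messages": [{"role": "user", "content": "I failed my exam and feel terrible"}],
                **config,  # e.g. the fast, real-time preset above
            },
        )

result = asyncio.run(analyze_with_preset(config))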
This project maintains high code quality standards:
# Linting and formatting
poetry run ruff format .
poetry run ruff check .
# Type checking
poetry run mypy app/
# Run all quality checks
make quality-check
# Start API backend with hot reload
poetry run uvicorn app.main:app --reload --port 8000
# Start MCP server with debug logging
poetry run python servers/mcp/mcts_analysis_server.py --log-level DEBUG
# Start both with Docker Compose
docker compose up --build
Comprehensive test suite with 95%+ coverage requirement:
# Run all tests
poetry run pytest
# With coverage report
poetry run pytest --cov=app --cov-report=html --cov-report=term
# Run specific test categories
poetry run pytest tests/unit/ # Unit tests
poetry run pytest tests/integration/ # Integration tests
poetry run pytest tests/e2e/ # End-to-end tests
# Performance tests
poetry run pytest tests/performance/ -v
Test Structure:
- Unit Tests: Individual component testing with mocks
- Integration Tests: Service interaction testing
- E2E Tests: Full workflow testing via API/MCP
- Performance Tests: Load and latency testing
Open-source contributions are welcome! Please follow these guidelines:
- Fork the repository on GitHub
- Clone your fork:
git clone https://github.com/yourusername/ConversationalAnalysisEngine
- Install dependencies:
poetry install
- Create a feature branch using the naming convention:
# Branch naming format: feature/<feature-abbrev>-<issue-num>-<tag-line>
git checkout -b feature/CRITICAL-1-CORS-fix
git checkout -b feature/PERF-23-redis-optimization
git checkout -b feature/MCTS-45-branching-strategy
- Linter: Use ruff for code formatting and linting
- Test Coverage: Maintain ≥95% test coverage for all new code
- Type Hints: All functions must have proper type annotations
- Documentation: Update docstrings and README for new features
- Status Checks: Ensure all CI checks pass (tests, linting, coverage)
- PR Approval: At least one approval required from maintainer
- Branch Protection: Feature branches must be up-to-date with main
- Documentation: Update relevant documentation for new features
# Good commit messages
git commit -m "feat: add semantic caching for MCTS nodes"
git commit -m "fix: handle timeout errors in conversation simulation"
git commit -m "docs: update API examples in README"
- 🚧 Unified LLM Evaluation: Combine multiple LLM calls into single requests (66% cost reduction)
- 🚧 Advanced Semantic Caching: Embedding-based cache with similarity detection
- 🚧 Resource-Aware Scheduling: Dynamic resource allocation and request prioritization
- 🔮 Domain-Agnostic MCTS: Generalized search framework for various optimization tasks beyond conversation
- 🔮 Alternative Search Algorithms: Beam search, A* with heuristics, hybrid approaches
- 🔮 Multi-Objective Optimization: Simultaneous optimization for multiple conversation goals
- 🔮 Reinforcement Learning Integration: Learning-based path selection improvements
- 🔮 Distributed Processing: Horizontal scaling with work queues
- 🔮 Model Cascading: Use smaller models for simulation, larger for final generation
- 🔮 Advanced Analytics: Conversation pattern analysis and success prediction
This project is licensed under the MIT License.
- FastMCP for the excellent MCP server framework
- OpenAI for the OpenAI API specification
- Anthropic for MCP specification and Claude integration
- Documentation: Detailed guides in /docs
- Issues: Report bugs and feature requests via GitHub Issues
Built by Manav Pandey for the AI community - enabling smarter conversations through algorithmic optimization.