A sophisticated conversational AI microservice with memory management using FastAPI, LangGraph, Redis, and MongoDB.
- Short-Term Memory: Last 3-4 messages stored in Redis, backed up to MongoDB on logout
- Slider Summary: Generated every 4th message, summarizing older conversation turns
- Long-Term Memory: 5 key points extracted every 8 messages, persisted across conversations
- Chat History: Complete conversation history stored in both Redis (buffer) and MongoDB (persistent); see the key-layout sketch after this list
- Intelligent tool usage (db_search for memory, web_search for external info)
- Automatic memory management with LLM-generated summaries
- Session persistence across login/logout
- RESTful API with proper error handling
- Docker containerization for easy deployment
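The memory tiers above map naturally onto a small set of storage keys. The layout below is a hypothetical sketch; none of these names are taken from the service's code:

```python
# Hypothetical key layout for the memory tiers; names are assumptions
# for illustration, not the service's actual schema.

def short_term_key(user_id: str, conversation_id: str) -> str:
    return f"stm:{user_id}:{conversation_id}"      # Redis list, last 3-4 messages

def slider_summary_key(user_id: str, conversation_id: str) -> str:
    return f"summary:{user_id}:{conversation_id}"  # refreshed every 4th message

def long_term_key(user_id: str) -> str:
    return f"ltm:{user_id}"                        # 5 key points, shared across conversations

def history_key(user_id: str, conversation_id: str) -> str:
    return f"history:{user_id}:{conversation_id}"  # Redis buffer; MongoDB keeps the persistent copy
```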
- Docker and Docker Compose installed
- Gemini API key from Google AI Studio
- Tavily API key from Tavily
- Copy `.env.example` to `.env`:

  ```bash
  cp .env.example .env
  ```

- Edit `.env` and add your API keys:

  ```
  GEMINI_API_KEY=your_gemini_api_key_here
  TAVILY_API_KEY=your_tavily_api_key_here
  ```
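To confirm both keys are actually picked up at startup, a check along these lines works; this is a sketch, not code from the service itself:

```python
import os

# Fail fast if either key required in .env is missing.
for key in ("GEMINI_API_KEY", "TAVILY_API_KEY"):
    if not os.getenv(key):
        raise RuntimeError(f"{key} is not set; add it to .env")
```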
Project structure:

```
conversational-ai/
├── docker-compose.yml
├── Dockerfile
├── requirements.txt
├── .env.example
├── .env (create this)
├── test_service.py
└── app/
    ├── __init__.py (create empty file)
    ├── main.py
    ├── agent.py
    ├── memory_manager.py
    └── database_init.py
```
- Start all services:

  ```bash
  docker-compose up --build
  ```

- The service will be available at http://localhost:8000

- Check health:

  ```bash
  curl http://localhost:8000
  ```

Run the test script:

```bash
python test_service.py
```

API endpoints:

```
POST /chat
Content-Type: application/json

{
  "message": "Hello, how are you?",
  "user_id": "user123",
  "conversation_id": "conv456"
}
```

```
POST /logout
Content-Type: application/json

{
  "user_id": "user123",
  "conversation_id": "conv456"
}
```

```
POST /login
Content-Type: application/json

{
  "user_id": "user123",
  "conversation_id": "conv456"
}
```

```
GET /memory/{user_id}/{conversation_id}
```

```
GET /history/{user_id}?skip=0&limit=20
```
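To exercise /chat from Python instead of curl, a request like the following should work; the URL and payload come straight from the examples above, and the response is printed as whatever JSON the service returns:

```python
import requests

# Send one message through the /chat endpoint and print the reply.
resp = requests.post(
    "http://localhost:8000/chat",
    json={
        "message": "Hello, how are you?",
        "user_id": "user123",
        "conversation_id": "conv456",
    },
    timeout=30,
)
resp.raise_for_status()
print(resp.json())
```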
How it works:

- Message Processing:
  - The user sends a message via the /chat endpoint
  - The message is added to Redis and MongoDB
  - The LangGraph agent processes the message

- Intelligent Response:
  - The LLM decides whether tools are needed based on the query
  - Simple greetings get direct responses
  - Complex queries trigger the db_search or web_search tools

- Memory Management (sketched below):
  - Every 4th message: generate a slider summary
  - Every 8th message: extract 5 key points for long-term memory
  - On logout: save short-term memory to MongoDB, clear Redis
  - On login: restore memory from MongoDB to Redis
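A condensed sketch of that memory cadence, with plain dicts standing in for Redis and MongoDB; every name and function here is illustrative, not the service's actual implementation:

```python
# Illustrative stand-ins only: dicts replace Redis/MongoDB, and the two
# "LLM" functions are stubs where the service would call Gemini.
short_term: dict[str, list[str]] = {}   # per-conversation message buffer
counts: dict[str, int] = {}             # per-conversation message counter

def slider_summary(messages: list[str]) -> str:
    # Stub for the LLM-generated summary of older conversation turns.
    return f"(summary of {len(messages)} recent messages)"

def key_points(messages: list[str]) -> list[str]:
    # Stub for the 5 key points extracted into long-term memory.
    return messages[-5:]

def handle_message(conversation_id: str, message: str) -> None:
    buffer = short_term.setdefault(conversation_id, [])
    buffer.append(message)
    buffer[:] = buffer[-4:]              # short-term memory: last 3-4 messages
    counts[conversation_id] = counts.get(conversation_id, 0) + 1
    n = counts[conversation_id]
    if n % 4 == 0:                       # every 4th message: slider summary
        print(slider_summary(buffer))
    if n % 8 == 0:                       # every 8th message: long-term points
        print(key_points(buffer))

for i in range(1, 9):
    handle_message("conv456", f"message {i}")
```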
Troubleshooting:

- "I apologize, but I'm having trouble generating a response"
  - Check that the API keys are set correctly in `.env`
  - Verify Redis and MongoDB are running
  - Check the logs: `docker-compose logs conversational_ai`

- Connection Errors
  - Ensure all containers are running: `docker-compose ps`
  - Check that ports 8000, 6379, and 27017 are not already in use

- Memory Not Persisting
  - Verify MongoDB is running and accessible
  - Check that the indexes were created properly
  - Review the logs for any database errors
To run locally without Docker:

- Install dependencies:

  ```bash
  pip install -r requirements.txt
  ```

- Start Redis and MongoDB locally

- Run the service:

  ```bash
  cd app
  python -m uvicorn main:app --reload
  ```

For debugging:

- Redis: use the Redis CLI or a GUI tool to inspect keys (see the snippet below)
- MongoDB: use MongoDB Compass or the shell to view collections
- Logs: `docker-compose logs -f conversational_ai`
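On the Redis side, a short redis-py snippet can list what the service has stored; the wildcard pattern is deliberate, since the service's actual key names are not assumed here:

```python
import redis

# Connect to the Redis container exposed on the default port.
r = redis.Redis(host="localhost", port=6379, decode_responses=True)

# SCAN is safer than KEYS on a live instance; "*" matches everything
# because the service's real key naming scheme is not assumed here.
for key in r.scan_iter(match="*"):
    print(key, r.type(key))
```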
Production notes:

- Redis can be configured with persistence (AOF/RDB)
- MongoDB can be set up with replica sets for high availability
- The service is stateless and can be scaled horizontally
- Consider implementing rate limiting for production use; a minimal sketch follows
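One minimal way to add that, assuming nothing about the service's own code: a naive in-memory, per-client sliding-window limiter as FastAPI middleware. The window and request limit below are arbitrary placeholders:

```python
import time
from collections import defaultdict, deque

from fastapi import FastAPI, Request
from fastapi.responses import JSONResponse

app = FastAPI()

WINDOW_SECONDS = 60
MAX_REQUESTS = 30                       # per client IP per window (placeholder)
hits: dict[str, deque] = defaultdict(deque)

@app.middleware("http")
async def rate_limit(request: Request, call_next):
    ip = request.client.host if request.client else "unknown"
    now = time.monotonic()
    window = hits[ip]
    # Drop timestamps that have slid out of the window.
    while window and now - window[0] > WINDOW_SECONDS:
        window.popleft()
    if len(window) >= MAX_REQUESTS:
        return JSONResponse({"detail": "rate limit exceeded"}, status_code=429)
    window.append(now)
    return await call_next(request)
```

Note that in-memory counters only cover a single process; for the horizontally scaled setup described above, the counters would need to live in Redis instead.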