A Model Context Protocol (MCP) server for logging and retrieving memories from LLM conversations with intelligent context window caching capabilities.
- Save Memories: Store memories from LLM conversations with timestamps and LLM identification
- Retrieve Memories: Get all stored memories with detailed metadata
- Add Memories: Append new memories without overwriting existing ones
- Clear Memories: Remove all stored memories
- Context Window Caching: Archive, retrieve, and summarize conversation context
- Relevance Scoring: Automatically score how relevant archived content is to the current context
- Tag-based Search: Categorize and search context by tags
- Conversation Orchestration: External system to manage context window caching
- MongoDB Storage: Persistent storage using MongoDB database
- Install dependencies:
npm install
- Build the project:
npm run build
Set the MongoDB connection string via environment variable:
export MONGODB_URI="mongodb://localhost:27017"
If MONGODB_URI is not set, the server defaults to mongodb://localhost:27017
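A minimal sketch of how this fallback might be resolved, assuming the server reads `MONGODB_URI` from the environment. The helper name `resolveMongoUri` is hypothetical and is not taken from the project source.

```typescript
// Hypothetical helper: not part of the project's published API.
const DEFAULT_MONGODB_URI = "mongodb://localhost:27017";

// Returns the configured URI, or the documented default when the
// variable is unset or blank.
function resolveMongoUri(env: Record<string, string | undefined>): string {
  const uri = env["MONGODB_URI"]?.trim();
  return uri ? uri : DEFAULT_MONGODB_URI;
}
```

In a Node.js process this would typically be called as `resolveMongoUri(process.env)`.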
Start the MCP server:
npm start
Try the interactive CLI demo:
npm run cli
The CLI demo allows you to:
- Add messages to simulate conversation
- See automatic archiving when context gets full
- Trigger manual archiving and retrieval
- Create summaries of archived content
- Monitor conversation status and get recommendations
- save-memories: Save all memories to the database, overwriting existing ones
  - memories: Array of memory strings to save
  - llm: Name of the LLM (e.g., 'chatgpt', 'claude')
  - userId: Optional user identifier
- get-memories: Retrieve all memories from the database
  - No parameters required
- add-memories: Add new memories to the database without overwriting existing ones
  - memories: Array of memory strings to add
  - llm: Name of the LLM (e.g., 'chatgpt', 'claude')
  - userId: Optional user identifier
- clear-memories: Clear all memories from the database
  - No parameters required
- archive-context: Archive context messages for a conversation with tags and metadata
  - conversationId: Unique identifier for the conversation
  - contextMessages: Array of context messages to archive
  - tags: Tags for categorizing the archived content
  - llm: Name of the LLM (e.g., 'chatgpt', 'claude')
  - userId: Optional user identifier
- retrieve-context: Retrieve relevant archived context for a conversation
  - conversationId: Unique identifier for the conversation
  - tags: Optional tags to filter by
  - minRelevanceScore: Minimum relevance score (0-1, default: 0.1)
  - limit: Maximum number of items to return (default: 10)
- score-relevance: Score the relevance of archived context against current conversation context
  - conversationId: Unique identifier for the conversation
  - currentContext: Current conversation context to compare against
  - llm: Name of the LLM (e.g., 'chatgpt', 'claude')
- create-summary: Create a summary of context items and link them to the summary
  - conversationId: Unique identifier for the conversation
  - contextItems: Context items to summarize
  - summaryText: Human-provided summary text
  - llm: Name of the LLM (e.g., 'chatgpt', 'claude')
  - userId: Optional user identifier
- get-conversation-summaries: Get all summaries for a specific conversation
  - conversationId: Unique identifier for the conversation
- search-context-by-tags: Search archived context and summaries by tags
  - tags: Tags to search for
Save all memories (overwrites existing):
User: "Save all my memories from this conversation to the MCP server" LLM: [Uses save-memories tool with current conversation memories]
-
Retrieve all memories:
User: "Get all my memories from the MCP server" LLM: [Uses get-memories tool to retrieve stored memories]
-
Archive context when window gets full:
User: "The conversation is getting long, archive the early parts" LLM: [Uses archive-context tool to store old messages with tags]
-
Score relevance of archived content:
User: "How relevant is the archived content to our current discussion?" LLM: [Uses score-relevance tool to evaluate archived content]
-
Retrieve relevant archived context:
User: "Bring back the relevant archived information" LLM: [Uses retrieve-context tool to get relevant archived content]
-
Create summaries for long conversations:
User: "Summarize the early parts of our conversation" LLM: [Uses create-summary tool to condense archived content]
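Behind each of these exchanges, an MCP client sends a JSON-RPC 2.0 `tools/call` request naming the tool and its arguments. The sketch below shows roughly what such a request could look like for archive-context. The helper `buildToolCall` is hypothetical (MCP client SDKs normally build the envelope for you), and the argument values are invented for the example.

```typescript
// JSON-RPC 2.0 envelope used by MCP for tool invocation.
type ToolCallRequest = {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
};

// Hypothetical helper for illustration only.
function buildToolCall(
  id: number,
  name: string,
  args: Record<string, unknown>,
): ToolCallRequest {
  return { jsonrpc: "2.0", id, method: "tools/call", params: { name, arguments: args } };
}

// Example values are invented for this sketch.
const archiveRequest = buildToolCall(1, "archive-context", {
  conversationId: "conversation-123",
  contextMessages: ["First early message", "Second early message"],
  tags: ["planning", "early-discussion"],
  llm: "claude",
});
```

Serializing `archiveRequest` with `JSON.stringify` yields the message a client would place on the transport.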
The ConversationOrchestrator class provides automatic context window management:
- Automatic Archiving: Archives content when context usage reaches 80%
- Intelligent Retrieval: Retrieves relevant content when usage drops below 30%
- Relevance Scoring: Uses keyword overlap to score archived content relevance
- Smart Tagging: Automatically generates tags based on content keywords
- Conversation State Management: Tracks active conversations and their context
- Recommendations: Provides suggestions for optimal context management
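The two thresholds above can be read as a simple decision rule on context usage. This is a sketch of that rule only, assuming usage is measured as current words divided by the configured limit; `decideAction` and `ContextAction` are hypothetical names, and the real orchestrator may apply further checks (for example, whether any archived content exists to retrieve).

```typescript
type ContextAction = "archive" | "retrieve" | "none";

// Thresholds taken from the feature list; the function itself is a sketch.
const ARCHIVE_THRESHOLD = 0.8;
const RETRIEVE_THRESHOLD = 0.3;

function decideAction(currentWords: number, maxWords: number): ContextAction {
  if (maxWords <= 0) throw new RangeError("maxWords must be positive");
  const usage = currentWords / maxWords;
  if (usage >= ARCHIVE_THRESHOLD) return "archive"; // context is nearly full
  if (usage < RETRIEVE_THRESHOLD) return "retrieve"; // room to bring content back
  return "none";
}
```

With an 8000-word limit, for instance, 7000 words in context would trigger archiving and 2000 words would allow retrieval.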
import { ConversationOrchestrator } from "./orchestrator.js";

const orchestrator = new ConversationOrchestrator(8000); // 8k word limit

// Add a message (triggers automatic archiving/retrieval)
const result = await orchestrator.addMessage(
  "conversation-123",
  "This is a new message in the conversation",
  "claude",
);

// Check if archiving is needed
if (result.archiveDecision?.shouldArchive) {
  await orchestrator.executeArchive(result.archiveDecision, result.state);
}

// Check if retrieval is needed
if (result.retrievalDecision?.shouldRetrieve) {
  await orchestrator.executeRetrieval(result.retrievalDecision, result.state);
}
import { ObjectId } from "mongodb";

// Original memory document shape
type BasicMemory = {
  _id: ObjectId;
  memories: string[]; // Array of memory strings
  timestamp: Date; // When memories were saved
  llm: string; // LLM identifier (e.g., 'chatgpt', 'claude')
  userId?: string; // Optional user identifier
};

// Memory document extended with context-caching fields
type ExtendedMemory = {
  _id: ObjectId;
  memories: string[]; // Array of memory strings
  timestamp: Date; // When memories were saved
  llm: string; // LLM identifier
  userId?: string; // Optional user identifier
  conversationId?: string; // Unique conversation identifier
  contextType?: "active" | "archived" | "summary";
  relevanceScore?: number; // 0-1 relevance score
  tags?: string[]; // Categorization tags
  parentContextId?: ObjectId; // Reference to original content for summaries
  messageIndex?: number; // Order within conversation
  wordCount?: number; // Size tracking
  summaryText?: string; // Condensed version
};
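To illustrate how `parentContextId`, `wordCount`, and `summaryText` might fit together, here is a sketch that builds a summary record pointing back at the archived record it condenses. It uses a plain string `Id` as a stand-in for `ObjectId` so it runs without a database driver. `ContextRecord`, `countWords`, and `makeSummary` are hypothetical names, and the whitespace-based word count and empty `memories` array on summaries are assumptions.

```typescript
// Stand-in for MongoDB's ObjectId so the sketch has no driver dependency.
type Id = string;

type ContextRecord = {
  _id: Id;
  memories: string[];
  timestamp: Date;
  llm: string;
  conversationId: string;
  contextType: "active" | "archived" | "summary";
  wordCount: number;
  parentContextId?: Id;
  summaryText?: string;
};

// Assumption: words are whitespace-separated tokens.
function countWords(texts: string[]): number {
  return texts.join(" ").split(/\s+/).filter((w) => w.length > 0).length;
}

// Builds a summary record linked to the archived record it condenses.
function makeSummary(parent: ContextRecord, summaryText: string, id: Id): ContextRecord {
  return {
    _id: id,
    memories: [], // Assumption: the condensed text lives in summaryText only.
    timestamp: new Date(),
    llm: parent.llm,
    conversationId: parent.conversationId,
    contextType: "summary",
    wordCount: countWords([summaryText]),
    parentContextId: parent._id,
    summaryText,
  };
}
```

Following `parentContextId` from a summary leads back to the original archived content.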
The orchestration system automatically:
- Monitors conversation length and context usage
- Archives content when context usage reaches 80%
- Scores relevance of archived content against current context
- Retrieves relevant content when usage drops below 30%
- Creates summaries to condense very long conversations
- Conversation Grouping: All archived content is linked to specific conversation IDs
- Relevance Scoring: Simple keyword overlap scoring (can be enhanced with semantic similarity)
- Tag-based Organization: Categorize content for easy retrieval
- Summary Linking: Preserve links between summaries and original content
- Backward Compatibility: All existing memory functions work unchanged
- Automatic Management: No manual intervention required for basic operations
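The keyword overlap scoring mentioned above is not specified in detail here, so the following sketch uses the Jaccard index (shared keywords divided by total distinct keywords) as one plausible overlap measure that yields a score between 0 and 1. The tokenization rule, the minimum keyword length, and the function names are assumptions, not the project's actual implementation.

```typescript
// Assumption: keywords are lowercase alphanumeric tokens longer than three characters.
function extractKeywords(text: string): Set<string> {
  const tokens: string[] = text.toLowerCase().match(/[a-z0-9]+/g) ?? [];
  return new Set(tokens.filter((t) => t.length > 3));
}

// Jaccard index: |A ∩ B| / |A ∪ B|.
function scoreRelevance(archived: string, current: string): number {
  const a = extractKeywords(archived);
  const b = extractKeywords(current);
  if (a.size === 0 || b.size === 0) return 0;
  let shared = 0;
  a.forEach((word) => {
    if (b.has(word)) shared++;
  });
  return shared / (a.size + b.size - shared);
}
```

A semantic similarity measure could replace `scoreRelevance` without changing callers, as long as it keeps returning a value in the 0-1 range.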
To run in development mode:
npm run build
node build/index.js
To run the CLI demo:
npm run cli
ISC