
ContextCore: Enhanced Context Management for Local LLMs

ContextCore is a Python library designed to overcome context-window limitations in smaller local LLMs such as smollm2:1.7b served via Ollama. It implements a unified memory system that combines high-level thinking memory with detailed raw memory to extend the model's effective context.

Installation

# Clone the repository
git clone https://github.com/Priyanshu-i/ContextCore.git
cd ContextCore

# Install the package
pip install -e .

# Optional but recommended dependencies
pip install sentence-transformers  # For better embeddings
pip install hnswlib  # For vector storage
pip install redis  # For faster key-value storage
pip install requests  # For Ollama API communication

Quick Start

from contextcore import ContextCore

# Initialize ContextCore with your local LLM
context = ContextCore(
    model_name="smollm2:1.7b",  # Your Ollama model
    ollama_url="http://localhost:11434"  # Ollama API URL
)

# Initialize a new session with an objective
context.initialize_session("Building a robust memory system for local LLMs")

# Process user inputs and get responses
response = context.process_user_input("How can I implement a vector store for text embeddings?")
print(response)

# Save the session for later use
context.save()

# Load a saved session
loaded_context = ContextCore.load("./contextcore_storage")

Key Features

  1. Two-Tier Memory System:

    • Thinking Memory (TME): High-level reasoning, concepts, and session strategies
    • Raw Memory (RME): Detailed facts, user inputs, and specific technical information
  2. Semantic Search: Find relevant memories based on semantic similarity

  3. Session Management: Maintain coherent, ongoing conversations with automatic summarization

  4. Local LLM Integration: Seamless integration with Ollama-based local models

  5. Persistence: Save and load sessions to continue conversations later

Advanced Usage

Customizing Memory Storage

# Use Redis for faster key-value storage
context = ContextCore(
    model_name="smollm2:1.7b",
    use_redis=True  # Enable Redis storage
)

# Customize vector dimensions (if using a different embedding model)
context = ContextCore(
    model_name="smollm2:1.7b",
    vector_dim=768  # Must match your embedding model's output size (e.g. 768 for all-mpnet-base-v2)
)

Working with Different LLMs

# Use a different Ollama model
context = ContextCore(
    model_name="llama3:8b",  # Any model you have in Ollama
)

# Connect to a remote Ollama instance
context = ContextCore(
    model_name="mistral:7b",
    ollama_url="http://your-ollama-server:11434"
)

Memory Management

# Manually add thinking memory
context.memory_store.add_thinking_memory(
    content="The key insight is to use hierarchical summarization",
    importance=0.9,
    metadata={"topic": "architecture", "source": "design_doc"}
)

# Manually add raw memory
context.memory_store.add_raw_memory(
    content="User prefers Python over JavaScript for this project",
    category="user",  # user, session, or agent
    relevance_score=0.7,
    metadata={"source": "conversation"}
)

# Search memories
memories = context.memory_store.search_memories(
    query="vector databases",
    k=5,  # Return top 5 results
    filter_type="raw",  # Only raw memories
    min_score=0.6  # Minimum similarity threshold
)

Implementation Details

Memory Types

  1. ThinkingMemory: Used for high-level concepts and reasoning

    • Contains: content, timestamp, importance score, metadata
  2. RawMemory: Used for detailed facts and specific information

    • Contains: content, timestamp, category, relevance score, metadata (both record types are sketched below)
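
For orientation, here is a minimal sketch of the two record types as Python dataclasses. The field names mirror the lists above, but the defaults and exact class definitions are assumptions, not the library's actual code.

from dataclasses import dataclass, field
from datetime import datetime
from typing import Any, Dict

@dataclass
class ThinkingMemory:
    """High-level concept or reasoning step (illustrative sketch)."""
    content: str
    importance: float = 0.5          # 0.0-1.0; higher = more likely to be retained
    timestamp: datetime = field(default_factory=datetime.now)
    metadata: Dict[str, Any] = field(default_factory=dict)

@dataclass
class RawMemory:
    """Detailed fact or verbatim piece of information (illustrative sketch)."""
    content: str
    category: str = "user"           # user, session, or agent
    relevance_score: float = 0.5
    timestamp: datetime = field(default_factory=datetime.now)
    metadata: Dict[str, Any] = field(default_factory=dict)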

Components

  1. VectorStore: Stores and retrieves memories using semantic search

    • Uses HNSWlib for efficient similarity search
  2. SimpleEmbedder: Converts text to vector embeddings

    • Uses sentence-transformers if available, with a simple fallback (see the sketch after this list)
  3. MemoryStore: Combines vector storage with metadata-based retrieval

    • Optional Redis integration for faster lookups
  4. OllamaClient: Interfaces with Ollama API for text generation

  5. ContextCore: Main class that coordinates all components
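
As a rough illustration of the SimpleEmbedder behaviour, the sketch below prefers sentence-transformers when it is installed and otherwise falls back to a crude hash-based embedding. The model name and the fallback scheme are assumptions; the library's actual implementation may differ.

import hashlib
import numpy as np

class SimpleEmbedder:
    """Sketch: sentence-transformers when available, hash-based fallback otherwise."""

    def __init__(self, dim: int = 384):
        self.dim = dim
        try:
            from sentence_transformers import SentenceTransformer
            # Assumed model choice; all-MiniLM-L6-v2 outputs 384-dim vectors.
            self._model = SentenceTransformer("all-MiniLM-L6-v2")
        except ImportError:
            self._model = None  # no sentence-transformers; use the hash fallback

    def embed(self, text: str) -> np.ndarray:
        if self._model is not None:
            return self._model.encode(text)
        # Fallback: bucket token hashes into a fixed-size bag-of-words vector.
        vec = np.zeros(self.dim, dtype=np.float32)
        for token in text.lower().split():
            h = int(hashlib.md5(token.encode()).hexdigest(), 16)
            vec[h % self.dim] += 1.0
        norm = np.linalg.norm(vec)
        return vec / norm if norm > 0 else vec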

Best Practices

  1. Initialization:

    • Always provide a clear session objective
    • Use the most powerful local LLM you have available
  2. Memory Management:

    • Let the system handle memory management automatically
    • For critical information, manually add high-importance memories
  3. Performance Optimization:

    • Install sentence-transformers for better embeddings
    • Use Redis for faster key-value lookups in production
  4. Troubleshooting:

    • Check that Ollama is running and the model is pulled (a quick check follows this list)
    • Ensure you have sufficient RAM for vector operations
    • Look at the logs for detailed information about operations
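
For the first troubleshooting step, you can confirm that Ollama is reachable and a model is installed by querying its /api/tags endpoint (part of the standard Ollama HTTP API); the URL and model name below are just examples:

import requests

def check_ollama(url: str = "http://localhost:11434", model: str = "smollm2:1.7b") -> bool:
    """Return True if Ollama responds and the given model is installed."""
    try:
        resp = requests.get(f"{url}/api/tags", timeout=5)
        resp.raise_for_status()
    except requests.RequestException as e:
        print(f"Ollama unreachable at {url}: {e}")
        return False
    names = [m.get("name", "") for m in resp.json().get("models", [])]
    if model not in names:
        print(f"Model {model!r} not found; run: ollama pull {model}")
        return False
    return True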

Memory System Architecture

ContextCore implements a unified memory system that combines:

  1. Hierarchical Summarization: Continuously distills conversation into structured summaries
  2. Incremental Updates: Updates high-level summaries with new insights
  3. Semantic Retrieval: Fetches the most relevant detailed memories
  4. Dynamic Injection: Combines high-level thinking with detailed context

This approach enables small local LLMs to maintain coherent conversations even when the raw input exceeds their context window.
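
To make this flow concrete, a hypothetical prompt-assembly step might look like the sketch below. The session_summary attribute and the result format are illustrative assumptions; only search_memories appears in the Memory Management examples above.

def build_prompt(context, user_input: str, k: int = 5) -> str:
    """Sketch of dynamic injection: high-level summary + relevant details + new input."""
    # 1. Hierarchical summarization: the running high-level summary (hypothetical attribute).
    summary = context.session_summary

    # 2. Semantic retrieval: fetch the k most relevant raw memories.
    memories = context.memory_store.search_memories(
        query=user_input, k=k, filter_type="raw", min_score=0.6
    )
    # Assumes each result is a dict with a "content" field.
    details = "\n".join(f"- {m['content']}" for m in memories)

    # 3. Dynamic injection: a compact prompt that fits a small context window.
    return (
        f"Session summary:\n{summary}\n\n"
        f"Relevant details:\n{details}\n\n"
        f"User: {user_input}\nAssistant:"
    )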
