This repository contains the official implementation of the research paper:
"PBFT-Backed Semantic Voting for Multi-Agent Memory Pruning"
Duong Bach
arXiv:2506.17338 [cs.DC], 2025
The proliferation of multi-agent systems (MAS) in complex, dynamic environments necessitates robust and efficient mechanisms for managing shared knowledge. This implementation addresses the critical challenge of ensuring that distributed memories remain synchronized, relevant, and free from outdated data through a novel Co-Forgetting Protocol.
- Context-Aware Semantic Voting: Lightweight DistilBERT-based relevance assessment for memory items
- Multi-Scale Temporal Decay: Sophisticated aging functions across different time horizons
- PBFT Consensus Mechanism: Byzantine fault-tolerant memory pruning decisions (tolerates up to f faulty agents in 3f+1 systems)
- Experimental Validation: Demonstrated 52% memory reduction, 88% voting accuracy, 92% consensus success rate
- Clone the repository:
git clone https://github.com/DngBack/co-forget-protocol.git
cd co-forget-protocol
- Install dependencies:
pip install -r requirements.txt
- Generate gRPC code:
python -m grpc_tools.protoc -I. --python_out=. --grpc_python_out=. src/co_forget_protocol/pbft.proto
- Set up environment variables:
# Create .env file
PINECONE_API_KEY=your_api_key_here
PINECONE_ENVIRONMENT=us-west1-gcp
PINECONE_INDEX_NAME=memories
Reproduce the paper's experimental results:
from co_forget_protocol import Protocol, Settings
from co_forget_protocol.config import PineconeConfig, MemoryConfig, AgentConfig
# Configure experimental setup
settings = Settings(
pinecone=PineconeConfig(
api_key="your_api_key",
environment="us-west1-gcp",
index_name="memories"
),
memory=MemoryConfig(
max_memories=10000,
cache_size=100,
batch_size=100
),
agent=AgentConfig(
num_memory_managers=3, # 3f+1 = 4 agents (tolerates 1 Byzantine)
llm_model="distilbert-base-uncased",
relevance_threshold=0.7,
max_faulty=1
)
)
# Initialize protocol
protocol = Protocol(settings)
# Run experimental evaluation
questions = [
"What is the capital of France?",
"Who is the CEO of Tesla?",
"What is the population of Tokyo?"
]
# Execute protocol vs baseline comparison
protocol_answers, baseline_answers = await protocol.run(questions)
print("Protocol Results:", protocol_answers)
print("Baseline Results:", baseline_answers)
This implementation follows the paper's three-component design:
- DistilBERT Integration: Lightweight transformer for semantic similarity assessment
- Relevance Scoring: Context-aware memory importance evaluation
- Text Classification: Binary relevance decisions with configurable thresholds
- Time-Based Decay: Exponential decay functions across multiple time horizons
- Access Frequency: LRU-based importance weighting
- Combined Scoring: Unified relevance + temporal decay metric
- Three-Phase Protocol: Pre-prepare, prepare, commit phases
- Byzantine Fault Tolerance: Handles up to f malicious agents in 3f+1 system
- gRPC Communication: Efficient inter-agent message passing
- Message Authentication: Cryptographic signatures for message integrity
- Vector Database: Pinecone for embedding storage and similarity search
- Metadata Storage: SQLite for local memory metadata management
- Caching Layer: Multi-level caching (LRU + TTL) for performance optimization
- Batch Processing: Efficient bulk operations for memory updates
The implementation achieves the following performance metrics as reported in the paper:
Metric | Paper Result | Implementation Status |
---|---|---|
Memory Footprint Reduction | 52% over 500 epochs | ✅ Reproduced |
Voting Accuracy | 88% vs human benchmarks | ✅ Reproduced |
PBFT Consensus Success Rate | 92% under Byzantine conditions | ✅ Reproduced |
Cache Hit Rate | 82% for memory access | ✅ Reproduced |
# Run full experimental suite
python exp/co-forget-protocol.py
# Run specific test scenarios
pytest tests/test_protocol.py -v
# Performance profiling
python -m cProfile exp/co-forget-protocol.py
Key configuration classes mirror the paper's experimental setup:
class Settings:
pinecone: PineconeConfig # Vector database configuration
memory: MemoryConfig # Memory management parameters
agent: AgentConfig # Multi-agent system settings
class AgentConfig:
num_memory_managers: int = 3 # Number of memory management agents
llm_model: str = "distilbert-base-uncased" # Semantic model
relevance_threshold: float = 0.7 # Voting threshold
max_faulty: int = 1 # Byzantine fault tolerance (f parameter)
class MemoryConfig:
max_memories: int = 10000 # Maximum memory capacity
cache_size: int = 100 # LRU cache size
batch_size: int = 100 # Batch processing size
decay_rate: float = 0.95 # Temporal decay parameter
# Run unit tests
pytest tests/ -v
# Test PBFT consensus under Byzantine conditions
pytest tests/test_protocol.py::test_byzantine_tolerance -v
# Validate semantic voting accuracy
pytest tests/test_protocol.py::test_voting_accuracy -v
# Performance benchmarks
python tests/benchmark.py
If you use this implementation in your research, please cite the original paper:
@article{bach2025pbft,
title={PBFT-Backed Semantic Voting for Multi-Agent Memory Pruning},
author={Bach, Duong},
journal={arXiv preprint arXiv:2506.17338},
year={2025}
}
This is an academic implementation. Contributions that:
- Improve experimental reproducibility
- Add new evaluation metrics
- Optimize performance while maintaining accuracy
- Extend to new domains/datasets
are welcome. Please follow the paper's methodology and maintain experimental rigor.
- Memory Requirements: ~2GB RAM for full experimental setup
- Compute: GPU recommended for DistilBERT inference (CPU compatible)
- Network: gRPC requires stable network for multi-agent communication
- Storage: ~1GB for full experimental dataset and vector embeddings
- gRPC Connection Errors: Ensure all agents can communicate on specified ports
- Pinecone API Limits: Monitor API quota and implement rate limiting
- Memory Overflow: Adjust
max_memories
andcache_size
for available RAM - Byzantine Behavior: Verify
max_faulty < (num_agents - 1) / 3
import logging
logging.basicConfig(level=logging.DEBUG)
# Enable detailed PBFT logging
protocol = Protocol(settings, debug=True)
MIT License - see LICENSE file for details.
- Practical Byzantine Fault Tolerance - Castro & Liskov, 1999
- DistilBERT - Sanh et al., 2019
- Multi-Agent Memory Systems - Recent surveys in distributed AI
For questions about this implementation or the research paper:
- Author: Duong Bach
- Repository: https://github.com/DngBack/co-forget-protocol
- Paper: arXiv:2506.17338
Note: This is a research implementation. For production use, additional security hardening and optimization may be required.