
Cognitive Workspace - Proof of Concept Implementation

中文版 | English | 📚 Wiki


Quick Start

1. Install Dependencies

# Basic dependencies
pip install numpy

# Optional: OpenAI support
pip install openai python-dotenv

# Optional: Better vector embeddings
pip install sentence-transformers

# Optional: Enhanced experiments (statistical analysis and visualization)
pip install scipy matplotlib

2. Environment Configuration

Create a .env file:

# OpenAI Official API
OPENAI_API_KEY=sk-your-key-here
OPENAI_API_BASE=https://api.openai.com/v1
OPENAI_MODEL=gpt-3.5-turbo

# Or use Azure OpenAI
# OPENAI_API_KEY=your-azure-key
# OPENAI_API_BASE=https://your-resource.openai.azure.com
# OPENAI_MODEL=your-deployment-name

# Or use local models (e.g., Ollama)
# OPENAI_API_BASE=http://localhost:11434/v1
# OPENAI_MODEL=llama2
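
These variables are read at startup. A minimal sketch of the loading logic, assuming the optional python-dotenv and openai (v1+) packages; the actual setup in cognitive_workspace_poc.py may differ in detail:

import os
from dotenv import load_dotenv  # optional: python-dotenv
from openai import OpenAI       # optional: openai >= 1.0

load_dotenv()  # pull OPENAI_* variables from .env into the environment

client = OpenAI(
    api_key=os.getenv("OPENAI_API_KEY", "none"),  # local servers usually ignore the key
    base_url=os.getenv("OPENAI_API_BASE", "https://api.openai.com/v1"),
)
model = os.getenv("OPENAI_MODEL", "gpt-3.5-turbo")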

3. Run Experiments

# Basic experiment (4-round dialogue)
python cognitive_workspace_poc.py

# Enhanced experiments (10-round dialogue + multi-hop reasoning + conflict resolution)
python cognitive_workspace_enhanced.py

Operation Modes

Mode 1: Full Mode (Recommended)

Requires an OpenAI API key and demonstrates real LLM behavioral differences:

  • Higher quality task decomposition
  • More accurate information prediction
  • More coherent answer generation

Mode 2: Simulation Mode (Default)

No API key required; uses rule-based simulation:

  • Still demonstrates architectural differences
  • Suitable for proof-of-concept
  • Fully reproducible

Mode 3: Local Mode

Uses locally served models (e.g., via Ollama):

  • Data privacy
  • No API costs
  • Performance depends on local hardware
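
The mode is chosen implicitly by what the environment provides. A hypothetical selection helper illustrating that fallback logic (the actual scripts may structure this differently):

import os

def select_mode() -> str:
    """Illustrative mode selection based on the environment."""
    base = os.getenv("OPENAI_API_BASE", "")
    if "localhost" in base or "127.0.0.1" in base:
        return "local"       # Mode 3: e.g. an Ollama server
    if os.getenv("OPENAI_API_KEY"):
        return "full"        # Mode 1: real LLM calls
    return "simulation"      # Mode 2: rule-based, fully reproducible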

Experiments

Experiment 1: Single-turn Task Processing

Compares the Cognitive Workspace against traditional RAG on a single complex question:

  • Operation count difference (12 vs 3)
  • Operation type difference (active vs passive)
  • Memory management difference (hierarchical vs flat)
  • Single-turn memory reuse rate: 50% vs 0%

Experiment 2: Multi-turn Dialogue (Core Advantage)

Demonstrates cumulative advantages from state persistence:

Round  CW Reuse Rate  RAG Reuse Rate
1      50.0%         0%
2      55.0%         0%
3      56.7%         0%
4      56.4%         0%

Average reuse rate: 54.5% vs 0%

Experiment 3: 10-round Extended Dialogue (Enhanced)

Memory advantages in long-term conversations:

Average reuse rate: 57.1% vs 0%
Net efficiency gain: 17.3%
Cohen's d: 23.2 (huge effect)
P-value: < 0.001 (highly significant)

Experiment 4: Multi-hop Reasoning (Enhanced)

Advantages in complex reasoning chains:

Average reuse rate: 58.8% vs 0%
Net efficiency gain: 17.9%
Cohen's d: 190.0 (extremely large effect)
Operations saved: 194

Experiment 5: Information Conflict Resolution (Enhanced)

Performance when handling contradictory information:

Average reuse rate: 59.8% vs 0%
Net efficiency gain: 17.8%
Cohen's d: 195.7 (extremely large effect)
Operations saved: 226

Output Files

  • cognitive_workspace_results.json: Basic experiment results
  • enhanced_results.json: Enhanced experiment detailed results
  • cognitive_workspace_analysis.png: Experiment visualization charts
  • .env.example: Environment variable template (written if no .env exists)

Key Metrics Explanation

Memory Reuse Rate (Measured Data)

  • Basic experiment (4 rounds): average 54.5%; reuse begins in round 1
  • 10-round dialogue: average 57.1%; the advantage is clearest in long dialogues
  • Multi-hop reasoning: average 58.8%; complex tasks show higher reuse
  • Conflict resolution: average 59.8%; strongest in information-integration scenarios
  • Traditional RAG: always 0% (stateless)
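
Conceptually, the reuse rate is the share of a round's operations served from the persistent workspace rather than recomputed. An illustrative computation (function and variable names here are ours, not the repository's):

def memory_reuse_rate(reused_ops: int, total_ops: int) -> float:
    """Fraction of operations answered from workspace memory."""
    return reused_ops / total_ops if total_ops else 0.0

# Round 1 of the basic experiment: 6 of 12 operations reused -> 50.0%
print(f"{memory_reuse_rate(6, 12):.1%}")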

Net Efficiency Gain (after accounting for extra overhead)

Net efficiency = Reuse rate / (1 + Extra operation ratio)

  • 10-round dialogue: 17.3% net improvement
  • Multi-hop reasoning: 17.9% net improvement
  • Conflict resolution: 17.8% net improvement
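
Working backwards from the reported figures: a 57.1% reuse rate yielding a 17.3% net gain implies an extra-operation ratio of roughly 2.3, since 0.571 / (1 + 2.3) ≈ 0.173. A small check, treating that ratio as a derived estimate rather than a measured value:

def net_efficiency(reuse_rate: float, extra_op_ratio: float) -> float:
    """Net efficiency = reuse rate / (1 + extra operation ratio)."""
    return reuse_rate / (1.0 + extra_op_ratio)

print(f"{net_efficiency(0.571, 2.3):.1%}")  # ≈ 17.3% (10-round dialogue)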

Statistical Significance

  • P-values: all experiments < 0.001 (highly significant)
  • Cohen's d effect size:
    • 10-round dialogue: 23.2 (huge)
    • Multi-hop reasoning: 190.0 (extremely large)
    • Conflict resolution: 195.7 (extremely large)
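
Cohen's d is the difference in group means divided by the pooled standard deviation; values this large are possible because the RAG baseline's reuse rate is a constant 0% with near-zero variance. A standard computation over per-round reuse-rate samples:

import numpy as np

def cohens_d(a, b) -> float:
    """Effect size: mean difference over the pooled standard deviation."""
    a, b = np.asarray(a, float), np.asarray(b, float)
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * a.var(ddof=1) + (nb - 1) * b.var(ddof=1)) / (na + nb - 2)
    return (a.mean() - b.mean()) / pooled_var ** 0.5

# Per-round CW reuse rates from the basic experiment vs. RAG's constant 0%
print(cohens_d([0.500, 0.550, 0.567, 0.564], [0.0, 0.0, 0.0, 0.0]))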

Operation Growth Patterns

  • Cognitive Workspace: Sub-linear growth (reduces redundant computation through memory reuse)
  • Traditional RAG: Linear growth (starts fresh for each query)

Confidence Tracking

  • Cognitive Workspace: Dynamically tracks task completion and information sufficiency
  • Traditional RAG: No confidence concept

Paper Support

This code supports the following paper arguments:

  1. Active memory management outperforms passive retrieval

    • Code evidence: task decomposition, information prediction, active preparation
  2. State persistence improves efficiency

    • Code evidence: memory reuse across multi-turn dialogues
  3. Hierarchical buffers optimize resource utilization

    • Code evidence: the immediate→working→episodic promotion mechanism (see the sketch after this list)
  4. Metacognitive control enhances intelligence

    • Code evidence: confidence tracking and information-gap identification
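
A compressed sketch of points 3 and 4, assuming a simple access-count promotion rule; the Workspace class, thresholds, and field names here are illustrative, not the repository's exact implementation:

from dataclasses import dataclass, field

@dataclass
class MemoryItem:
    content: str
    access_count: int = 0

@dataclass
class Workspace:
    immediate: list = field(default_factory=list)  # current turn
    working: list = field(default_factory=list)    # recent, repeatedly used
    episodic: list = field(default_factory=list)   # long-term dialogue memory
    confidence: float = 0.0                        # metacognitive completion estimate

    def promote(self, to_working: int = 2, to_episodic: int = 5) -> None:
        """Promote items up the buffer hierarchy once they prove useful."""
        for item in list(self.immediate):
            if item.access_count >= to_working:
                self.immediate.remove(item)
                self.working.append(item)
        for item in list(self.working):
            if item.access_count >= to_episodic:
                self.working.remove(item)
                self.episodic.append(item)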

FAQ

Q: Why can simulation mode also prove the points?

A: Because the experiments demonstrate architectural behavioral differences, not generation quality. Even under rule-based simulation, the contrasts between active vs. passive retrieval and stateful vs. stateless memory remain clear.

Q: How to cite this code in papers?

A:

Code available at: \url{https://github.com/tao-hpu/cognitive-workspace}

Q: How many tokens/API calls are needed?

A: Full experiments require approximately:

  • Single-turn experiment: ~10 API calls
  • Multi-turn experiment: ~20 API calls
  • Total cost: < $0.05 (using GPT-3.5-turbo)

Q: Can other LLMs be used?

A: Yes! The code supports:

  • OpenAI-compatible APIs (by modifying OPENAI_API_BASE)
  • Local models (Ollama, llama.cpp)
  • Any service providing chat/completion interfaces
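
For example, pointing an OpenAI-compatible client (openai >= 1.0) at a local Ollama server; the model name follows the configuration example above:

from openai import OpenAI

# Ollama exposes an OpenAI-compatible endpoint; a key is required but ignored.
client = OpenAI(base_url="http://localhost:11434/v1", api_key="ollama")
reply = client.chat.completions.create(
    model="llama2",
    messages=[{"role": "user", "content": "What is active memory management?"}],
)
print(reply.choices[0].message.content)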

Extension Suggestions

  1. Add longer-term tests (20+ rounds)

    # Modify the question list in cognitive_workspace_enhanced.py
    extended_questions = [...20 questions...]
  2. Integrate real vector databases

    # Use ChromaDB or Pinecone
    from chromadb import Client
    client = Client()
    collection = client.create_collection("workspace_memory")
  3. Add more statistical tests

    # Mann-Whitney U test, Friedman test, etc.
    from scipy import stats
    stats.mannwhitneyu(cw_results, rag_results)
  4. Performance benchmarking

    # Test performance at different scales
    for doc_count in [10, 100, 1000]:
        test_scalability(doc_count)

Citation

If you use this code, please cite:

@article{an2025cognitive,
  title={Cognitive Workspace: Towards Functional Infinite Context Through Active Memory Management},
  author={Tao An},
  year={2025},
  eprint={2508.13171},
  archivePrefix={arXiv},
  primaryClass={cs.AI}
}

License

MIT License - Free to use, modify and distribute
