DeepWism R2 is a next-generation AGI system built on the T3CEDS framework (Thin-Thick-Thin Crowd Entropy Dynamics System), which redefines intelligence as a process of entropy reduction rather than attention modeling.

DeepWism/DeepWism-R2

DeepWism® R2

1. Introduction

We introduce DeepWism® R2 (Research & Report), a next-generation AI system built on an agent framework, the Thin-Thick-Thin Crowd Entropy Dynamics System (T3CEDS), which treats entropy reduction as the fundamental mechanism underlying crowd intelligence. This shifts the core design principle from traditional attention-based mechanisms to entropy management. The DeepWism® R2 architecture consists of three distinct yet interconnected layers:

Thin Perception Layer: Efficiently captures high-dimensional inputs while preserving essential information and reducing input entropy

Thick Processing Layer: Leverages crowd intelligence mechanisms to actively reduce entropy through structured reasoning, collaborative processing, and deep ranking

Thin Decision Layer: Distills complex representations into coherent outputs, further reducing entropy to facilitate clear, actionable decisions
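To make "entropy reduction" concrete, here is a minimal Python sketch. It is not the DeepWism® R2 implementation, whose internals this README does not describe. It assumes each agent in the crowd returns a probability distribution over the same candidate answers, and it uses product-of-experts pooling as a stand-in for the thick layer. The names `shannon_entropy` and `pool_agents` and the example numbers are all hypothetical.

```python
import math

def shannon_entropy(probs):
    """Shannon entropy, in bits, of a discrete probability distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def pool_agents(distributions):
    """Product-of-experts pooling (an assumed stand-in for crowd aggregation):
    multiply the agents' probabilities per candidate, then renormalize."""
    n = len(distributions[0])
    scores = [math.prod(d[i] for d in distributions) for i in range(n)]
    total = sum(scores)
    return [s / total for s in scores]

# Hypothetical input: three agents rate the same three candidate answers.
agents = [
    [0.5, 0.3, 0.2],
    [0.6, 0.2, 0.2],
    [0.4, 0.4, 0.2],
]

pooled = pool_agents(agents)

# "Thin decision" step in this sketch: pick the most probable candidate.
decision = max(range(len(pooled)), key=pooled.__getitem__)

for i, dist in enumerate(agents):
    print(f"agent {i} entropy: {shannon_entropy(dist):.3f} bits")
print(f"pooled entropy:  {shannon_entropy(pooled):.3f} bits")
print(f"chosen candidate index: {decision}")
```

When the agents broadly agree, as in this example, the pooled distribution is sharper (lower entropy) than any individual one. Pooling does not guarantee this when agents disagree strongly, so the sketch shows the idea rather than a general property.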

DeepWism® R2 achieves 27.5% accuracy on Humanity's Last Exam (HLE), surpassing OpenAI DeepResearch's 26.6%, the next-highest HLE score among the models compared. In addition, it leads on xbench-ScienceQA with 70.0% and scores 64.0% on xbench-DeepSearch, just behind o3-high's 65.0%.

These results demonstrate superior capabilities in:

Complex Problem-Solving: Enhanced reasoning through entropy reduction mechanisms

Uncertainty Management: Effective handling of high-entropy scenarios

Multi-Domain Generalization: Robust performance across diverse problem domains including science, deep retrieval, and logic

AI Explainability: Transparent decision-making processes through entropy dynamics

2. Summary

Revolutionary Architecture: Thin-Thick-Thin Crowd Entropy Dynamics System

  • Entropy-Centric Design: Unlike traditional attention-based models, DeepWism® R2 is fundamentally designed around entropy reduction principles. This paradigm shift enables more effective handling of complex, uncertain problem spaces that challenge conventional AI systems.

  • Crowd Intelligence Integration: The thick processing layer implements sophisticated crowd intelligence mechanisms that leverage collective reasoning patterns to systematically reduce entropy through collaborative processing and structured analysis.

  • T3CEDS Framework: The three-layer architecture optimizes information flow from high-entropy inputs to low-entropy, actionable outputs, ensuring maximum coherence and decision clarity at each stage.

Performance Excellence: State-of-the-Art Results

  • Humanity's Last Exam Leadership: With 27.5% accuracy on HLE, DeepWism® R2 sets a new high score on this complex reasoning benchmark, ahead of leading models including OpenAI DeepResearch, Gemini 2.5, and Claude 3.7 Sonnet.

  • Multi-Benchmark Strength: Strong results across diverse benchmarks, including AIME2025 (93.3%), GPQA-Diamond (88.0%), MMLU (86.0%), and SWE-bench Verified (72.0%), indicating robust generalization.

  • Open Research Commitment: The DeepWism® R2 framework will be fully open-sourced, with global API access for authenticated users, to foster collaborative advancement in entropy-based AI research.

3. Evaluation Results

Comprehensive Benchmark Performance

DeepWism® R2 demonstrates exceptional performance across multiple challenging benchmarks, establishing new state-of-the-art results in complex reasoning and problem-solving tasks.

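| Model | xbench-ScienceQA | xbench-DeepSearch | HLE | AIME2025 | GPQA-Diamond | MMLU | SWE-bench Verified |
|---|---|---|---|---|---|---|---|
| DeepWism® R2 | 70.0 | 64.0 | 27.5 | 93.3 | 88.0 | 86.0 | 72.0 |
| o3-high | 60.8 | 65.0 | 20.3 | - | - | - | - |
| o3-pro | 59.6 | - | - | - | - | - | - |
| Doubao-Seed-1.6 | 56.6 | 50.0 | - | - | - | - | - |
| OpenAI DeepResearch | - | - | 26.6 | - | - | - | - |
| Gemini 2.5 Pro | 59.4 | 50.0 | 21.6 | 88.0 | 86.4 | 84.5 | 67.2 |
| o4 mini | 50.4 | 60.0 | 18.1 | 92.7 | 81.4 | 82.0 | 68.1 |
| Claude 4 Opus | - | - | 10.7 | 90.0 | 83.3 | 80.7 | 79.4 |
| DeepSeek-R1-0528 | 54.6 | - | 14.0 | 87.5 | 81.0 | 84.0 | 57.6 |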
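
As a quick cross-check on the table above, the following Python sketch finds the top-scoring model for each benchmark. The scores are transcribed from the table, and `None` marks a cell with no reported value. For rows that list fewer than seven numbers, the column placement of the gaps is an assumption inferred from the surrounding text, so treat the output as illustrative rather than authoritative. The `leaders` helper is a hypothetical name.

```python
# Column order follows the table header.
BENCHMARKS = [
    "xbench-ScienceQA", "xbench-DeepSearch", "HLE", "AIME2025",
    "GPQA-Diamond", "MMLU", "SWE-bench Verified",
]

# Scores (accuracy, %) transcribed from the table; None = not reported.
# Assumption: gap positions in partial rows are inferred, not stated in the source.
SCORES = {
    "DeepWism R2":         [70.0, 64.0, 27.5, 93.3, 88.0, 86.0, 72.0],
    "o3-high":             [60.8, 65.0, 20.3, None, None, None, None],
    "o3-pro":              [59.6, None, None, None, None, None, None],
    "Doubao-Seed-1.6":     [56.6, 50.0, None, None, None, None, None],
    "OpenAI DeepResearch": [None, None, 26.6, None, None, None, None],
    "Gemini 2.5 Pro":      [59.4, 50.0, 21.6, 88.0, 86.4, 84.5, 67.2],
    "o4 mini":             [50.4, 60.0, 18.1, 92.7, 81.4, 82.0, 68.1],
    "Claude 4 Opus":       [None, None, 10.7, 90.0, 83.3, 80.7, 79.4],
    "DeepSeek-R1-0528":    [54.6, None, 14.0, 87.5, 81.0, 84.0, 57.6],
}

def leaders(scores):
    """Return {benchmark: (best_model, best_score)} over reported values only."""
    result = {}
    for col, bench in enumerate(BENCHMARKS):
        reported = {m: row[col] for m, row in scores.items() if row[col] is not None}
        best = max(reported, key=reported.get)
        result[bench] = (best, reported[best])
    return result

for bench, (model, score) in leaders(SCORES).items():
    print(f"{bench:20s} {model:20s} {score:.1f}")
```

Skipping unreported cells, rather than treating them as zero, keeps a model from being ranked on a benchmark it was never evaluated on.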

🔑 Key Performance Highlights

  • 🏆 Humanity's Last Exam (HLE): 27.5% accuracy, surpassing every model in the comparison, including OpenAI DeepResearch (26.6%)
  • 🧪 xbench-ScienceQA: 70.0% accuracy, leading all models on complex scientific question answering
  • 🔍 xbench-DeepSearch: 64.0% accuracy, demonstrating top-tier deep retrieval and reasoning capabilities
  • 📊 AIME2025: 93.3% accuracy, showcasing exceptional mathematical reasoning capabilities
  • 🔬 GPQA-Diamond: 88.0% accuracy, indicating superior performance in graduate-level science questions
  • 📚 MMLU: 86.0% accuracy, reflecting strong multi-domain knowledge understanding
  • 💻 SWE-bench Verified: 72.0% accuracy, indicating effective software engineering problem-solving skills

4. Chat Website

Experience DeepWism® R2's revolutionary capabilities through our interactive platforms:

🌐 Chat Interface: i.deepwism.com

  • Real-time interaction with DeepWism® R2
  • Entropy visualization in reasoning processes
  • Multi-domain problem-solving capabilities

5. Contact

For questions, collaborations, or support, please contact us at: r2@deepwism.com

Website: www.deepwism.com

GitHub Issues: Report bugs or request features

Twitter: @DeepWism

Advancing Next Generation AI through Entropy Reduction and Crowd Intelligence

DeepWism® AI © 2025
