SII - Generative Artificial Intelligence Research Lab (GAIR)

47 repositories

ResearcherBench
Public
ResearcherBench: Evaluating Deep AI Research Systems on the Frontiers of Scientific Inquiry
Python
•2•20•1•1•Updated Aug 1, 2025
LIMO
Public
[COLM 2025] LIMO: Less is More for Reasoning
Python
•51•993•0•0•Updated Jul 30, 2025
ASI-Arch
Public
AlphaGo Moment for Model Architecture Discovery.
Python
•
Apache License 2.0
•149•858•3•0•Updated Jul 29, 2025
ASI4AI
Public
JavaScript
•2•5•0•0•Updated Jul 23, 2025
lm-open-science-evaluation
Public
Reproducible and flexible LLM evaluations for scientific reasoning.
science evaluation reasoning llm scientific-reasoning
Python
•
Apache License 2.0
•0•17•0•0•Updated Jul 23, 2025
MegaScience
Public
MegaScience: Pushing the Frontiers of Post-Training Datasets for Science Reasoning
science llama reasoning post-training llm qwen scientific-reasoning
Python
•
Apache License 2.0
•4•75•0•0•Updated Jul 23, 2025
OctoThinker
Public
Revisiting Mid-training in the Era of Reinforcement Learning Scaling
rl llama reasoning post-training pre-training llm qwen verl mid-training
Jupyter Notebook
•
Apache License 2.0
•12•160•3•0•Updated Jul 23, 2025
ProX
Public
[ICML 2025] Programming Every Example: Lifting Pre-training Data Quality Like Experts at Scale
llama data-quality mistral pre-training continual neural-symbolic data-centric-ai llm continual-pre-training
Python
•
Apache License 2.0
•19•255•1•0•Updated Jul 8, 2025
anole
Public
Anole: An Open, Autoregressive and Native Multimodal Models for Interleaved Image-Text Generation
Python
•48•780•33•1•Updated Jun 16, 2025
PC-Agent-E
Public
Efficient Agent Training for Computer Use
Python
•
MIT License
•3•121•0•0•Updated Jun 6, 2025
thinking-with-generated-images
Public
Doodling our way to AGI ✏️ 🖼️ 🧠
Python
•2•79•1•0•Updated May 29, 2025
LIMOPro
Public
Python
•
Apache License 2.0
•0•11•1•0•Updated May 27, 2025
DynToM
Public
Python
•0•7•0•0•Updated May 26, 2025
WindowsAgentArena-V2
Public
0•0•1•0•Updated May 25, 2025
ToRL
Public
Python
•10•258•17•0•Updated May 24, 2025
PC-Agent
Public
PC Agent: While You Sleep, AI Works - A Cognitive Journey into Digital World
Python
•
MIT License
•22•274•1•0•Updated May 21, 2025
cognition-engineering
Public
Generative AI Act II: Test Time Scaling Drives Cognition Engineering
Python
•9•198•1•0•Updated Apr 22, 2025
DeepResearcher
Public
Scaling Deep Research via Reinforcement Learning in Real-world Environments.
Python
•
Apache License 2.0
•41•538•9•0•Updated Apr 13, 2025
MAYE
Public
Rethinking RL Scaling for Vision Language Models: A Transparent, From-Scratch Framework and Comprehensive Evaluation Scheme
Python
•8•138•3•0•Updated Apr 9, 2025
simple_tts
Public
Python
•2•0•0•0•Updated Apr 5, 2025
MathPile
Public
[NeurIPS D&B 2024] Generative AI for Math: MathPile
math corpus language-model pre-training large-language-models
Python
•
Apache License 2.0
•21•415•0•0•Updated Apr 4, 2025
cs2916
Public
Python
•15•26•0•0•Updated Mar 27, 2025
AIME-Preview
Public
Python
•6•71•2•0•Updated Mar 11, 2025
OlympicArena
Public
[NeurIPS 2024] OlympicArena: Benchmarking Multi-discipline Cognitive Reasoning for Superintelligent AI
JavaScript
•5•102•0•0•Updated Mar 6, 2025
LIMR
Public
Python
•8•205•15•0•Updated Feb 20, 2025
O1-Journey
Public
O1 Replication Journey
66•2k•14•0•Updated Jan 14, 2025
ReasonEval
Public
[AAAI 2025 oral] Evaluating Mathematical Reasoning Beyond Accuracy
Python
•3•63•1•0•Updated Dec 15, 2024
walnut-plan
Public
The Walnut Plan
0•11•0•0•Updated Oct 10, 2024
OpenResearcher
Public
OpenResearcher, an advanced Scientific Research Assistant
HTML
•
Apache License 2.0
•39•452•1•2•Updated Oct 10, 2024
math-evaluation-harness
Public
A simple toolkit for benchmarking LLMs on mathematical reasoning tasks. 🧮✨
Python
•
MIT License
•18•2•0•0•Updated Oct 6, 2024