- 📖 Awesome-Large-Search-Models collects the latest papers and blogs on search-oriented large (reasoning) language models, also called large search models. Covered work includes reinforcement learning-based methods. The repo also lists other resources such as datasets and popular frameworks.
- 🌟 Please consider starring us if our repo is helpful.
- 📮 Feel free to open an issue or submit a pull request if you think we missed some work.
- Jun 9, 2025: We created this repo to collect papers and resources on search-oriented large language models!
Time | Title | Venue | Paper | Code |
---|---|---|---|---|
2025.06 | MMSearch-R1: Incentivizing LMMs to Search | arXiv | Link | Link |
2025.06 | CIIR@LiveRAG 2025: Optimizing Multi-Agent Retrieval Augmented Generation through Self-Training | arXiv | Link | - |
2025.06 | Knowledgeable-r1: Policy Optimization for Knowledge Exploration in Retrieval-Augmented Generation | arXiv | Link | - |
2025.06 | Coordinating Search-Informed Reasoning and Reasoning-Guided Search in Claim Verification | arXiv | Link | - |
2025.06 | R-Search: Empowering LLM Reasoning with Search via Multi-Reward Reinforcement Learning | arXiv | Link | Link |
2025.05 | WebDancer: Towards Autonomous Information Seeking Agency | arXiv | Link | Link |
2025.05 | Pangu DeepDiver: Adaptive Search Intensity Scaling via Open-Web Reinforcement Learning | arXiv | Link | - |
2025.05 | R3-RAG: Learning Step-by-Step Reasoning and Retrieval for LLMs via Reinforcement Learning | arXiv | Link | Link |
2025.06 | Learning to Route Queries Across Knowledge Bases for Step-wise Retrieval-Augmented Reasoning | arXiv | Link | Link |
2025.05 | VRAG-RL: Empower Vision-Perception-Based RAG for Visually Rich Information Understanding via Iterative Reasoning with Reinforcement Learning | arXiv | Link | Link |
2025.05 | EvolveSearch: An Iterative Self-Evolving Search Agent | arXiv | Link | - |
2025.05 | MaskSearch: A Universal Pre-Training Framework to Enhance Agentic Search Capability | arXiv | Link | Link |
2025.05 | LeTS: Learning to Think-and-Search via Process-and-Outcome Reward Hybridization | arXiv | Link | - |
2025.05 | Search Wisely: Mitigating Sub-optimal Agentic Searches By Reducing Uncertainty | arXiv | Link | - |
2025.05 | R1-Searcher++: Incentivizing the Dynamic Knowledge Acquisition of LLMs via Reinforcement Learning | arXiv | Link | Link |
2025.05 | SimpleDeepSearcher: Deep Information Seeking via Web-Powered Reasoning Trajectory Synthesis | arXiv | Link | Link |
2025.05 | | arXiv | Link | Link |
2025.05 | An Empirical Study on Reinforcement Learning for Reasoning-Search Interleaved LLM Agents | arXiv | Link | Link |
2025.05 | StepSearch: Igniting LLMs Search Ability via Step-Wise Proximal Policy Optimization | arXiv | Link | Link |
2025.05 | Process vs. Outcome Reward: Which is Better for Agentic RAG Reinforcement Learning | arXiv | Link | Link |
2025.05 | Search and Refine During Think: Autonomous Retrieval-Augmented Reasoning of LLMs | arXiv | Link | Link |
2025.05 | Scent of Knowledge: Optimizing Search-Enhanced Reasoning with Information Foraging | arXiv | Link | - |
2025.05 | ZeroSearch: Incentivize the Search Capability of LLMs without Searching | arXiv | Link | Link |
2025.04 | WebThinker: Empowering Large Reasoning Models with Deep Research Capability | arXiv | Link | Link |
2025.04 | ReZero: Enhancing LLM search ability by trying one-more-time | arXiv | Link | - |
2025.04 | DeepResearcher: Scaling Deep Research via Reinforcement Learning in Real-world Environments | arXiv | Link | Link |
2025.03 | Search-R1: Training LLMs to Reason and Leverage Search Engines with Reinforcement Learning | arXiv | Link | Link |
2025.02 | DeepRetrieval: Hacking Real Search Engines and Retrievers with Large Language Models via Reinforcement Learning | arXiv | Link | Link |
Time | Title | Venue | Paper | Code |
---|---|---|---|---|
2025.05 | ManuSearch: Democratizing Deep Search in Large Language Models with a Transparent and Open Multi-Agent Framework | arXiv | Link | Link |
2025.01 | Search-o1: Agentic Search-Enhanced Large Reasoning Models | arXiv | Link | Link |
2022.10 | ReAct: Synergizing Reasoning and Acting in Language Models | ICLR'2023 | Link | Link |
2022.12 | Interleaving Retrieval with Chain-of-Thought Reasoning for Knowledge-Intensive Multi-Step Questions | ACL'2023 | Link | Link |
2022.10 | Decomposed Prompting: A Modular Approach for Solving Complex Tasks | ICLR'2023 | Link | Link |
Name | Type | Link |
---|---|---|
NQ | One-hop QA | Link |
TriviaQA | One-hop QA | Link |
PopQA | One-hop QA | Link |
SQuAD | One-hop QA | Link |
CommonsenseQA | One-hop QA | Link |
HotpotQA | Multi-hop QA | Link |
Bamboogle | Multi-hop QA | Link |
2WikiMultiHopQA | Multi-hop QA | Link |
MuSiQue | Multi-hop QA | Link |
Time | Title | Venue | Paper |
---|---|---|---|
2025.06 | Reasoning RAG via System 1 or System 2: A Survey on Reasoning Agentic Retrieval-Augmented Generation for Industry Challenges | arXiv | Link |
- OpenRLHF: https://github.com/OpenRLHF/OpenRLHF
- FlashRAG_datasets: https://huggingface.co/datasets/RUC-NLPIR/FlashRAG_datasets
- verl: https://github.com/volcengine/verl
- LLaMA-Factory: https://github.com/hiyouga/LLaMA-Factory
- EasyRL: https://github.com/alibaba/EasyReinforcementLearning