DeepResearchAgent is a hierarchical multi-agent system designed not only for deep research tasks but also for general-purpose task solving. The framework leverages a top-level planning agent to coordinate multiple specialized lower-level agents, enabling automated task decomposition and efficient execution across diverse and complex domains.
The system adopts a two-layer structure:
### 1. Top-Level Planning Agent

* Responsible for understanding, decomposing, and planning the overall workflow for a given task.
* Breaks down tasks into manageable sub-tasks and assigns them to appropriate lower-level agents.
* Dynamically coordinates the collaboration among agents to ensure smooth task completion.
### 2. Specialized Lower-Level Agents

* **Deep Analyzer**

  * Performs in-depth analysis of input information, extracting key insights and potential requirements.
  * Supports analysis of various data types, including text and structured data.
* **Deep Researcher**

  * Conducts thorough research on specified topics or questions, retrieving and synthesizing high-quality information.
  * Capable of generating research reports or knowledge summaries automatically.
* **Browser Use**

  * Automates browser operations, supporting web search, information extraction, and data collection tasks.
  * Assists the Deep Researcher in acquiring up-to-date information from the internet.
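
The two-layer structure described above can be illustrated with a minimal sketch. It is not the project's actual implementation: the `PlanningAgent` class, the `plan` method, and the three placeholder coroutines are hypothetical names chosen for this example, and the fixed plan stands in for what would normally be an LLM-driven decomposition.

```python
import asyncio
from typing import Awaitable, Callable, Dict, List, Tuple

# Hypothetical sub-agent signature: takes a sub-task, returns a result string.
SubAgent = Callable[[str], Awaitable[str]]


async def deep_analyzer(task: str) -> str:
    # Placeholder for in-depth analysis of the input.
    return f"[analysis] {task}"


async def deep_researcher(task: str) -> str:
    # Placeholder for retrieval and synthesis of information.
    return f"[research] {task}"


async def browser_use(task: str) -> str:
    # Placeholder for automated browser operations.
    return f"[browser] {task}"


class PlanningAgent:
    """Top-level agent: decomposes a task and dispatches the sub-tasks."""

    def __init__(self, agents: Dict[str, SubAgent]) -> None:
        self.agents = agents

    def plan(self, task: str) -> List[Tuple[str, str]]:
        # Assumption: a real planner would ask an LLM for this plan.
        return [
            ("browser_use", f"collect sources for: {task}"),
            ("deep_researcher", f"synthesize findings on: {task}"),
            ("deep_analyzer", f"extract key insights about: {task}"),
        ]

    async def run(self, task: str) -> List[str]:
        results = []
        for agent_name, sub_task in self.plan(task):
            # Sub-tasks run in order, since later steps may depend on earlier ones.
            results.append(await self.agents[agent_name](sub_task))
        return results


planner = PlanningAgent(
    {
        "deep_analyzer": deep_analyzer,
        "deep_researcher": deep_researcher,
        "browser_use": browser_use,
    }
)

if __name__ == "__main__":
    for line in asyncio.run(planner.run("solid-state batteries")):
        print(line)
```

Keeping the sub-agents behind a name-to-callable registry means the planner only needs to know agent names, so a new specialized agent can be added without changing the dispatch loop.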
## Features
- Hierarchical agent collaboration for complex and dynamic task scenarios
- Automated information analysis, research, and web interaction capabilities
- Secure Python code execution environment for tools, featuring configurable import controls, restricted built-ins, attribute access limitations, and resource limits. (See [PythonInterpreterTool Sandboxing](./docs/python_interpreter_sandbox.md) for details).
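
As an illustration of what configurable import controls can look like, the sketch below statically scans code for imports outside an allow-list before it would be executed. It is only an assumption-laden example, not the PythonInterpreterTool itself: `ALLOWED_IMPORTS` and `find_disallowed_imports` are hypothetical names, and a static check like this is just one layer, since dynamic imports would still need the restricted built-ins mentioned above.

```python
import ast
from typing import Iterable, List

# Hypothetical allow-list; the project's actual defaults may differ.
ALLOWED_IMPORTS = {"math", "json", "statistics"}


def find_disallowed_imports(
    source: str, allowed: Iterable[str] = ALLOWED_IMPORTS
) -> List[str]:
    """Return the top-level module names imported by `source` that are not allowed."""
    allowed_set = set(allowed)
    violations = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            # Relative imports are always reported as violations in this sketch.
            names = ["<relative>"] if node.level else [node.module]
        else:
            continue
        for name in names:
            root = name.split(".")[0]
            if root not in allowed_set:
                violations.append(root)
    return violations


if __name__ == "__main__":
    snippet = "import math\nimport os\nfrom subprocess import run\n"
    print(find_disallowed_imports(snippet))
```

Rejecting code before execution is cheap, but it should complement, not replace, the runtime limits described in the linked sandboxing document.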
## Updates

* **2025.06.01**: Updated browser-use to 0.1.48.
* **2025.05.30**: Converted the sub-agents to function calls so that the planning agent can call them directly. The planning agent can now be gpt-4.1 or gemini-2.5-pro.
* **2025.05.27**: Added support for OpenAI, Anthropic, and Google LLMs via remote API calls, and for local Qwen models (via vLLM, which is compatible with the OpenAI API format; see details in [Usage](#usage)).
## TODO List

* [x] Asynchronous feature completed
* [ ] Image Generation Agent to be developed
* [ ] MCP in progress
* [ ] AI4Research Agent to be developed
* [ ] Novel Writing Agent to be developed
## Installation
### Prepare Environment

```bash
# Install the environment with Poetry
conda create -n dra python=3.11
conda activate dra
make install

# (Optional) You can also use requirements.txt
conda create -n dra python=3.11
conda activate dra
make install-requirements

# If you encounter any issues with Playwright during installation, you can install it manually:
```
2. Get `application_default_credentials.json`. Here is the reference: https://cloud.google.com/docs/authentication/application-default-credentials?hl=zh-cn
Creating a Google API key requires linking it to a project, and the project may also need Vertex AI authorization, so you need to obtain the appropriate credentials.
Function-calling is now supported natively by GPT-4.1 / Gemini 2.5 Pro. Claude-3.7-Sonnet is also recommended.
## Acknowledgement

DeepResearchAgent is primarily inspired by the architecture of smolagents. The following improvements have been made:

- The smolagents codebase has been modularized for better structure and organization.
- The original synchronous framework has been refactored into an asynchronous one.
- The multi-agent setup process has been streamlined to make it more user-friendly and efficient.

We would like to express our gratitude to the following open source projects, which have greatly contributed to the development of this work:

- [smolagents](https://github.com/huggingface/smolagents) - A lightweight agent framework.
- [OpenManus](https://github.com/mannaandpoem/OpenManus) - An asynchronous agent framework.
- [browser-use](https://github.com/browser-use/browser-use) - An AI-powered browser automation tool.
- [crawl4ai](https://github.com/unclecode/crawl4ai) - A web crawling library for AI applications.
- [markitdown](https://github.com/microsoft/markitdown) - A tool for converting files to Markdown format.

We sincerely appreciate the efforts of all contributors and maintainers of these projects for their commitment to advancing AI technologies and making them available to the wider community.

I’ve found that both OpenAI and Google models are strictly trained for function calling, which means they no longer use JSON outputs to invoke sub-agents. Therefore, I recommend using Claude-3.7-Sonnet as the planning agent whenever possible.
This issue has been fixed. The planning agent can now call the sub-agents directly, so you can use gpt-4.1 or gemini-2.5-pro as the planning agent.
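
To make the idea of calling sub-agents directly more concrete, here is a minimal sketch of exposing sub-agents as function-call tools and routing a model-issued call to the right one. It assumes the OpenAI-style tool format (a `function` entry with a JSON-encoded `arguments` string); the agent functions, `SUB_AGENTS`, `tool_schema`, and `dispatch` are hypothetical names for illustration and do not come from the DeepResearchAgent codebase.

```python
import json
from typing import Any, Callable, Dict, List


# Hypothetical sub-agent implementations; names and signatures are illustrative.
def deep_researcher_agent(task: str) -> str:
    return f"research report for: {task}"


def deep_analyzer_agent(task: str) -> str:
    return f"analysis of: {task}"


SUB_AGENTS: Dict[str, Callable[[str], str]] = {
    "deep_researcher_agent": deep_researcher_agent,
    "deep_analyzer_agent": deep_analyzer_agent,
}


def tool_schema(name: str, description: str) -> Dict[str, Any]:
    """Describe one sub-agent in the OpenAI-style function-calling format."""
    return {
        "type": "function",
        "function": {
            "name": name,
            "description": description,
            "parameters": {
                "type": "object",
                "properties": {
                    "task": {"type": "string", "description": "Sub-task to solve."}
                },
                "required": ["task"],
            },
        },
    }


# The list a planning agent would pass to the model as its available tools.
TOOLS: List[Dict[str, Any]] = [
    tool_schema("deep_researcher_agent", "Research a topic and summarize findings."),
    tool_schema("deep_analyzer_agent", "Analyze input and extract key insights."),
]


def dispatch(tool_call: Dict[str, Any]) -> str:
    """Route a model-issued tool call to the matching sub-agent."""
    name = tool_call["function"]["name"]
    # Arguments arrive as a JSON-encoded string in OpenAI-style tool calls.
    arguments = json.loads(tool_call["function"]["arguments"])
    if name not in SUB_AGENTS:
        raise KeyError(f"unknown sub-agent: {name}")
    return SUB_AGENTS[name](**arguments)


if __name__ == "__main__":
    call = {
        "function": {
            "name": "deep_researcher_agent",
            "arguments": json.dumps({"task": "survey vLLM deployment options"}),
        }
    }
    print(dispatch(call))
```

With this shape, the planning agent no longer has to emit free-form JSON: the model selects a tool by name and the dispatcher validates it against the registry.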