Skip to content

Commit 454ce28

Browse files
authored
Merge pull request #26 from ZV-Liu/main
Add support for local vLLM inference and Chinese README documentation
2 parents 58c4e47 + b773862 commit 454ce28

File tree

8 files changed

+550
-90
lines changed

8 files changed

+550
-90
lines changed

README.md

Lines changed: 133 additions & 80 deletions
Original file line numberDiff line numberDiff line change
@@ -1,5 +1,7 @@
11
# DeepResearchAgent
22

3+
English | [简体中文](README_CN.md)
4+
35
## Introduction
46

57
DeepResearchAgent is a hierarchical multi-agent system designed not only for deep research tasks but also for general-purpose task solving. The framework leverages a top-level planning agent to coordinate multiple specialized lower-level agents, enabling automated task decomposition and efficient execution across diverse and complex domains.
@@ -13,135 +15,203 @@ DeepResearchAgent is a hierarchical multi-agent system designed not only for dee
1315
The system adopts a two-layer structure:
1416

1517
### 1. Top-Level Planning Agent
16-
- Responsible for understanding, decomposing, and planning the overall workflow for a given task.
17-
- Breaks down tasks into manageable sub-tasks and assigns them to appropriate lower-level agents.
18-
- Dynamically coordinates the collaboration among agents to ensure smooth task completion.
18+
19+
* Responsible for understanding, decomposing, and planning the overall workflow for a given task.
20+
* Breaks down tasks into manageable sub-tasks and assigns them to appropriate lower-level agents.
21+
* Dynamically coordinates the collaboration among agents to ensure smooth task completion.
1922

2023
### 2. Specialized Lower-Level Agents
21-
- **Deep Analyzer**
22-
- Performs in-depth analysis of input information, extracting key insights and potential requirements.
23-
- Supports analysis of various data types, including text and structured data.
24-
- **Deep Researcher**
25-
- Conducts thorough research on specified topics or questions, retrieving and synthesizing high-quality information.
26-
- Capable of generating research reports or knowledge summaries automatically.
27-
- **Browser Use**
28-
- Automates browser operations, supporting web search, information extraction, and data collection tasks.
29-
- Assists the Deep Researcher in acquiring up-to-date information from the internet.
24+
25+
* **Deep Analyzer**
26+
27+
* Performs in-depth analysis of input information, extracting key insights and potential requirements.
28+
* Supports analysis of various data types, including text and structured data.
29+
* **Deep Researcher**
30+
31+
* Conducts thorough research on specified topics or questions, retrieving and synthesizing high-quality information.
32+
* Capable of generating research reports or knowledge summaries automatically.
33+
* **Browser Use**
34+
35+
* Automates browser operations, supporting web search, information extraction, and data collection tasks.
36+
* Assists the Deep Researcher in acquiring up-to-date information from the internet.
3037

3138
## Features
39+
3240
- Hierarchical agent collaboration for complex and dynamic task scenarios
3341
- Extensible agent system, allowing easy integration of additional specialized agents
3442
- Automated information analysis, research, and web interaction capabilities
3543
- Secure Python code execution environment for tools, featuring configurable import controls, restricted built-ins, attribute access limitations, and resource limits. (See [PythonInterpreterTool Sandboxing](./docs/python_interpreter_sandbox.md) for details).
3644

3745

3846
## Updates
39-
* 2025.06.01
40-
- Update the browser-use to 0.1.48.
41-
* 2025.05.30
42-
- Convert the sub agent to a function call, so that the planning agent can call the sub agents directly. Planning agent can now be gpt-4.1 or gemini-2.5-pro.
43-
* 2025.05.27
44-
- Updated the available remote API calls to support OpenAI, Anthropic, and Google LLMs.
45-
- Added support for local Qwen models (via vllm, compatible with OpenAI API format, see details at the end of README)
47+
48+
* **2025.06.01**: Update the browser-use to 0.1.48.
49+
* **2025.05.30**: Convert the sub agent to a function call. Planning agent can now be gpt-4.1 or gemini-2.5-pro.
50+
* **2025.05.27**: Support OpenAI, Anthropic, Google LLMs, and local Qwen models (via vLLM, see details in [Usage](#usage)).
4651

4752
## TODO List
48-
- [x] Asynchronous feature completed
49-
- [ ] Image Generation Agent to be developed
50-
- [ ] MCP in progress
51-
- [ ] AI4Research Agent to be developed
52-
- [ ] Novel Writing Agent to be developed
53+
54+
* [x] Asynchronous feature completed
55+
* [ ] Image Generation Agent to be developed
56+
* [ ] MCP in progress
57+
* [ ] AI4Research Agent to be developed
58+
* [ ] Novel Writing Agent to be developed
5359

5460
## Installation
5561

5662
### Prepare Environment
57-
```
63+
64+
```bash
5865
# poetry install environment
5966
conda create -n dra python=3.11
6067
conda activate dra
6168
make install
6269

63-
# (Optional) You can also use requirements.txt to setup the environment
70+
# (Optional) You can also use requirements.txt
6471
conda create -n dra python=3.11
6572
conda activate dra
6673
make install-requirements
6774

68-
# If you encounter any issues with Playwright during installation, you can install it manually:
75+
# playwright install if needed
6976
pip install playwright
7077
playwright install chromium --with-deps --no-shell
7178
```
7279

73-
### Put `.env` in the root
80+
### Set Up `.env`
7481

75-
`.env` should be like
76-
```
77-
PYTHONWARNINGS=ignore # ignore warnings
78-
ANONYMIZED_TELEMETRY=false # disable telemetry
79-
HUGGINEFACE_API_KEY=abcabcabc # your huggingface api key
82+
```bash
83+
PYTHONWARNINGS=ignore
84+
ANONYMIZED_TELEMETRY=false
85+
HUGGINEFACE_API_KEY=abcabcabc
8086
OPENAI_API_BASE=https://api.openai.com/v1
81-
OPENAI_API_KEY=abcabcabc # your openai api key
87+
OPENAI_API_KEY=abcabcabc
8288
ANTHROPIC_API_BASE=https://api.anthropic.com
83-
ANTHROPIC_API_KEY=abcabcabc # your anthropic api key
89+
ANTHROPIC_API_KEY=abcabcabc
8490
GOOGLE_APPLICATION_CREDENTIALS=/your/user/path/.config/gcloud/application_default_credentials.json
8591
GOOGLE_API_BASE=https://generativelanguage.googleapis.com
86-
GOOGLE_API_KEY=abcabcabc # your google api key
92+
GOOGLE_API_KEY=abcabcabc
8793
```
8894

89-
```
90-
Note: Maybe you have some problems using google api, here is the reference
91-
1. Get api key from https://aistudio.google.com/app/apikey
95+
Refer to:
96+
97+
* [https://aistudio.google.com/app/apikey](https://aistudio.google.com/app/apikey)
98+
* [https://cloud.google.com/docs/authentication/application-default-credentials?hl=zh-cn](https://cloud.google.com/docs/authentication/application-default-credentials?hl=zh-cn)
9299

93-
2. Get `application_default_credentials.json`. Here is the reference: https://cloud.google.com/docs/authentication/application-default-credentials?hl=zh-cn
94-
# Creating a Google API key requires it to be linked to a project, but the project may also need Vertex AI authorization, so it is necessary to obtain the appropriate credentials.
100+
```bash
95101
brew install --cask google-cloud-sdk
96102
gcloud init
97103
gcloud auth application-default login
98104
```
99105

100106
## Usage
101107

102-
### Deep Researcher for "AI Agent" as an example
103-
```
108+
### Deep Researcher for "AI Agent"
109+
110+
```bash
104111
python examples/run_example.py
105112
```
106113

107-
### GAIA as an example
114+
### GAIA Evaluation Example
108115

109-
```
116+
```bash
110117
# Download GAIA
111-
mkdir data | cd data
118+
mkdir data && cd data
112119
git clone https://huggingface.co/datasets/gaia-benchmark/GAIA
113120

114-
# Run the script in the examples
121+
# Run
115122
python examples/run_gaia.py
116123
```
117124

125+
### Deploying Qwen Models via vLLM
126+
127+
#### Step 1: Launch the vLLM Inference Service
128+
129+
```bash
130+
nohup bash -c 'CUDA_VISIBLE_DEVICES=0,1 python -m vllm.entrypoints.openai.api_server \
131+
--model /input0/Qwen3-32B \
132+
--served-model-name Qwen \
133+
--host 0.0.0.0 \
134+
--port 8000 \
135+
--max-num-seqs 16 \
136+
--enable-auto-tool-choice \
137+
--tool-call-parser hermes \
138+
--tensor_parallel_size 2' > vllm_qwen.log 2>&1 &
139+
```
140+
141+
Update `.env`:
142+
143+
```bash
144+
QWEN_API_BASE=http://localhost:8000/v1
145+
QWEN_API_KEY="abc"
146+
```
147+
148+
#### Step 2: Launch the Agent Service
149+
150+
```bash
151+
python main.py
152+
```
153+
154+
Example command:
155+
156+
```bash
157+
Use deep_researcher_agent to search the latest papers on the topic of 'AI Agent' and then summarize it.
158+
```
159+
118160
## Experiments
161+
119162
We evaluated our agent on the GAIA validation set and achieved state-of-the-art performance on May 10th.
163+
120164
<p align="center">
121165
<img src="./docs/gaia_benchmark.png" alt="GAIA Example Result" width="700"/>
122166
</p>
123167

168+
## Questions
169+
170+
### 1. About Qwen Models
171+
172+
Our framework now supports:
173+
174+
* qwen2.5-7b-instruct
175+
* qwen2.5-14b-instruct
176+
* qwen2.5-32b-instruct
177+
178+
Update your config:
179+
180+
```toml
181+
model_id = "qwen2.5-7b-instruct"
182+
```
183+
184+
### 2. Browser Use
185+
186+
If problems occur, reinstall:
187+
188+
```bash
189+
pip install "browser-use[memory]"==0.1.48
190+
pip install playwright
191+
playwright install chromium --with-deps --no-shell
192+
```
193+
194+
### 3. Sub-Agent Calling
195+
196+
Function-calling is now supported natively by GPT-4.1 / Gemini 2.5 Pro. Claude-3.7-Sonnet is also recommended.
197+
124198
## Acknowledgement
125-
DeepResearchAgent is primarily inspired by the architecture of smolagents. The following improvements have been made:
126-
- The codebase of smolagents has been modularized for better structure and organization.
127-
- The original synchronous framework has been refactored into an asynchronous one.
128-
- The multi-agent setup process has been optimized to make it more user-friendly and efficient.
129199

130-
We would like to express our gratitude to the following open source projects, which have greatly contributed to the development of this work:
131-
- [smolagents](https://github.com/huggingface/smolagents) - A lightweight agent framework.
132-
- [OpenManus](https://github.com/mannaandpoem/OpenManus) - An asynchronous agent framework.
133-
- [browser-use](https://github.com/browser-use/browser-use) - An AI-powered browser automation tool.
134-
- [crawl4ai](https://github.com/unclecode/crawl4ai) - A web crawling library for AI applications.
135-
- [markitdown](https://github.com/microsoft/markitdown) - A tool for converting files to Markdown format.
200+
DeepResearchAgent is inspired by and improved upon:
136201

137-
We sincerely appreciate the efforts of all contributors and maintainers of these projects for their commitment to advancing AI technologies and making them available to the wider community.
202+
* [smolagents](https://github.com/huggingface/smolagents)
203+
* [OpenManus](https://github.com/mannaandpoem/OpenManus)
204+
* [browser-use](https://github.com/browser-use/browser-use)
205+
* [crawl4ai](https://github.com/unclecode/crawl4ai)
206+
* [markitdown](https://github.com/microsoft/markitdown)
138207

139208
## Contribution
140209

141-
Contributions and suggestions are welcome! Feel free to open issues or submit pull requests to improve the project.
210+
Contributions and suggestions are welcome! Feel free to open issues or submit pull requests.
142211

143212
## Cite
144-
```
213+
214+
```bibtex
145215
@misc{DeepResearchAgent,
146216
title = {`DeepResearchAgent`: A Hierarchical Multi-Agent Framework for General-purpose Task Solving.},
147217
author = {Wentao Zhang, Ce Cui, Yang Liu, Bo An},
@@ -150,25 +220,8 @@ Contributions and suggestions are welcome! Feel free to open issues or submit pu
150220
}
151221
```
152222

153-
## Questions
223+
---
154224

155-
### 1. About Qwen models
156-
Our framework now supports local Qwen models, including qwen2.5-7b-instruct, qwen2.5-14b-instruct, and qwen2.5-32b-instruct.
157-
```
158-
# Configure your config file to use qwen's model
159-
model_id = "qwen2.5-7b-instruct"
160-
```
161-
162-
### 2. About browser use
163-
If you are having problems with your browser, please reinstall the browser tool.
164-
```
165-
pip install "browser-use[memory]"==0.1.48
166-
167-
# install playwright
168-
pip install playwright
169-
playwright install chromium --with-deps --no-shell
170-
```
225+
### 🇨🇳 中文版说明文档
171226

172-
### 3. About calling for sub agents
173-
I’ve found that both OpenAI and Google models are strictly trained for function calling, which means they no longer use JSON outputs to invoke sub-agents. Therefore, I recommend using Claude-3.7-Sonnet as the planning agent whenever possible.
174-
This issue has been fixed. The planning agent can now call the sub-agents directly, so you can use gpt-4.1 or gemini-2.5-pro as the planning agent.
227+
如果你更习惯阅读中文说明文档,请查阅 [README.zh.md](./README.zh.md)

0 commit comments

Comments
 (0)