A multi-step autonomous research agent. It takes a high-level user query, interactively refines the research scope, dynamically plans a report outline, gathers information from the web, and writes a comprehensive, cited report in Markdown.
It uses the Tavily API for web search and scraping and the Google Gemini API as the LLM.
It is deployed on Hugging Face Spaces.
The UI is barebones, but it works.
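For orientation, here is a rough sketch of how the two external services mentioned above are typically called from Python; the client setup, model name, and parameters used in this repo may differ:

```python
import os
import google.generativeai as genai
from tavily import TavilyClient

# Search the web with Tavily; max_results is an illustrative parameter.
tavily = TavilyClient(api_key=os.environ["TAVILY_API_KEY"])
search = tavily.search("recent advances in battery recycling", max_results=5)
snippets = [r["content"] for r in search["results"]]

# Hand the retrieved snippets to Gemini; the model name here is an assumption.
genai.configure(api_key=os.environ["GOOGLE_API_KEY"])
model = genai.GenerativeModel("gemini-1.5-flash")
reply = model.generate_content("Summarize these search snippets:\n\n" + "\n\n".join(snippets))
print(reply.text)
```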
- Human-in-the-Loop: Starts by asking clarifying questions to narrow down the user's intent.
- Dynamic Outline Planning: Generates a structured report outline based on initial search results, then expands each section with key questions.
- Deep Research: Performs targeted, deep-dive searches for each section of the report.
- Retrieval-Augmented Generation (RAG): Chunks and embeds research content into a vector store (FAISS) to find the most relevant information for writing (a minimal sketch follows this list).
- Source Citation: Meticulously cites every factual statement, linking it back to the source URL.
- Context-Aware Writing: Keeps track of previously written sections to maintain flow and avoid repetition.
- PDF Export: Converts the final Markdown report into a high-quality, well-formatted PDF with a table of contents using Pandoc and LaTeX.
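To illustrate the RAG step, here is a minimal sketch of chunking, embedding, and retrieval with FAISS. The chunk sizes and the sentence-transformers embedding model are assumptions; rag_pipeline.py may use a different embedder (for example, a Google embedding model):

```python
import faiss
from sentence_transformers import SentenceTransformer

def chunk(text, size=800, overlap=100):
    """Split scraped text into overlapping character windows (sizes are illustrative)."""
    step = size - overlap
    return [text[i:i + size] for i in range(0, len(text), step)]

docs = ["...scraped page one...", "...scraped page two..."]
chunks = [c for doc in docs for c in chunk(doc)]

# Embed every chunk and index the vectors in FAISS.
embedder = SentenceTransformer("all-MiniLM-L6-v2")  # assumed embedding model
vectors = embedder.encode(chunks, convert_to_numpy=True).astype("float32")
index = faiss.IndexFlatL2(vectors.shape[1])
index.add(vectors)

# Retrieve the chunks most relevant to one of a section's key questions.
query = embedder.encode(["What are the key findings?"], convert_to_numpy=True).astype("float32")
_, ids = index.search(query, 3)
relevant_chunks = [chunks[i] for i in ids[0]]
```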
- Python 3.8+
- Pandoc: Required for converting the Markdown report to PDF (the conversion step is sketched after this list).
- A LaTeX distribution, such as MiKTeX (for Windows) or TeX Live (for macOS and Linux). This is required by Pandoc to create PDFs.
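The conversion itself amounts to a single Pandoc invocation. A minimal sketch of what export.py might run is below; the file names and the choice of xelatex are illustrative assumptions:

```python
import subprocess

# Convert the Markdown report to a PDF with a table of contents.
# Pandoc delegates the PDF rendering to a LaTeX engine (xelatex here).
subprocess.run(
    ["pandoc", "report.md", "-o", "report.pdf", "--toc", "--pdf-engine=xelatex"],
    check=True,
)
```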
- Clone the repository:

      git clone https://github.com/rajput-musa/DeepResearch-Agent.git
      cd DeepResearch-Agent

- Create a virtual environment and activate it:

      python -m venv venv
      venv\Scripts\activate       # On Windows
      # source venv/bin/activate  # On macOS/Linux

- Install the required Python packages:

      pip install -r requirements.txt

- Set up your environment variables: create a file named `.env` in the root of the project directory and add your API keys:

      GOOGLE_API_KEY="YOUR_GOOGLE_API_KEY"
      TAVILY_API_KEY="YOUR_TAVILY_API_KEY"
You can get your API keys from Google AI Studio (https://aistudio.google.com/) for the Gemini API and from the Tavily dashboard (https://app.tavily.com/).
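For reference, this is roughly how the keys would be read at runtime, assuming the project loads the .env file with python-dotenv (config.py may handle this differently):

```python
import os
from dotenv import load_dotenv  # provided by the python-dotenv package

# Read GOOGLE_API_KEY and TAVILY_API_KEY from .env into the process environment.
load_dotenv()

google_key = os.getenv("GOOGLE_API_KEY")
tavily_key = os.getenv("TAVILY_API_KEY")
if not (google_key and tavily_key):
    raise RuntimeError("Set both GOOGLE_API_KEY and TAVILY_API_KEY in .env")
```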
Once you have completed the installation and setup steps, you can run the application with the following command:
    python app.py
This will start the Gradio web server. Open the provided URL in your browser to start using the Mini-DeepSearch-Agent.
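For context, here is a hypothetical, heavily simplified shape of what app.py sets up; the real UI also handles clarifying questions, per-section progress, and the PDF download button:

```python
import gradio as gr

def run_research(topic):
    # Placeholder: the real app hands the topic to the research agent here.
    return f"# Report on {topic}\n\n(report sections would stream in here)"

with gr.Blocks(title="Mini-DeepSearch-Agent") as demo:
    topic = gr.Textbox(label="Research topic")
    report = gr.Markdown()
    gr.Button("Start research").click(run_research, inputs=topic, outputs=report)

if __name__ == "__main__":
    demo.launch()  # serves the UI locally and prints the URL to open
```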
- Initial Topic: You provide a research topic.
- Clarification: The agent asks clarifying questions to narrow down the scope and understand your requirements (sketched after this list).
- Research & Report Generation: Based on your answers, the agent conducts research and generates a report section by section. You can see the progress in the UI.
- Download Report: Once the report is complete, a "Download Report as PDF" button will appear. Click it to download the report.
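The clarification step above might look roughly like the following; the actual prompts live in prompts.py, and the model name is an assumption:

```python
import google.generativeai as genai

genai.configure(api_key="YOUR_GOOGLE_API_KEY")
model = genai.GenerativeModel("gemini-1.5-flash")  # assumed model name

topic = "Impact of battery recycling on EV supply chains"
prompt = (
    "You are a research assistant. Before researching the topic below, "
    "ask the user three short clarifying questions about scope, depth, and audience.\n\n"
    f"Topic: {topic}"
)
print(model.generate_content(prompt).text)
```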
    .
    ├── .env.example
    ├── app.py
    ├── research_agent
    │   ├── agent.py
    │   ├── config.py
    │   ├── export.py
    │   ├── prompts.py
    │   ├── rag_pipeline.py
    │   └── tools.py
    ├── LICENSE
    └── requirements.txt