Skip to content

jh941213/TTD-DR

Folders and files

NameName
Last commit message
Last commit date

Latest commit

ย 

History

1 Commit
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 

Repository files navigation

๐Ÿ”ฌ TTD-DR: Test-Time Diffusion Deep Researcher

Advanced AI-powered research system using diffusion-based iterative refinement

๐Ÿ“‹ Overview

TTD-DR (Test-Time Diffusion Deep Researcher) is an innovative research system that applies diffusion-based algorithms to generate comprehensive, high-quality research reports. Unlike traditional retrieval-augmented generation (RAG) systems, TTD-DR uses a draft-centric approach where an evolving draft dynamically guides the research process through multiple iterations.

๐ŸŽฏ Key Features

  • ๐Ÿ”„ Iterative Draft Refinement: Starts with a "noisy" initial draft and progressively refines it
  • ๐ŸŽฏ Draft-Centric Search: The evolving draft guides what information to search for next
  • ๐Ÿ” Multi-Engine Search: Integrates Tavily, DuckDuckGo, and Naver search engines
  • ๐Ÿง  Gap Analysis: Automatically identifies knowledge gaps and fills them systematically
  • โš–๏ธ Quality Evaluation: Continuous assessment of research completeness and quality
  • ๐ŸŒ Multi-Language Support: Works with English, Korean, and other languages
  • ๐Ÿš€ Async Support: Built with modern async/await patterns for optimal performance

๐Ÿงฎ Algorithm Highlights

The system implements the Denoising with Retrieval (Draft-Centric Approach) algorithm:

  1. Initialize: Generate a noisy initial draft Rโ‚€
  2. Analyze: Identify gaps in the current draft
  3. Search: Query multiple search engines to fill identified gaps
  4. Denoise: Update the draft with new information
  5. Evaluate: Assess quality and determine if more iterations are needed
  6. Iterate: Repeat until quality threshold is met or max iterations reached

๐Ÿš€ Quick Start

1. Installation

# Clone the repository
git clone <repository-url>
cd ttd-dr

# Install dependencies
pip install -r requirements.txt

2. Environment Setup

Copy the example environment file and configure your API keys:

cp .env.example .env

Edit .env with your API keys:

# Required: Choose one
AZURE_OPENAI_ENDPOINT=https://your-resource.openai.azure.com/
AZURE_OPENAI_API_KEY=your-azure-openai-api-key
AZURE_OPENAI_DEPLOYMENT=gpt-4o-mini

# OR
OPENAI_API_KEY=sk-your-openai-api-key-here

# Optional but recommended
TAVILY_API_KEY=tvly-your-tavily-api-key
NAVER_CLIENT_ID=your-naver-client-id
NAVER_CLIENT_SECRET=your-naver-client-secret

3. Run the Simple Chatbot

The easiest way to get started is with our simple chatbot interface:

python chatbot.py

Example interaction:

๐Ÿค– TTD-DR Deep Research Chatbot
============================================================
Welcome to the Test-Time Diffusion Deep Researcher!
This chatbot can conduct in-depth research on any topic.

๐Ÿค” Your research question: What are the latest developments in artificial intelligence in 2024?

๐Ÿ” Research Query: What are the latest developments in artificial intelligence in 2024?
โณ Starting deep research... (this may take a few minutes)
๐Ÿ“Š The system will show progress updates during research
------------------------------------------------------------

[Research process with real-time updates...]

============================================================
๐Ÿ“‹ RESEARCH REPORT COMPLETED
============================================================
๐Ÿ“„ Report Length: 3,247 characters
๐Ÿ”„ Iterations: 3
๐Ÿ“š Sources Used: 12
โฑ๏ธ  Execution Time: 87.3 seconds
๐ŸŽฏ Status: completed

๐Ÿ“– RESEARCH REPORT:
------------------------------
# Latest Developments in Artificial Intelligence (2024)

## Executive Summary
The year 2024 has marked significant advances in artificial intelligence...

[Comprehensive research report continues...]

4. Quick Example Mode

For a quick demonstration:

python chatbot.py --example

๐Ÿ“š Usage Examples

Interactive Chatbot Commands

Command Description
help Show welcome message and instructions
status Check system status and API configuration
examples Display example research queries
quit / exit Exit the chatbot

Example Research Queries

  • Technology: "What are the latest developments in artificial intelligence in 2024?"
  • Science: "How does climate change affect global food security?"
  • Comparison: "What are the key differences between quantum and classical computing?"
  • Analysis: "What are the ethical implications of genetic engineering?"
  • Current Events: "Describe recent advances in space exploration technology"

API Usage

For programmatic access:

import asyncio
from langgraph_ttd_dr.interface import TTDResearcher
from langgraph_ttd_dr.client_factory import create_openai_client

async def research_example():
    # Create client and researcher
    client = create_openai_client()
    researcher = TTDResearcher(
        client=client,
        max_iterations=5,
        max_sources=15
    )
    
    # Conduct research
    report, metadata = await researcher.research(
        "What is the current state of renewable energy technology?"
    )
    
    print(f"Research completed with {len(metadata['all_sources'])} sources")
    print(f"Iterations: {metadata['iterations']}")
    print(f"Report: {report}")

# Run the example
asyncio.run(research_example())

๐Ÿ—๏ธ Architecture

Core Components

๐Ÿ“ฆ langgraph_ttd_dr/
โ”œโ”€โ”€ ๐ŸŽ›๏ธ interface.py          # Main TTDResearcher class
โ”œโ”€โ”€ ๐Ÿ”— client_factory.py     # OpenAI/Azure client management
โ”œโ”€โ”€ ๐Ÿ“Š state.py              # Research state management
โ”œโ”€โ”€ ๐Ÿ”„ workflow.py           # LangGraph workflow definition
โ”œโ”€โ”€ ๐Ÿงฉ nodes.py              # Individual workflow nodes
โ”œโ”€โ”€ ๐Ÿ’ฌ prompts.py            # Centralized prompt management
โ”œโ”€โ”€ ๐Ÿ” tools.py              # Web search tools
โ””โ”€โ”€ ๐Ÿ› ๏ธ utils.py              # Utility functions

๐Ÿ“„ chatbot.py                # Simple usage example
๐Ÿ“„ interactive_chatbot.py    # Advanced interactive interface

Workflow Nodes

  1. QueryClarificationNode: Improves and clarifies the research question
  2. PlannerNode: Creates a structured research plan
  3. NoisyDraftGeneratorNode: Generates the initial draft Rโ‚€
  4. DraftBasedQuestionGeneratorNode: Identifies gaps and generates search queries
  5. SearchAgentNode: Executes multi-engine web searches
  6. DenoisingUpdaterNode: Updates the draft with new information
  7. IterationControllerNode: Decides whether to continue or finalize
  8. ReportGeneratorNode: Produces the final research report

Search Engines

  • ๐Ÿ” Tavily: High-quality, research-focused search results
  • ๐Ÿฆ† DuckDuckGo: Privacy-focused web search (no API key required)
  • ๐Ÿ” Naver: Korean and Asian content specialist

๐Ÿ“Š Configuration Options

Research Parameters

Parameter Default Description
max_iterations 5 Maximum research iterations
max_sources 15 Maximum sources to collect
search_results_per_gap 3 Results per knowledge gap
recursion_limit 50 LangGraph recursion limit

Quality Metrics

TTD-DR tracks multiple quality dimensions:

  • Completeness: How thoroughly the topic is covered
  • Accuracy: Factual correctness of information
  • Relevance: How well content matches the query
  • Coherence: Logical flow and organization
  • Citation Quality: Source reliability and diversity

๐Ÿ”ง System Requirements

  • Python: 3.8 or higher
  • Dependencies: See requirements.txt
  • APIs: OpenAI or Azure OpenAI (required), search APIs (optional)
  • Memory: 2GB+ RAM recommended for complex research

โš™๏ธ Advanced Configuration

Custom Search Engines

researcher = TTDResearcher(
    client=client,
    search_engines=['tavily', 'duckduckgo'],  # Customize search engines
    search_results_per_gap=5,                 # More results per gap
    max_iterations=10                         # Longer research
)

Custom Prompts

You can customize the research behavior by modifying prompts in langgraph_ttd_dr/prompts.py.

๐Ÿšจ Troubleshooting

Common Issues

  1. API Key Errors

    โŒ Failed to create client: No API key found

    Solution: Check your .env file and ensure API keys are correctly set.

  2. Search Failures

    โŒ All search engines failed

    Solution: Verify search API keys or rely on DuckDuckGo (no key required).

  3. Long Processing Times

    • Reduce max_iterations or max_sources
    • Use faster models (e.g., gpt-4o-mini instead of gpt-4o)

Debug Mode

Enable debug logging:

export DEBUG=true
python chatbot.py

๐Ÿค Contributing

We welcome contributions! Please see our contributing guidelines for details on:

  • Code style and standards
  • Testing requirements
  • Pull request process
  • Issue reporting

๐Ÿ“„ License

This project is licensed under the MIT License - see the LICENSE file for details.

๐Ÿ™ Acknowledgments

  • Original Research: Based on "Deep Researcher with Test-Time Diffusion" by Han et al. (2025)
  • LangGraph: For the excellent workflow framework
  • OpenAI/Azure: For powerful language models
  • Search Providers: Tavily, DuckDuckGo, and Naver for search capabilities
  • Research Community: For insights into diffusion-based approaches
  • OptILLM Project: Referenced the deep research plugin for research engine architecture insights

๐Ÿ“ž Support

  • Issues: Report bugs and feature requests via GitHub Issues
  • Discussions: Join community discussions in GitHub Discussions
  • Documentation: See the /docs folder for detailed documentation

๐Ÿš€ Ready to conduct deep research? Start with python chatbot.py and explore the power of TTD-DR!

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages