An intelligent agentic workflow that analyzes Google Scholar profiles, discovers research trends, recommends papers and conferences, and generates comprehensive personalized reports.
- Overview
- Key Features
- Architecture
- Getting Started
- Environment Configuration
- Usage
- File Descriptions
- How It Works
- Troubleshooting
- Future Enhancements
The Research Recommender AI Agent automates the tedious process of staying up to date with research by:
- Analyzing your Google Scholar profile to extract research interests, citation metrics, and collaboration networks
- Discovering latest research news from trusted sources in your domains
- Recommending recent papers published by other researchers in your field
- Finding upcoming conferences (IEEE, ACM, and academic) relevant to your work
- Generating AI-powered insights including collaboration opportunities and emerging trends
- Creating professional reports in DOCX/PDF format with all findings
- Providing an interactive chat interface to explore recommendations
Perfect for researchers, PhD students, and academics who want to:
- Stay updated with minimal effort
- Discover collaboration opportunities
- Find relevant conferences to submit papers
- Track emerging trends in their field
- LangGraph-based state machine for reliable multi-step execution
- Error-resilient pipeline that continues even if individual steps fail
- Caching to avoid redundant API calls (3600s TTL)
- Extracts research interests from profile and publication titles
- Computes citation metrics (total citations)
- Identifies collaborators with affiliations
- Lists all publications sorted by year
- Fetches latest news from research domains using Tavily Search
- Discovers recent papers (last 4 years), excluding the author's own work
- Groups papers by research topic for easy browsing
- Finds upcoming conferences (IEEE, ACM, academic)
- Filters by research interests and past conference history
- Provides CFP links, dates, locations, and countdown timers
- Collaboration suggestions based on research interests and network
- Trend analysis identifying emerging methodologies and hot topics
- Uses Google Gemini 2.0 Flash for intelligent summarization
- Generates DOCX reports with embedded hyperlinks
- Optional PDF export (Windows only, requires Word)
- Markdown-to-DOCX conversion with proper formatting
- Includes all tabs: Profile, News, Papers, Conferences, Insights
- Streamlit-based web interface for conversational exploration
- Real-time data fetching with progress indicators
- Tabbed interface for organized information viewing
- Email reports directly from the UI
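The paper-recommendation bullets above (recent papers from the last 4 years, excluding the author's own work, grouped by topic) can be illustrated with a minimal sketch. The function name `recommend_papers`, the dictionary keys (`title`, `year`, `authors`, `topic`), and the exact boundary of the 4-year window are assumptions made for illustration, not the project's actual code.

```python
from collections import defaultdict
from datetime import date


def recommend_papers(papers, author_name, today=None, window_years=4):
    """Keep recent papers not written by the profile owner, grouped by topic.

    Hypothetical helper: each paper is a dict with "title", "year",
    "authors" (list of names), and "topic" keys.
    """
    today = today or date.today()
    # Assumption: "last 4 years" means the current year and the 4 before it.
    earliest_year = today.year - window_years
    author_key = author_name.strip().lower()

    grouped = defaultdict(list)
    for paper in papers:
        if paper["year"] < earliest_year:
            continue  # older than the recency window
        authors = [name.strip().lower() for name in paper["authors"]]
        if author_key in authors:
            continue  # the researcher's own work
        grouped[paper["topic"]].append(paper)

    # Newest first within each topic, for easier browsing.
    for topic_papers in grouped.values():
        topic_papers.sort(key=lambda p: p["year"], reverse=True)
    return dict(grouped)
```

Matching authors by lowercased name is a simplification; a real implementation would likely need to handle initials and name variants.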
┌─────────────────────────────────────────────────────────────┐
│ USER INPUT │
│ (Google Scholar URL or Name) │
└─────────────────────┬───────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────┐
│ LANGGRAPH WORKFLOW │
│ ┌──────────────────────────────────────────────────────┐ │
│ │ 1. Fetch Profile (SerpAPI) │ │
│ │ ├─ Extract interests from papers (Gemini) │ │
│ │ ├─ Get citation metrics │ │
│ │ ├─ Extract collaborators │ │
│ │ └─ Get all publications │ │
│ └──────────────────────────────────────────────────────┘ │
│ ▼ │
│ ┌──────────────────────────────────────────────────────┐ │
│ │ 2. Fetch News (Tavily Search) │ │
│ │ ├─ Search for latest research news │ │
│ │ └─ Summarize with Gemini │ │
│ └──────────────────────────────────────────────────────┘ │
│ ▼ │
│ ┌──────────────────────────────────────────────────────┐ │
│ │ 3. Fetch Papers (SerpAPI Google Scholar) │ │
│ │ ├─ Search by research topics │ │
│ │ ├─ Filter by year (last 4 years) │ │
│ │ ├─ Exclude author's papers │ │
│ │ └─ Create tabular summaries (Gemini) │ │
│ └──────────────────────────────────────────────────────┘ │
│ ▼ │
│ ┌──────────────────────────────────────────────────────┐ │
│ │ 4. Fetch Conferences (Multi-source) │ │
│ │ ├─ Search IEEE conferences (SerpAPI + Tavily) │ │
│ │ ├─ Search academic conferences (Tavily + Gemini) │ │
│ │ ├─ Extract dates and locations │ │
│ │ └─ Filter by relevance and timeline │ │
│ └──────────────────────────────────────────────────────┘ │
│ ▼ │
│ ┌──────────────────────────────────────────────────────┐ │
│ │ 5. Generate Insights (Gemini) │ │
│ │ ├─ Collaboration suggestions │ │
│ │ └─ Research trends analysis │ │
│ └──────────────────────────────────────────────────────┘ │
└─────────────────────┬───────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────┐
│ REPORT GENERATION │
│ ┌──────────────────────────────────────────────────────┐ │
│ │ - Parse Markdown to DOCX │ │
│ │ - Add hyperlinks, tables, formatting │ │
│ │ - Optional PDF export (via docx2pdf) │ │
│ │ - Email delivery support │ │
│ └──────────────────────────────────────────────────────┘ │
└─────────────────────┬───────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────┐
│ OUTPUT DELIVERY │
│ CLI: File paths printed │
│ Streamlit: Download buttons + Email option │
└─────────────────────────────────────────────────────────────┘
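The diagram above describes a sequence of steps in which a failure in one step does not stop the rest. The following is a minimal plain-Python sketch of that idea, not the LangGraph implementation itself. The names `run_pipeline`, `fetch_profile`, `fetch_news`, and `fetch_papers` are hypothetical stand-ins, and the shared state is assumed to be a simple dictionary.

```python
from typing import Any, Callable, Dict, List, Tuple

State = Dict[str, Any]
Step = Tuple[str, Callable[[State], State]]


def run_pipeline(steps: List[Step], state: State) -> State:
    """Run steps in order; record failures instead of aborting the run."""
    state = dict(state)
    state["errors"] = dict(state.get("errors", {}))
    for name, step in steps:
        try:
            # Each step returns only the keys it wants to add or update.
            state.update(step(state))
        except Exception as exc:  # broad on purpose: one bad step must not stop the run
            state["errors"][name] = str(exc)
    return state


# Hypothetical stand-ins for the real fetchers shown in the diagram.
def fetch_profile(state: State) -> State:
    return {"profile": {"name": state["query"], "interests": ["graph learning"]}}


def fetch_news(state: State) -> State:
    raise TimeoutError("news search timed out")  # simulated failure


def fetch_papers(state: State) -> State:
    topics = state.get("profile", {}).get("interests", [])
    return {"papers": {topic: [] for topic in topics}}


STEPS: List[Step] = [
    ("profile", fetch_profile),
    ("news", fetch_news),
    ("papers", fetch_papers),
]

if __name__ == "__main__":
    final = run_pipeline(STEPS, {"query": "Jane Doe"})
    print(final["errors"])
```

Later steps read earlier results with `.get(...)` and a default, so they still run when an upstream step has failed and left its key missing.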
Technology Stack:
- LangGraph: Workflow orchestration
- Google Gemini 2.0 Flash: LLM for summarization and analysis
- SerpAPI: Google Scholar scraping
- Tavily: Web search for news and conferences
- python-docx: Report generation
- Streamlit: Web UI
- BeautifulSoup: HTML parsing for conference scraping
- Python 3.10 or higher
- API Keys (see Environment Configuration)
- Clone the repository:
    git clone <repository-url>
    cd Research-Recommender-AIagent
- Create a virtual environment:
    python -m venv .venv
- Activate the virtual environment:
  - Windows (PowerShell):
      .venv\Scripts\Activate.ps1
  - Windows (CMD):
      .venv\Scripts\activate.bat
  - Linux/Mac:
      source .venv/bin/activate
- Install dependencies:
    pip install -r requirements.txt
- Set up environment variables (see next section)
Create a .env file in the project root:
# Required API Keys
GOOGLE_API_KEY=your_gemini_api_key_here
SERPAPI_API_KEY=your_serpapi_key_here
TAVILY_API_KEY=your_tavily_key_here
# Optional: Model Configuration
GEMINI_MODEL=gemini-2.0-flash-exp
# Optional: Cache Settings
CACHE_ENABLED=true
CACHE_TTL=3600
# Optional: Performance
MAX_CONCURRENT=5
REQUEST_TIMEOUT=30
# Optional: Email Configuration (for Streamlit email feature)
EMAIL_FROM=your_email@gmail.com
EMAIL_PASSWORD=your_app_password
SMTP_SERVER=smtp.gmail.com
SMTP_PORT=587
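As a rough illustration of how these variables might be consumed, the sketch below reads them from the process environment and uses `CACHE_TTL` to drive a small in-memory cache. It assumes the .env file has already been loaded into the environment (for example with python-dotenv). The names `Settings`, `load_settings`, and `TTLCache` are hypothetical, and the fallback defaults simply mirror the sample values above rather than confirmed project defaults.

```python
import os
import time
from dataclasses import dataclass
from typing import Any, Callable, Dict, Mapping, Optional, Tuple

REQUIRED_KEYS = ("GOOGLE_API_KEY", "SERPAPI_API_KEY", "TAVILY_API_KEY")


@dataclass(frozen=True)
class Settings:
    gemini_model: str
    cache_enabled: bool
    cache_ttl: int
    max_concurrent: int
    request_timeout: int


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Validate required keys and parse optional values with fallbacks."""
    env = os.environ if env is None else env
    missing = [key for key in REQUIRED_KEYS if not env.get(key)]
    if missing:
        raise RuntimeError("Missing required keys: " + ", ".join(missing))
    return Settings(
        gemini_model=env.get("GEMINI_MODEL", "gemini-2.0-flash-exp"),
        cache_enabled=env.get("CACHE_ENABLED", "true").strip().lower() == "true",
        cache_ttl=int(env.get("CACHE_TTL", "3600")),
        max_concurrent=int(env.get("MAX_CONCURRENT", "5")),
        request_timeout=int(env.get("REQUEST_TIMEOUT", "30")),
    )


class TTLCache:
    """In-memory cache whose entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds: int, clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._entries: Dict[str, Tuple[float, Any]] = {}

    def get(self, key: str) -> Any:
        entry = self._entries.get(key)
        if entry is None:
            return None
        stored_at, value = entry
        if self._clock() - stored_at >= self._ttl:
            del self._entries[key]  # expired: drop it and report a miss
            return None
        return value

    def set(self, key: str, value: Any) -> None:
        self._entries[key] = (self._clock(), value)
```

With the sample value `CACHE_TTL=3600`, a cached API response would be reused for one hour before the next call triggers a fresh request.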
- Google Gemini API Key (free tier available)
  - Visit: https://ai.google.dev/
  - Create a project and enable the Gemini API
  - Generate an API key
- SerpAPI Key (100 searches/month free)
  - Visit: https://serpapi.com/
  - Sign up and get your API key
- Tavily API Key (1000 searches/month free)
  - Visit: https://tavily.com/
  - Sign up for a free account
  - Get the API key from the dashboard
- Email Configuration (optional, for report delivery)
  - For Gmail: generate an App Password at https://myaccount.google.com/apppasswords
  - Do not use your regular Gmail password
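To show how the email settings could fit together, here is a minimal sketch using only Python's standard library. The function names `build_report_email` and `send_report`, the subject line, and the body text are hypothetical; the server and port defaults mirror the sample `SMTP_SERVER` and `SMTP_PORT` values, and the project's own email code may differ.

```python
import smtplib
from email.message import EmailMessage

# MIME type for .docx files, split into maintype and subtype.
DOCX_MIME = (
    "application",
    "vnd.openxmlformats-officedocument.wordprocessingml.document",
)


def build_report_email(sender, recipient, report_name, report_bytes):
    """Build a message with the DOCX report attached (no network access)."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = "Your research report"
    msg.set_content("The generated report is attached.")
    msg.add_attachment(
        report_bytes,
        maintype=DOCX_MIME[0],
        subtype=DOCX_MIME[1],
        filename=report_name,
    )
    return msg


def send_report(msg, password, server="smtp.gmail.com", port=587):
    """Send over SMTP with STARTTLS; for Gmail, password is the App Password."""
    with smtplib.SMTP(server, port, timeout=30) as smtp:
        smtp.starttls()  # upgrade the connection before sending credentials
        smtp.login(msg["From"], password)
        smtp.send_message(msg)
```

Keeping message construction separate from sending makes it possible to inspect or test the message without contacting a mail server.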
Basic usage:
python -m src.cli --scholar-url "https://scholar.google.com/citations?user=XXXX"
With all options:
python -m src.cli \
--scholar-url "https://scholar.google.com/citations?user=XXXX" \
--out "output/my_report.docx" \
--pdf \
--max-papers 15 \
--max-conferences 20 \
--include-conferences \
--verbose
CLI Arguments:
- --scholar-url: Google Scholar URL or author name (required)
- --out: Output file path (default: out/report.docx)
- --pdf: Also export as PDF
- --max-papers: Papers per topic (default: 20)
- --max-conferences: Max conferences to fetch (default: 10)
- --include-conferences: Include conferences in report
- --verbose: Show detailed progress
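For readers who want to see how the documented flags and defaults map onto a parser, the following is a sketch using Python's standard `argparse` module. The function name `build_parser` and the help strings are illustrative assumptions; the actual `src.cli` module may be organized differently.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Declare the documented CLI flags with their documented defaults."""
    parser = argparse.ArgumentParser(
        prog="python -m src.cli",
        description="Generate a research recommendation report.",
    )
    parser.add_argument("--scholar-url", required=True,
                        help="Google Scholar URL or author name")
    parser.add_argument("--out", default="out/report.docx",
                        help="Output file path")
    parser.add_argument("--pdf", action="store_true",
                        help="Also export as PDF")
    parser.add_argument("--max-papers", type=int, default=20,
                        help="Papers per topic")
    parser.add_argument("--max-conferences", type=int, default=10,
                        help="Max conferences to fetch")
    parser.add_argument("--include-conferences", action="store_true",
                        help="Include conferences in report")
    parser.add_argument("--verbose", action="store_true",
                        help="Show detailed progress")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args)
```

`argparse` converts dashes in flag names to underscores, so `--max-papers` is read as `args.max_papers`.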
- Start the Streamlit app:
    streamlit run app/chat.py
- Open the browser (usually http://localhost:8501)
- Enter the Google Scholar URL in the sidebar
- Click "Fetch Profile" to load researcher data
- Use the buttons to fetch news, papers, and conferences
- Chat with the agent in the main tab
- Generate the report and download or email it
UI Features:
- Profile Tab: View complete researcher profile, metrics, collaborators
- News Tab: Latest research news summaries
- Papers Tab: Recommended papers organized by topic
- Conferences Tab: Upcoming conference listings
- Insights Tab: AI-generated collaboration and trend analysis
- Chat Tab: Ask questions about research, get recommendations
- Support for arXiv, PubMed, ACM Digital Library
- Multi-author comparison reports
- Citation network visualization
- Automated email scheduling (weekly/monthly digests)
- Export to LaTeX/HTML
- Integration with reference managers (Zotero, Mendeley)