A sophisticated chat application that combines multiple AI agents for flight search, news retrieval, and company financial document analysis. Built with LangGraph and Streamlit, it leverages advanced RAG (Retrieval-Augmented Generation) techniques.
- Flight Search: Real-time flight information using Amadeus API
- News Search: Current events and news via Tavily Search API
- Company Financials: Advanced RAG-based search through company documents
The system implements a sophisticated multi-stage retrieval pipeline:
- Vector Store:
  - Chroma as the vector database
  - Persistent storage for embeddings
  - Automatic PDF processing and chunking
  - Document metadata preservation for source tracking
- Ensemble Retrieval (see the sketch below):
  - BM25 (keyword-based search) - 20%
  - Vector similarity search - 40%
  - MMR (Maximal Marginal Relevance) - 40%
- Contextual Compression:
  - Cohere re-ranking for result refinement
  - Top-N filtering for most relevant results
  - Citation preservation and source tracking
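A minimal sketch of how such a weighted ensemble could be assembled with LangChain's `EnsembleRetriever`; the `chunks` and `vectorstore` variables are assumed to come from the document-processing step described later:

```python
from langchain.retrievers import EnsembleRetriever
from langchain_community.retrievers import BM25Retriever

# Assumes `chunks` (split documents) and `vectorstore` (Chroma) already exist
bm25 = BM25Retriever.from_documents(chunks)
similarity = vectorstore.as_retriever(search_kwargs={"k": 5})
mmr = vectorstore.as_retriever(search_type="mmr", search_kwargs={"k": 5})

# Weights mirror the 20/40/40 split described above
ensemble = EnsembleRetriever(
    retrievers=[bm25, similarity, mmr],
    weights=[0.2, 0.4, 0.4],
)
```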
- Python 3.12+
- Poetry for dependency management
- Required API Keys:
  - OpenAI API
  - Tavily Search API
  - Cohere API
  - Amadeus API
- Clone the repository
- Install dependencies:

  ```bash
  poetry install
  ```

- Copy `.env.example` to `.env` and fill in your API keys:

  ```
  OPENAI_API_KEY=your_key
  TAVILY_API_KEY=your_key
  CO_API_KEY=your_key
  AMADEUS_CLIENT_ID=your_id
  AMADEUS_CLIENT_SECRET=your_secret
  ```
```
├── company_pdfs/          # Place your company PDFs here
├── chroma_db/             # Persistent vector store
├── tools/
│   ├── company_search.py
│   ├── flight_search.py
│   └── tavily_search.py
└── chat.py                # Main application
```
- Start the application:

  ```bash
  poetry run streamlit run chat.py
  ```

- Example queries:
  - Flight Search: "Show me flights from Bangalore to Tokyo"
  - News Search: "What's happening in AI technology today?"
  - Company Search: "What was Rakuten's revenue in Q3 2023?"
- Recursive character text splitting with 1000-character chunks (see the sketch below)
- 200-character overlap between chunks
- Metadata preservation for source tracking
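A sketch of this chunking step, assuming LangChain's PDF loader and text splitter (the file name is illustrative):

```python
from langchain_community.document_loaders import PyPDFLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter

pages = PyPDFLoader("company_pdfs/report.pdf").load()  # illustrative file name
splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
chunks = splitter.split_documents(pages)  # source/page metadata carries over per chunk
```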
- Initial retrieval using an ensemble method:
  - BM25 for keyword matching
  - Dense vector similarity search
  - MMR for diversity in results
- Re-ranking (see the sketch below):
  - Cohere's rerank-english-v3.0 model
  - Context-aware compression
  - Top-5 most relevant results
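One way this stage could look with LangChain's Cohere integration; `ensemble` is the retriever from the earlier sketch:

```python
from langchain.retrievers import ContextualCompressionRetriever
from langchain_cohere import CohereRerank

compressor = CohereRerank(model="rerank-english-v3.0", top_n=5)  # needs CO_API_KEY
retriever = ContextualCompressionRetriever(
    base_compressor=compressor,  # re-ranks results and keeps the top 5
    base_retriever=ensemble,     # ensemble retriever from the earlier sketch
)
```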
- Automatic inline citations [1], [2], etc.
- Source metadata tracking (file name, page number)
- Expandable source references in UI
- Fork the repository
- Create your feature branch
- Commit your changes
- Push to the branch
- Create a new Pull Request
MIT License
- Nodes are individual processing units in the application flow, similar to functions in a workflow.
- Each node (like company_search_node, flight_search_node) handles a specific task and maintains its own state.
- Example: The company search node processes financial document queries while the flight search node handles travel requests.
- Tools are specialized classes that wrap external APIs or services, making them easily usable within nodes.
- Each tool (CompanySearchTool, FlightSearchTool, TavilySearchTool) follows a standard interface with invoke() and _run() methods.
```python
from langchain_core.tools import BaseTool

class FlightSearchTool(BaseTool):
    name: str = "flight_search"
    description: str = "Look up real-time flight information via the Amadeus API."

    def _run(self, input_str: str) -> str:
        # Process flight search logic (call the Amadeus API here)
        ...
```
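Because `BaseTool` provides `invoke()`, a tool defined this way can be called directly; the query string here is illustrative:

```python
tool = FlightSearchTool()
print(tool.invoke("Show me flights from Bangalore to Tokyo"))
```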
- The application uses multiple nodes that work together, each handling different types of queries.
- State is passed between nodes using a dictionary structure that includes:
  - Messages: Chat history and responses
  - Sources: Retrieved document references
  - Other metadata needed across nodes
```python
def company_search_node(state: dict) -> dict:
    # ... node logic produces past_messages, new_message, current_sources ...
    return {
        "messages": past_messages + [new_message],
        "sources": current_sources,
    }
```
- The application uses conditional routing to direct queries to appropriate nodes based on content.
- Routing decisions are made using:
  - Message content analysis
  - User intent detection
  - Tool availability
```python
def should_use_company_search(state: dict) -> bool:
    # Route to company search if the query is about financial data
    return "financial" in state["query"].lower()
- User input → Router
- Router → appropriate tool/node
- Node processes the request
- Response formatted with citations
- State updated and returned to the user
This architecture allows for:
- Modular addition of new capabilities
- Clear separation of concerns
- Maintainable state management
- Flexible routing based on user needs
- A sophisticated keyword matching algorithm that improves upon traditional word-frequency matching. Think of it as "Ctrl+F" on steroids.
- It's smart enough to understand that if a word appears 10 times in a short document, it's probably more relevant than if it appears 10 times in a very long document. It also prevents common words from dominating the results.
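A toy illustration of that scoring behavior using the rank_bm25 package (not necessarily what this project uses internally, but the same idea):

```python
from rank_bm25 import BM25Okapi

corpus = [
    "flight schedules and airline routes for travelers",
    "quarterly revenue and financial results for investors",
]
bm25 = BM25Okapi([doc.split() for doc in corpus])
print(bm25.get_scores("revenue financial".split()))  # second document scores higher
```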
- Converts words and sentences into numerical representations (vectors) where similar meanings are close to each other in mathematical space.
- For example, "automobile" and "car" would be close together in this space, allowing the system to find relevant content even when exact keywords don't match.
- Ensures search results aren't repetitive by balancing relevance with diversity.
- If you have 10 very similar documents that match a query, MMR will pick the best one and then look for other relevant but different perspectives, rather than showing you the same information 10 times.
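With a LangChain vector store such as the project's Chroma instance, MMR is available as a retriever option (the variable name is illustrative):

```python
# Fetch 20 candidates, then keep the 5 most relevant-but-diverse ones
retriever = vectorstore.as_retriever(
    search_type="mmr",
    search_kwargs={"k": 5, "fetch_k": 20},
)
```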
- A specialized model that takes the initial search results and re-orders them by actually reading and understanding both the query and the content.
- Similar to having a human assistant who reads through search results and puts the most relevant ones at the top, but automated.
- Takes the re-ranked results and keeps only the most relevant ones (typically the top 5).
- This is like having an executive summary instead of a full report, ensuring the AI only works with the most important information.
- Keeps track of where each piece of information came from in your documents, including file names and page numbers.
- Works like a reference manager in a Word document, automatically maintaining source information and allowing for proper attribution in responses.
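In LangChain terms, each chunk is a `Document` whose metadata carries this information; the file name and page number here are illustrative:

```python
from langchain_core.documents import Document

doc = Document(
    page_content="Revenue for the quarter grew year over year...",
    metadata={"source": "company_pdfs/rakuten_q3_2023.pdf", "page": 12},
)
```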