Welcome to Agentic RAG, an intelligent retrieval system that doesn’t just fetch data—it thinks before retrieving. Unlike traditional RAG, which passively pulls from a vector database, Agentic RAG dynamically decides whether to:
✅ Retrieve stored knowledge (Qdrant)
✅ Perform a real-time web search (Tavily)
This prevents outdated responses, reduces hallucinations, and improves efficiency for LLM-powered applications.
- 🔍 Hybrid Retrieval – Chooses between stored embeddings & live web search dynamically.
- 🧠 Decision-Making AI Agent – Knows when to query Qdrant and when to search the web.
- ⚡ Faster & Smarter – Reduces unnecessary API calls, optimizing response time & accuracy.
- 🛠️ Prevents AI Hallucinations – Avoids guessing by fetching fresh data when needed.
- 📓 Hands-on Colab Notebook – Try it out without setup hassles.
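The routing decision above can be illustrated with a toy heuristic router. This is purely illustrative: the actual agent delegates the choice to an LLM, and the keyword list below is an assumption of this sketch, not part of the repo.

```python
def choose_tool(query: str) -> str:
    """Toy router: pick a retrieval strategy for a query.

    Illustrative only -- the real agent lets an LLM make this call.
    """
    # Words that usually signal time-sensitive information.
    fresh_signals = {"latest", "today", "current", "news", "score", "price"}
    if any(word in query.lower() for word in fresh_signals):
        return "web_search"   # fresh data -> Tavily
    return "vector_store"     # stored knowledge -> Qdrant

print(choose_tool("What is Agentic RAG?"))          # vector_store
print(choose_tool("Who won the latest NBA game?"))  # web_search
```

The real system replaces the keyword set with an LLM prompt, but the control flow — classify the query, then dispatch to one of two retrievers — is the same.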
pip install --upgrade langchain langchain_community pypdf langchain-huggingface google-generativeai qdrant-client tavily-python
git clone https://github.com/devjothish/AgenticRAG.git
cd AgenticRAG
To run Agentic RAG, you need API keys for Qdrant (vector search) and Tavily (web search).
import os
os.environ["QDRANT_API_KEY"] = "your-qdrant-api-key"
os.environ["TAVILY_API_KEY"] = "your-tavily-api-key"
Qdrant stores documents in vector format for fast similarity-based retrieval.
from qdrant_client import QdrantClient
qdrant = QdrantClient(
    url="https://your-qdrant-instance.com",
    api_key=os.getenv("QDRANT_API_KEY"),
)
The agent chooses between vector search (Qdrant) and live web search (Tavily).
from langchain.tools import Tool
# Qdrant vector search tool (assumes `vectorstore` is a LangChain vector store
# built on the Qdrant client above)
vector_search_tool = Tool(
    name="VectorStore",
    func=lambda query: vectorstore.similarity_search(query, k=3),
    description="Use this tool when the query is related to stored documents.",
)
# Tavily web search tool (`web_search` wraps the Tavily API)
web_search_tool = Tool(
    name="WebSearch",
    func=web_search,
    description="Use this tool when the query requires fresh or real-time information.",
)
tools = [vector_search_tool, web_search_tool]
The agent decides dynamically whether to retrieve from Qdrant or use Tavily for fresh information.
from langchain.agents import AgentExecutor
agent_executor = AgentExecutor(
    agent=chain,  # the LLM-powered decision-maker (e.g., a tool-calling agent runnable)
    tools=tools,  # vector search & web search tools
    handle_parsing_errors=True,
    verbose=True,
)
query1 = "What is Agentic RAG?"
response1 = agent_executor.invoke({"input": query1})
print(response1)
query2 = "Who won the latest NBA game?"
response2 = agent_executor.invoke({"input": query2})
print(response2)
🔹 If the query matches stored knowledge → The agent retrieves from Qdrant.
🔹 If the query requires real-time data → The agent searches the web using Tavily.
🟢 Try the hands-on Colab Notebook → Click Here
👨‍💻 Jothiswaran Arumugam
🔗 GitHub: github.com/devjothish
💬 Twitter: @devjothish