A Retrieval-Augmented Generation (RAG) system with a web-search fallback: it reflects on its own knowledge state to decide when to consult external resources.
Key features:
- Self-Reflection: The system evaluates whether its document knowledge is sufficient to answer a query before deciding to use web search
- Document Processing: Upload and process PDF documents for knowledge extraction
- Hybrid Search: Seamless integration between document-based knowledge and web search
- Source Attribution: Transparent display of information sources for each response
- Multi-Model Support: Compatible with various LLMs (Gemini Pro, GPT-4, Claude)
- Customizable Parameters: Adjust chunking, embedding, and other parameters to optimize performance
- Modern UI: Clean, intuitive interface with dark mode support
The system follows a modular architecture (minimal code sketches of each layer follow this list):
1. Document Processing
   - Load PDF documents using PyPDFLoader
   - Split text into chunks using RecursiveCharacterTextSplitter
   - Generate embeddings using VertexAI or OpenAI
   - Store the embeddings in a FAISS vector database
2. Query Processing Logic
   - Retrieve relevant documents from the vector store
   - Evaluate whether the retrieved information is sufficient to answer the query
   - If sufficient, generate an answer from the documents
   - If insufficient, fall back to web search
   - Combine document and web results when needed
3. Self-Reflection Layer
   - LLM-based evaluation of knowledge sufficiency
   - Decision-making about when to use external resources
   - Transparent reasoning about information sources
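For concreteness, the document-processing layer maps onto standard LangChain components. A minimal sketch, assuming OpenAI embeddings and illustrative chunk sizes (exact module paths vary across LangChain versions; the project's real implementation lives in `src/core/rag_system.py`):

```python
from langchain_community.document_loaders import PyPDFLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_openai import OpenAIEmbeddings
from langchain_community.vectorstores import FAISS

def build_vector_store(pdf_path: str) -> FAISS:
    # Load the PDF as one Document per page.
    pages = PyPDFLoader(pdf_path).load()
    # Split pages into overlapping chunks; the sizes here are illustrative.
    splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
    chunks = splitter.split_documents(pages)
    # Embed each chunk and index the vectors in FAISS.
    return FAISS.from_documents(chunks, OpenAIEmbeddings())
```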
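The query-processing layer is, in essence, retrieve, reflect, and optionally fall back. A hedged sketch of that control flow; `answer_query`, its prompt, and the injected `web_search` callable are illustrative names rather than the repo's actual API, and `evaluate_sufficiency` is sketched in the next block:

```python
def answer_query(query: str, vector_store, llm, web_search) -> dict:
    # Retrieve the chunks most similar to the query.
    docs = vector_store.similarity_search(query, k=4)
    context = "\n\n".join(doc.page_content for doc in docs)

    sources = ["documents"]
    if not evaluate_sufficiency(llm, query, context):
        # Document knowledge is insufficient: merge in web results.
        context += "\n\nWeb results:\n" + web_search(query)
        sources.append("web")

    answer = llm.invoke(
        f"Answer the question using only this context.\n\n"
        f"Context:\n{context}\n\nQuestion: {query}"
    ).content
    return {"answer": answer, "sources": sources}
```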
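The self-reflection layer reduces to a single yes/no judgment from the LLM. An illustrative sufficiency check (the prompt wording is an assumption; the project's actual prompt lives in `src/core/rag_system.py`):

```python
def evaluate_sufficiency(llm, query: str, context: str) -> bool:
    # Ask the model to judge, not answer: a plain YES/NO keeps parsing trivial.
    verdict = llm.invoke(
        "Judge whether the context below contains enough information to "
        "fully answer the question. Reply with exactly YES or NO.\n\n"
        f"Context:\n{context}\n\nQuestion: {query}"
    )
    return verdict.content.strip().upper().startswith("YES")
```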
Project layout:

```
self-reflective-rag/
├── app.py                    # Main Streamlit application
├── src/
│   ├── core/                 # Core functionality
│   │   └── rag_system.py     # RAG implementation with fallback
│   ├── utils/                # Utility functions
│   │   └── file_utils.py     # File handling utilities
│   ├── components/           # UI components
│   │   └── ui_components.py  # Reusable UI elements
│   └── config/               # Configuration
│       └── settings.py       # Environment and settings management
├── data/                     # Data directory
│   └── sample/               # Sample documents
├── static/                   # Static files
├── docs/                     # Documentation
├── requirements.txt          # Project dependencies
└── .env.example              # Example environment variables
```
To install and run:

1. Clone the repository:

   ```bash
   git clone https://github.com/yourusername/self-reflective-rag.git
   cd self-reflective-rag
   ```

2. Create a virtual environment:

   ```bash
   python -m venv venv
   source venv/bin/activate  # On Windows, use: venv\Scripts\activate
   ```

3. Install the dependencies:

   ```bash
   pip install -r requirements.txt
   ```

4. Set up environment variables:

   ```bash
   cp .env.example .env
   ```

   Edit `.env` with your API keys and configuration.
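   For illustration, the `.env` might contain entries like these; the variable names below are assumptions, and `.env.example` is the authoritative list:

   ```
   # Hypothetical variable names -- check .env.example for the real ones
   SERPER_API_KEY=your-serper-key
   GOOGLE_API_KEY=your-google-key
   # Optional:
   OPENAI_API_KEY=your-openai-key
   ANTHROPIC_API_KEY=your-anthropic-key
   ```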
5. Start the Streamlit application:

   ```bash
   streamlit run app.py
   ```

Then, to use the app:

1. Open your browser and go to http://localhost:8501
2. Enter your API keys in the sidebar settings
3. Upload PDF documents in the Documents tab
4. Ask questions in the Chat tab
API keys:

- Serper API Key: Required for web search (a minimal request sketch follows this list)
- Google API Key: Required for Gemini models
- OpenAI API Key: Optional, for GPT models and embeddings
- Anthropic API Key: Optional, for Claude models
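Since the Serper key drives the web-search fallback, here is a minimal sketch of the kind of call involved, usable as the `web_search` callable in the query-flow sketch above. The endpoint and request shape follow Serper's public JSON API; error handling and result ranking are omitted:

```python
import os
import requests

def web_search(query: str, num_results: int = 5) -> str:
    # POST the query to Serper's search endpoint with the API key header.
    resp = requests.post(
        "https://google.serper.dev/search",
        headers={"X-API-KEY": os.environ["SERPER_API_KEY"]},
        json={"q": query, "num": num_results},
    )
    resp.raise_for_status()
    # Flatten the organic results into a single context string.
    return "\n".join(
        f"{hit.get('title', '')}: {hit.get('snippet', '')}"
        for hit in resp.json().get("organic", [])
    )
```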
Known limitations:

- The system currently supports only PDF documents
- Long documents may require significant processing time
- API rate limits may apply depending on your API key plan
Roadmap:

- Add support for more document formats (DOCX, TXT, etc.)
- Implement conversation memory
- Add an API endpoint for headless operation
- Improve document chunking strategies
- Add support for custom RAG pipelines
This project is licensed under the MIT License - see the LICENSE file for details.