Version-1
KIRA is an intelligent, multimodal educational assistant built with Flask, LangChain, and advanced AI models, designed to provide precise and engaging learning support. Whether you're uploading PDFs for reference or seeking real-time answers through web scraping, it adapts dynamically, using Retrieval-Augmented Generation (RAG) and agentic capabilities to deliver tailored responses.
| Layer | Technologies Used |
|---|---|
| Frontend | Flask, HTML, CSS, JavaScript |
| AI Models | Gemini 1.5 Flash, Gemini 2.0 Flash, mxbai-embed-large:latest |
| Vector Datastore | FAISS (in-memory vector store) |
| Web Scraping | LangChain tools, BeautifulSoup4, SERP |
| RAG + Agentic Support | LangChain, GoogleGenerativeAI, webbrowser |
| NLP Processing | Ollama Embeddings, CharacterTextSplitter, NLTK |
| Voice Interaction | SpeechRecognition, pyttsx3 (TTS) |
- With PDFs: extracts content from uploaded PDFs, splits it into chunks, embeds the chunks into a FAISS index, and augments model outputs with the retrieved context (see the RAG sketch after this list).
- Without PDFs: scrapes the web in real time and curates links to answer queries dynamically.
- Differentiates between retrieval mode (document-based) and non-retrieval mode (web-based), keeping responses context-aware and relevant.
- Retrieves embedded chunks by cosine similarity and grounds Gemini's output in that retrieved context.
- Automates browsing using web scraping tools (a scraping sketch follows this list).
- Opens recommended learning links in the user's browser for detailed study.
- Listens to queries using speech recognition.
- Responds with text-to-speech, improving accessibility and engagement (see the voice-loop sketch below).
- Chains multiple models during RAG processing:
| Model | Purpose |
|---|---|
| shunya_llm | Generation for non-retrieved queries using user input + web-scraped content. |
| pratham_llm | Retrieval-based topic generation using the vector DB or document content. |
| dviteey_llm | Cross-verification of pratham_llm's output against retrieved data to reduce hallucinations and enhance clarity. |
- Offers curated links or document-based responses with no redundancy, focused only on what's essential for the user.
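For concreteness, here is a minimal sketch of the document-based (retrieval) path described above, assuming the `langchain-community` and `google-generativeai` packages are installed; the function names and prompt wording are illustrative, not the project's actual API:

```python
import os
import google.generativeai as genai
from langchain.text_splitter import CharacterTextSplitter
from langchain_community.embeddings import OllamaEmbeddings
from langchain_community.vectorstores import FAISS

genai.configure(api_key=os.environ["GEMINI_API_KEY"])

def build_store(pdf_text: str) -> FAISS:
    # Split the extracted PDF text into overlapping chunks, embed them with
    # Ollama, and index the vectors in an in-memory FAISS store.
    chunks = CharacterTextSplitter(chunk_size=1000, chunk_overlap=100).split_text(pdf_text)
    return FAISS.from_texts(chunks, OllamaEmbeddings(model="mxbai-embed-large"))

def ask_with_pdf(store: FAISS, query: str) -> str:
    # Retrieve the chunks most similar to the query and ground the answer in them.
    docs = store.similarity_search(query, k=3)
    context = "\n\n".join(d.page_content for d in docs)
    model = genai.GenerativeModel("gemini-1.5-flash")
    draft = model.generate_content(
        f"Answer using only this context:\n{context}\n\nQuestion: {query}"
    )
    # Cross-verify the draft against the same context (the dviteey_llm role)
    # to reduce hallucinations before returning the final answer.
    check = model.generate_content(
        f"Context:\n{context}\n\nVerify this answer against the context and "
        f"correct it if needed:\n{draft.text}"
    )
    return check.text
```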
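The web (non-retrieval) path can be sketched in the same spirit; here plain `requests` stands in for SERP-driven link discovery, which in the real app would use the SERPAPI key from `.env`:

```python
import webbrowser
import requests
from bs4 import BeautifulSoup

def scrape_page_text(url: str) -> str:
    # Fetch a page and reduce it to readable text for the LLM prompt.
    html = requests.get(url, timeout=10).text
    soup = BeautifulSoup(html, "html.parser")
    return soup.get_text(separator=" ", strip=True)

def open_recommended(links: list[str]) -> None:
    # Open each curated learning link in the user's default browser.
    for url in links:
        webbrowser.open(url)
```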
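And a minimal voice loop, assuming SpeechRecognition (with PyAudio for microphone access) and pyttsx3 are installed:

```python
import speech_recognition as sr
import pyttsx3

def listen() -> str:
    # Capture one utterance from the default microphone.
    recognizer = sr.Recognizer()
    with sr.Microphone() as source:
        audio = recognizer.listen(source)
    # recognize_google uses the library's free web recognizer by default.
    return recognizer.recognize_google(audio)

def speak(text: str) -> None:
    # Speak the response aloud with the offline pyttsx3 engine.
    engine = pyttsx3.init()
    engine.say(text)
    engine.runAndWait()
```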
```
AI-TUTOR/
├── aiFeatures/
│   └── python/
│       ├── ai_assistant.py       # Core logic to manage query routing and decision-making
│       ├── ai_response.py        # Handles AI model responses and augmentation
│       ├── rag_pipeline.py       # Implements the RAG (retrieval-augmented generation) process
│       ├── speech_to_text.py     # Converts voice input to text using speech recognition
│       ├── text_to_speech.py     # Converts text responses to speech using pyttsx3
│       ├── web_scraper_tool.py   # Automates scraping tools using BeautifulSoup and SERP
│       └── web_scraping.py       # Generalized web scraping logic for live content retrieval
│
├── data/
│   └── scrapings/
│       └── scraped_content.txt   # Temporarily stores content scraped from the web
│
├── testFrontend/
│   └── FlaskApp/
│       ├── static/
│       │   ├── script.js         # JavaScript for dynamic frontend interaction
│       │   └── style.css         # Custom styling for the frontend
│       └── templates/
│           └── index.html        # Main HTML template rendered by Flask
│
├── app.py                        # Entry-point Flask server script integrating backend and frontend
├── .env                          # Environment file for storing API keys and secrets
├── .gitignore                    # Git ignore rules
├── README.md                     # Project documentation
└── requirements.txt              # Python dependencies
```
```bash
git clone https://github.com/gupta-v/AI-Tutor.git
cd AI-Tutor
python -m venv env
source env/bin/activate  # or `env\Scripts\activate` on Windows
pip install -r requirements.txt
```
Create a `.env` file and add your API keys:

```
GEMINI_API_KEY=your_gemini_key
SERPAPI_KEY=your_serpapi_key
```
```bash
python app.py
```

Access the app at: http://localhost:5000
To use Ollama embeddings for document chunking and vector representation, follow these steps:
Download and install Ollama from the official site: https://ollama.com/download

After installation, ensure it is accessible from your terminal:

```bash
ollama --version
```
Pull a supported embedding model such as mxbai-embed-large (or any other compatible model):

```bash
ollama pull mxbai-embed-large
```
Ollama typically runs as a background service. If not, you can start it manually:

```bash
ollama serve
```
The application uses Ollama to embed PDF/text chunks with mxbai-embed-large (or any embedding model you configure). The embeddings are stored in the FAISS vector store and used for cosine-similarity-based retrieval. Ensure your `.env` or config file references the Ollama embedding model:

```
OLLAMA_MODEL=mxbai-embed-large
```
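As a sanity check, a short snippet like the following (a sketch, assuming `python-dotenv` and `langchain-community` are installed) shows how that setting can be wired into LangChain:

```python
import os
from dotenv import load_dotenv
from langchain_community.embeddings import OllamaEmbeddings

load_dotenv()  # reads .env, including OLLAMA_MODEL
embeddings = OllamaEmbeddings(model=os.getenv("OLLAMA_MODEL", "mxbai-embed-large"))

# Smoke test: embed one sentence and inspect the vector dimensionality.
vector = embeddings.embed_query("hello from AI Tutor")
print(len(vector))  # mxbai-embed-large yields 1024-dimensional vectors
```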
You're now ready to use Ollama with AI Tutor!
"Explain Newton's second law"
- Converts speech to text → checks for an uploaded PDF → fetches embedded context (if available) → responds via Gemini + TTS
"What is quantum entanglement?"
- No documents → web scraping triggered → top 3 results shown + voice explanation
Upload a PDF → ask "Summarize Chapter 2"
- FAISS retrieves the relevant PDF chunks → Gemini refines the answer using them → the output is spoken
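The walkthroughs above all reduce to one routing decision; the sketch below makes it explicit, with illustrative stubs standing in for the document-based and web-based answer paths:

```python
def answer_from_docs(store, query: str) -> str:
    # Stub for the FAISS-grounded path sketched earlier.
    return f"[doc-grounded answer to: {query}]"

def answer_from_web(query: str) -> str:
    # Stub for the live web-scraping path.
    return f"[web-scraped answer to: {query}]"

def route_query(query: str, store=None) -> str:
    # Retrieval mode when a PDF has been uploaded and indexed;
    # non-retrieval (web) mode otherwise.
    return answer_from_docs(store, query) if store is not None else answer_from_web(query)

print(route_query("What is quantum entanglement?"))  # no store -> web path
```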
- Add image-based query support (multimodal)
- Integrate quiz and flashcard generation from uploaded materials
- Secure backend in Golang/Python
- Scale and deploy using Docker + Firebase
- Developed by Vaibhav Gupta, Shweta Maurya, Shreya Pandey, Vartika Upadhyay
- Built using Google's Gemini API, FAISS, LangChain, and Ollama
- Open-source contributions are welcome!