⚠️ Note: The demo video may show an earlier version of the platform due to continuous development and feature updates. The current system includes significantly more advanced AI capabilities and architectural improvements.
Step Mentor is an AI-native educational platform built for JEE aspirants and STEM students. It combines transformer-based multimodal models with retrieval-augmented generation (RAG) to deliver personalized, explainable, and adaptive learning experiences.
Democratizing access to world-class AI-powered education through transparent, personalized, and contextually aware learning systems.
graph TB
subgraph "User Interface Layer"
UI[Web Application<br/>Flask + Jinja2 + Bootstrap]
Mobile[Mobile App<br/>React Native]
API[REST API Gateway<br/>Flask-RESTful + OpenAPI]
end
subgraph "AI/ML Orchestration Layer"
LLM[Multimodal LLM Engine<br/>Gemini Pro + Vision]
RAG[RAG Pipeline<br/>LangChain + LlamaIndex]
VectorDB[Vector Database<br/>Pinecone + Weaviate]
Agents[Multi-Agent System<br/>CrewAI + AutoGen]
end
subgraph "Core Services"
QGen[Question Generation<br/>Controlled Synthesis]
Solver[Solution Engine<br/>CoT + ToT Reasoning]
Tutor[AI Tutor<br/>RLHF + Persona Modeling]
Analytics[Learning Analytics<br/>Recommendation Engine]
end
subgraph "Data Layer"
KG[Knowledge Graph<br/>Neo4j + RDF]
Content[Content Repository<br/>MongoDB + S3]
UserData[User Analytics<br/>PostgreSQL + Redis]
end
UI --> API
Mobile --> API
API --> LLM
API --> RAG
LLM --> QGen
LLM --> Solver
LLM --> Tutor
LLM --> Agents
RAG --> VectorDB
Agents --> QGen
Agents --> Solver
Agents --> Tutor
Analytics --> KG
Analytics --> Content
Analytics --> UserData
flowchart TB
subgraph Input_Processing
A[Multimodal Input: Text / Image / Audio / LaTeX]
B[Input Tokenization: SentencePiece + Vision Transformer]
C[Semantic Preprocessing: Named Entity Recognition and OCR]
end
subgraph Core_AI_Engine
D[Prompt Engineering: Chain-of-Thought and Few-Shot]
E[Multimodal LLM: Parameter-Efficient Fine-Tuning]
F[Attention Mechanisms: Multi-Head Self-Attention]
end
subgraph Knowledge_Retrieval
G[Semantic Search: Dense Passage Retrieval]
H[Vector Similarity: Cosine and Euclidean Distance]
I[Context Ranking: Cross-Encoder Reranking]
end
subgraph Solution_Generation
J[Reasoning Engine: Tree-of-Thoughts and Beam Search]
K[Solution Synthesis: Controlled Generation]
L[Verification Layer: Self-Consistency Checking]
end
subgraph Output_Enhancement
M[Explainability: LIME, SHAP, Attention Maps]
N[Personalization: User Modeling and Adaptive Content]
O[Quality Assurance: Automated Fact-Checking]
end
%% Main pipeline flow
A --> B --> C --> D
D --> E
E --> F
%% Knowledge retrieval branch
D --> G
G --> H --> I
%% Solution generation flow
F --> J
I --> J
J --> K --> L
%% Output enhancement
K --> M
L --> N
M --> O
N --> O
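
The "Verification Layer: Self-Consistency Checking" node in the pipeline above can be illustrated with a small majority-vote routine. This is a minimal sketch, not the project's implementation: the function name `self_consistent_answer`, the `min_agreement` threshold, and the sample answers are all assumptions made for illustration.

```python
from collections import Counter


def self_consistent_answer(candidates, min_agreement=0.5):
    """Return (answer, agreement) by majority vote over sampled final answers.

    `candidates` holds the final answers extracted from independently sampled
    reasoning chains. The top answer is accepted only if its share of the
    votes reaches `min_agreement`; otherwise None is returned so the caller
    can resample or escalate. (Hypothetical helper, not from the project.)
    """
    if not candidates:
        return None, 0.0
    # Normalize so trivially different spellings count as the same vote.
    normalized = [c.strip().lower() for c in candidates]
    answer, votes = Counter(normalized).most_common(1)[0]
    agreement = votes / len(normalized)
    return (answer if agreement >= min_agreement else None), agreement


# Five hypothetical answers sampled for the same physics problem.
samples = ["9.8 m/s^2", "9.8 m/s^2", "4.9 m/s^2", "9.8 m/s^2 ", "19.6 m/s^2"]
answer, agreement = self_consistent_answer(samples)
```

Three of the five samples agree here, so the vote clears the default threshold. When no answer reaches the threshold, returning `None` lets the caller decide whether to sample more chains.
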
graph LR
subgraph "Knowledge Sources"
A[Textbooks<br/>NCERT + Reference]
B[Question Banks<br/>JEE + NEET + Board]
C[Formula Database<br/>Mathematical Expressions]
D[Concept Maps<br/>Topic Hierarchies]
end
subgraph "Preprocessing Pipeline"
E[Document Chunking<br/>Semantic Segmentation]
F[Embedding Generation<br/>Sentence-BERT + MPNet]
G[Metadata Extraction<br/>Topic + Difficulty + Source]
end
subgraph "Vector Store"
H[Dense Vectors<br/>768-dim Embeddings]
I[Sparse Vectors<br/>BM25 + TF-IDF]
J[Hybrid Search<br/>Dense + Sparse Fusion]
end
subgraph "Retrieval Engine"
K[Query Understanding<br/>Intent Classification]
L[Similarity Matching<br/>Approximate Nearest Neighbor]
M[Context Fusion<br/>Reciprocal Rank Fusion]
end
subgraph "Generation Pipeline"
N[Prompt Construction<br/>Template + Context]
O[LLM Inference<br/>Conditional Generation]
P[Post-processing<br/>Format + Validate]
end
A --> E
B --> E
C --> E
D --> E
E --> F --> G
F --> H
G --> I
H --> J
I --> J
J --> K --> L --> M
M --> N --> O --> P
style J fill:#ff6b6b
style L fill:#4ecdc4
style O fill:#45b7d1
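
The "Hybrid Search" and "Context Fusion: Reciprocal Rank Fusion" nodes above merge a dense ranking and a sparse ranking into one list. The sketch below shows the standard RRF scoring rule in plain Python. The document IDs are made up, and `k=60` is the constant commonly used for RRF, not a value taken from this project.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into a single ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with ranks starting at 1. Documents missing from a list simply receive
    no contribution from it.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties are broken by ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical results from a dense (embedding) and a sparse (BM25) retriever.
dense_hits = ["ch3_kinematics", "ch5_work_energy", "ch4_newton_laws"]
sparse_hits = ["ch4_newton_laws", "ch3_kinematics", "ch9_rotation"]
fused = reciprocal_rank_fusion([dense_hits, sparse_hits])
```

Because RRF uses only ranks, it needs no calibration between cosine similarities and BM25 scores, which is one reason it is a common choice for dense plus sparse fusion.
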
graph TB
subgraph "API Gateway"
GW[Kong Gateway<br/>Rate Limiting + Auth]
end
subgraph "Authentication Service"
AUTH[OAuth 2.0 + JWT<br/>Keycloak]
end
subgraph "Core AI Services"
QG[Question Generation<br/>Python + FastAPI]
PS[Problem Solver<br/>Python + Celery]
CM[Content Manager<br/>Node.js + Express]
LA[Learning Analytics<br/>Python + Pandas]
end
subgraph "ML Services"
MODEL[Model Serving<br/>TensorFlow Serving + ONNX]
TRAIN[Training Pipeline<br/>Kubeflow + MLflow]
MONITOR[Model Monitoring<br/>Prometheus + Grafana]
end
subgraph "Data Services"
CACHE[Redis Cluster<br/>Session + Cache]
DB[PostgreSQL<br/>User + Analytics]
NOSQL[MongoDB<br/>Content + Metadata]
SEARCH[Elasticsearch<br/>Full-text Search]
end
subgraph "External Services"
CDN[CloudFront CDN<br/>Static Assets]
STORAGE[S3 Compatible<br/>File Storage]
QUEUE[RabbitMQ<br/>Message Queue]
end
GW --> AUTH
GW --> QG
GW --> PS
GW --> CM
GW --> LA
QG --> MODEL
PS --> MODEL
CM --> NOSQL
LA --> DB
MODEL --> TRAIN
MODEL --> MONITOR
QG --> CACHE
PS --> QUEUE
CM --> STORAGE
LA --> SEARCH
style GW fill:#ff7675
style MODEL fill:#74b9ff
style DB fill:#00b894
style CACHE fill:#fdcb6e
- Multimodal Foundation Models
  - Google Gemini Pro for advanced reasoning and multimodal understanding
  - Gemini Vision for image and diagram interpretation
  - Custom fine-tuned models via LoRA/QLoRA for domain-specific tasks
  - Integration with open-source models (Llama 2, Mistral)
- Multi-Agent Architecture (In Development)
  - Specialized AI agents for different educational domains
  - Collaborative problem-solving through agent coordination
  - Teacher Agent for pedagogy and explanation generation
  - Student Agent for personalized learning path optimization
  - Evaluator Agent for assessment and feedback
- Parameter-Efficient Fine-Tuning
  - Low-Rank Adaptation (LoRA) for domain-specific tasks
  - Prefix Tuning for prompt optimization
  - Adapter layers for modular knowledge injection
  - Quantization-Aware Training (QAT) for efficient inference
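
To make the LoRA item concrete, here is a dependency-free sketch of the low-rank update applied to a single linear layer. It illustrates the general technique only. The matrices, the `alpha` value, and the helper names are assumptions, and a real setup would rely on a tensor library rather than Python lists.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def lora_forward(W, A, B, x, alpha=1.0):
    """Compute y = W x + (alpha / r) * B (A x).

    W (d_out x d_in) is the frozen pretrained weight. A (r x d_in) and
    B (d_out x r) are the small trainable matrices, with rank r << d_in.
    """
    r = len(A)
    base = matvec(W, x)
    low_rank = matvec(B, matvec(A, x))
    scale = alpha / r
    return [b + scale * u for b, u in zip(base, low_rank)]


def trainable_params(d_out, d_in, r):
    """Parameters trained by LoRA versus full fine-tuning of one layer."""
    return {"lora": r * (d_in + d_out), "full": d_out * d_in}


W = [[1.0, 0.0], [0.0, 1.0]]  # frozen weight (identity, for readability)
A = [[1.0, 1.0]]              # rank r = 1
B = [[0.5], [-0.5]]
y = lora_forward(W, A, B, [2.0, 3.0])
```

With `B` set to all zeros, which is the usual initialization, the layer reproduces the frozen model exactly, so training starts from the pretrained behavior. `trainable_params` shows why the approach is cheap: for a 768 x 768 layer at rank 8, LoRA trains 12,288 parameters instead of 589,824.
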
- Retrieval-Augmented Generation (RAG)
  - Dense Passage Retrieval with FAISS/Pinecone
  - Hybrid search combining dense and sparse vectors
  - Contextual reranking with Cross-Encoder models
  - Real-time knowledge base updates
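
Dense retrieval ranks stored passages by how close their embeddings are to the query embedding. The sketch below uses brute-force cosine similarity as a stand-in for an approximate nearest-neighbor index such as FAISS or Pinecone. The three-dimensional vectors and passage IDs are toy values, not real embeddings from the platform.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec, index, k=2):
    """Return the IDs of the k passages most similar to the query.

    `index` maps passage ID -> embedding. This scans every entry, which an
    ANN index avoids; it is meant only to show the ranking criterion.
    """
    scored = [(cosine_similarity(query_vec, vec), doc_id)
              for doc_id, vec in index.items()]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for _, doc_id in scored[:k]]


# Toy index: in practice the vectors would come from an embedding model.
index = {
    "projectile_motion": [0.9, 0.1, 0.0],
    "organic_nomenclature": [0.0, 0.2, 0.9],
    "circular_motion": [0.7, 0.3, 0.1],
}
hits = top_k([1.0, 0.2, 0.0], index, k=2)
```

The retrieved IDs would then be passed to a reranker or placed directly into the prompt context.
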
- Tree-of-Thoughts (ToT) Reasoning
  - Multi-step problem decomposition
  - Parallel solution path exploration
  - Self-correction and validation mechanisms
  - Confidence scoring for solution quality
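
Tree-of-Thoughts explores several partial solutions in parallel and keeps only the most promising ones at each depth. The sketch below shows that control loop as a beam search over a toy arithmetic puzzle. In a real system, `expand` and `score` would be LLM calls that propose and evaluate reasoning steps. Here they are plain functions, and every name is an assumption made for illustration.

```python
def tree_of_thoughts(root, expand, score, is_goal, beam_width=2, max_depth=4):
    """Beam search over partial solutions ("thoughts").

    expand(state)  -> list of (step_label, next_state) candidates
    score(state)   -> higher means more promising
    is_goal(state) -> True when the state solves the problem
    Returns the list of step labels leading to a goal, or None.
    """
    if is_goal(root):
        return []
    frontier = [(root, [])]
    for _ in range(max_depth):
        candidates = [(nxt, path + [step])
                      for state, path in frontier
                      for step, nxt in expand(state)]
        for state, path in candidates:
            if is_goal(state):
                return path
        # Keep only the best `beam_width` partial solutions (stable sort).
        candidates.sort(key=lambda c: score(c[0]), reverse=True)
        frontier = candidates[:beam_width]
        if not frontier:
            break
    return None


# Toy puzzle: reach 16 from 2 using "+3" and "*2" steps.
def expand(value):
    return [("+3", value + 3), ("*2", value * 2)]


path = tree_of_thoughts(2, expand,
                        score=lambda v: -abs(16 - v),
                        is_goal=lambda v: v == 16)
```

The beam width trades cost for coverage. With the same puzzle and a target of 14, a beam of 2 prunes the only successful branch and returns `None`, while a beam of 4 finds `["*2", "+3", "*2"]`.
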
- Chain-of-Thought (CoT) Prompting
  - Step-by-step reasoning visualization
  - Intermediate step validation
  - Error detection and correction
  - Pedagogical explanation generation
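
A few-shot Chain-of-Thought prompt shows the model worked examples with numbered steps and then leaves the first step of the new question open. The helper below is a minimal sketch of that layout. The instruction wording, the `build_cot_prompt` name, and the example problem are assumptions rather than the prompts the platform actually uses.

```python
def build_cot_prompt(question, examples):
    """Assemble a few-shot Chain-of-Thought prompt.

    `examples` is a list of (question, reasoning_steps, answer) tuples.
    Each worked example is rendered with numbered steps so the model is
    nudged to produce the same step-by-step structure for `question`.
    """
    parts = ["Solve the problem step by step, then state the final answer."]
    for ex_question, steps, answer in examples:
        numbered = "\n".join(
            f"Step {i}: {step}" for i, step in enumerate(steps, start=1)
        )
        parts.append(f"Question: {ex_question}\n{numbered}\nAnswer: {answer}")
    # Leave the first step open for the model to continue.
    parts.append(f"Question: {question}\nStep 1:")
    return "\n\n".join(parts)


examples = [(
    "What is 2 + 3 * 4?",
    ["Multiply first: 3 * 4 = 12.", "Add: 2 + 12 = 14."],
    "14",
)]
prompt = build_cot_prompt("What is (2 + 3) * 4?", examples)
```

Keeping the steps on separate `Step N:` lines also makes the model's output easy to split, which helps when each intermediate step needs to be validated on its own.
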
- Multimodal Understanding
  - Mathematical expression recognition (OCR + LaTeX)
  - Diagram and graph interpretation
  - Handwritten text recognition
  - Visual-textual context integration
- Adaptive Learning Algorithms
  - Bayesian Knowledge Tracing
  - Item Response Theory (IRT) modeling
  - Deep Knowledge Tracing with LSTM/Transformer
  - Personalized difficulty adjustment
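
Bayesian Knowledge Tracing keeps a per-skill probability that the student has mastered the skill and updates it after every answer. The sketch below implements the standard four-parameter update. The default slip, guess, and learn values are illustrative assumptions, not parameters fitted by this project.

```python
def bkt_update(p_know, correct, p_slip=0.1, p_guess=0.2, p_learn=0.15):
    """One Bayesian Knowledge Tracing step for a single skill.

    p_know  : prior probability the skill is already mastered
    p_slip  : probability of a wrong answer despite mastery
    p_guess : probability of a right answer without mastery
    p_learn : probability of acquiring the skill during this attempt
    Returns the updated mastery probability.
    """
    if correct:
        evidence = p_know * (1 - p_slip)
        total = evidence + (1 - p_know) * p_guess
    else:
        evidence = p_know * p_slip
        total = evidence + (1 - p_know) * (1 - p_guess)
    posterior = evidence / total  # Bayes' rule given the observed answer
    # Account for the chance that the student learned from this attempt.
    return posterior + (1 - posterior) * p_learn


after_correct = bkt_update(0.3, correct=True)
after_wrong = bkt_update(0.3, correct=False)
```

Starting from a prior of 0.3, a correct answer raises the estimate to about 0.71 and a wrong answer lowers it to about 0.19. A mastery threshold on this value can then drive difficulty adjustment.
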
- Student Modeling
  - Cognitive load assessment
  - Learning style identification
  - Knowledge gap analysis
  - Optimal challenge zone targeting
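
One way to combine Item Response Theory with optimal challenge zone targeting is to pick the question whose predicted success probability sits closest to a chosen target. The sketch below uses the two-parameter logistic (2PL) model. The 0.7 target, the item bank, and the function names are assumptions made for illustration.

```python
import math


def p_correct(theta, difficulty, discrimination=1.0):
    """2PL IRT model: probability that a student of ability `theta`
    answers an item with the given difficulty and discrimination."""
    return 1.0 / (1.0 + math.exp(-discrimination * (theta - difficulty)))


def pick_item(theta, items, target=0.7):
    """Choose the item whose predicted success rate is closest to `target`.

    Each item is a dict with difficulty "b" and optional discrimination "a".
    The target encodes the desired challenge level (hypothetical default).
    """
    return min(
        items,
        key=lambda item: abs(
            p_correct(theta, item["b"], item.get("a", 1.0)) - target
        ),
    )


# Hypothetical item bank on the usual IRT ability scale.
items = [
    {"id": "easy", "b": -1.0},
    {"id": "medium", "b": 0.0},
    {"id": "hard", "b": 1.5},
]
chosen = pick_item(0.8, items)
```

For a student with ability 0.8, the predicted success rates are roughly 0.86, 0.69, and 0.33, so the medium item is the closest match to the 0.7 target.
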
- Dynamic Question Synthesis: Context-aware question generation using controlled text generation and constraint satisfaction
- Adaptive Difficulty Scaling: Real-time difficulty adjustment based on student performance metrics and learning trajectories
- Multimodal Content Creation: Automated generation of visual aids, diagrams, and interactive elements
- Transparent Reasoning Chains: Step-by-step solution breakdowns with confidence intervals and uncertainty quantification
- Attention Visualization: Interactive attention maps showing model focus areas during problem-solving
- Counterfactual Explanations: "What-if" scenarios to demonstrate solution sensitivity to input variations
- Context-Aware Dialogue Management: Multi-turn conversation handling with memory and persona consistency
- Socratic Questioning: Guided discovery learning through strategic questioning techniques
- Emotional Intelligence: Sentiment analysis and empathetic response generation
- Real-time Performance Tracking: Comprehensive learning analytics dashboard with predictive modeling
- Mastery-Based Progression: Competency-based advancement with granular skill assessment
- Predictive Intervention: Early warning systems for at-risk students using ML anomaly detection
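
As a rough illustration of predictive intervention, the sketch below flags a student whose latest score falls far below their own recent baseline. A z-score rule is used here as a simple statistical stand-in for the ML anomaly detection mentioned above. The threshold and the sample scores are assumptions.

```python
from statistics import mean, stdev


def flag_at_risk(history, recent, z_threshold=-2.0):
    """Return True when `recent` is anomalously low for this student.

    `history` is the student's earlier scores. The recent score is compared
    against the student's own mean and standard deviation, so a consistently
    low scorer is not flagged merely for being below the class average.
    """
    if len(history) < 2:
        return False  # not enough data to estimate a baseline
    sigma = stdev(history)
    if sigma == 0:
        return recent < history[0]  # any drop from a flat baseline
    z = (recent - mean(history)) / sigma
    return z <= z_threshold


history = [72, 75, 70, 74, 73]
sudden_drop = flag_at_risk(history, 55)
normal_dip = flag_at_risk(history, 71)
```

A production system would likely use richer features such as time on task, streaks, and topic coverage, but the same pattern applies: model each student's baseline, then alert on large deviations from it.
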
- Multi-Agent AI System
  - Collaborative AI agents with specialized roles and expertise
  - Dynamic agent orchestration for complex problem-solving
  - Inter-agent communication and knowledge sharing protocols
- Advanced Neural Architectures
  - Mixture of Experts (MoE) models for specialized domains
  - Retrieval-Augmented Generation with dynamic knowledge updates
  - Constitutional AI for safety and alignment
- Distributed Learning Infrastructure
  - Federated learning across multiple educational institutions
  - Edge computing for reduced latency and improved privacy
  - Blockchain-based credential verification and achievement tracking
- Next-Generation Personalization
  - Neuro-symbolic reasoning for explainable AI decisions
  - Causal inference for understanding learning pathways
  - Quantum-inspired optimization for resource allocation
Flask 3.0 + Jinja2 + Bootstrap 5
├── Template Engine: Jinja2 + Custom Macros
├── CSS Framework: Bootstrap 5 + Custom SCSS
├── JavaScript: Vanilla JS + Alpine.js + HTMX
├── Math Rendering: KaTeX + MathJax
├── Visualization: Chart.js + D3.js
└── Testing: Pytest + Selenium
Python 3.11 + Flask + SQLAlchemy
├── Web Framework: Flask + Flask-RESTful + Blueprints
├── Database ORM: SQLAlchemy + Alembic
├── Task Queue: Celery + Redis + RabbitMQ
├── Database: PostgreSQL + MongoDB + Neo4j
├── Caching: Redis Cluster + Memcached
├── Search: Elasticsearch + OpenSearch
└── Monitoring: Prometheus + Grafana + Sentry
PyTorch 2.0 + Transformers + LangChain
├── LLM Integration: Google Gemini Pro API + Vertex AI
├── Model Serving: TensorFlow Serving + ONNX Runtime
├── Multi-Agent Framework: CrewAI + AutoGen + LangGraph
├── Training: Kubeflow + MLflow + Weights & Biases
├── Vector Store: Pinecone + Weaviate + FAISS
├── NLP: spaCy + NLTK + Hugging Face Transformers
├── Computer Vision: OpenCV + Pillow + Tesseract
└── Optimization: Ray + Optuna + Hyperopt
Kubernetes + Docker + Helm
├── Cloud Platform: AWS/GCP/Azure
├── CI/CD: GitHub Actions + ArgoCD
├── Service Mesh: Istio + Envoy
├── Monitoring: ELK Stack + Jaeger
├── Security: HashiCorp Vault + SOPS
└── Infrastructure: Terraform + Ansible
Apache Airflow + dbt + Great Expectations
├── Data Ingestion: Apache Kafka + Kinesis
├── Processing: Apache Spark + Pandas + Polars
├── Storage: Delta Lake + Apache Iceberg
├── Warehousing: Snowflake + BigQuery
└── Lineage: DataHub + Apache Atlas
# System Requirements
Python 3.11+
Node.js 18+
Docker 24+
Kubernetes 1.28+
# 1. Clone the repository
git clone https://github.com/SaurabMishra12/Step-Mentor0.git
cd Step-Mentor0
# 2. Create virtual environment
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
# 3. Install dependencies
pip install -r requirements.txt
# 4. Set up environment variables
cp .env.example .env
# Add your Gemini API key and other configurations
# 5. Initialize databases
python init_db.py
# 6. Start the Flask development server
python app.py
# Using Docker Compose
docker-compose -f docker-compose.prod.yml up -d
# Using Kubernetes
kubectl apply -f k8s/
helm install step-mentor ./helm-chart
- Core RAG pipeline implementation
- Basic multimodal support
- Web application MVP
- User authentication system
- Advanced reasoning capabilities with Gemini Pro
- Personalization engine
- Multi-agent system foundation
- Mobile application
- Advanced analytics dashboard
- Complete multi-agent architecture deployment
- Microservices architecture migration
- Multi-language support
- Real-time collaboration features
- Advanced ML ops pipeline
- Federated learning implementation
- Quantum-inspired algorithms
- AR/VR integration
- Blockchain-based credentials
- Neuro-symbolic reasoning
For contributions, write to me at saurab23@iisertvm.ac.in
This project is licensed under the MIT License - see the LICENSE file for details.
- Research Papers: Attention Is All You Need, RAG, Chain-of-Thought Prompting
- Open Source Libraries: Transformers, LangChain, FastAPI, React
- Educational Partners: NCERT, JEE Preparation Institutes
Empowering the next generation of learners with cutting-edge AI technology