| Chat Interface (Laptop) | Repository Analysis (Mobile) |
|---|---|
| ![]() | ![]() |
A sophisticated chatbot that allows you to query any GitHub repository by analyzing its source code. The system clones the repository, processes the code files, and creates vector embeddings for semantic search.
- Repository Analysis: Clone and analyze any public GitHub repository
- Semantic Search: Find relevant code sections using natural language queries
- AI-Powered Answers: Get explanations and insights about the codebase
- Vector Database: Efficient storage and retrieval of code embeddings
- Modern UI: Clean, futuristic interface for optimal user experience
- Backend: Python, Flask
- Vector Database: ChromaDB
- Embeddings: Google Embedding 001
- LLM: Gemini 2.5 Flash
- Frontend: HTML, CSS, JavaScript
- Repository Handling: GitPython
```mermaid
graph TD
    A[GitHub Repository URL] --> B[Clone Repository]
    B --> C[Extract Code Files]
    C --> D[Generate Embeddings]
    D --> E[Store in ChromaDB]
    E --> F[User Query]
    F --> G[Similarity Search]
    G --> H[Generate Response with Gemini 2.5 Flash]
    H --> I[Display Results]
```
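The similarity-search half of this pipeline can be sketched in plain Python. This is only an illustration: the bag-of-words `embed` below stands in for Google's `models/embedding-001`, and the linear scan in `search` stands in for ChromaDB's index.

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Lowercase and split on non-alphanumerics so identifiers like
    generate_embeddings(files) break into individual words."""
    return re.findall(r"[a-z0-9]+", text.lower())

def embed(text, vocab):
    """Toy bag-of-words vector; the real system calls Google's
    embedding API instead."""
    counts = Counter(tokenize(text))
    return [counts[w] for w in vocab]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def search(query, chunks, vocab):
    """Return the stored chunk most similar to the query -- the job
    ChromaDB performs with a proper index."""
    qv = embed(query, vocab)
    scored = [(cosine(qv, embed(c, vocab)), c) for c in chunks]
    return max(scored, key=lambda s: s[0])[1]

# Toy "repository" of code chunks.
chunks = [
    "def clone_repository(url): ...",
    "def generate_embeddings(files): ...",
    "def render_results(response): ...",
]
vocab = sorted({w for c in chunks for w in tokenize(c)})
print(search("how are embeddings generated", chunks, vocab))
```

A real query goes through exactly this shape: embed the question, compare it against every stored chunk, and hand the best matches to the LLM as context.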
- Python 3.9+
- Git
- Google Cloud API key (for Gemini and Embeddings)
- Clone the repository:

```bash
git clone https://github.com/yourusername/repo-chatbot.git
cd repo-chatbot
```

- Create and activate a virtual environment:

```bash
python -m venv venv
source venv/bin/activate  # On Windows use `venv\Scripts\activate`
```

- Install dependencies:

```bash
pip install -r requirements.txt
```

- Create a `.env` file and add your API keys:

```
GOOGLE_API_KEY=your_google_api_key
```

- Start the Flask server:

```bash
python app.py
```

- Open your browser to `http://localhost:5000`
- Enter a GitHub repository URL and click "Analyze"
- Once processed, you can ask questions about the repository
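Before embedding, the analyzer splits each file into overlapping windows so that relevant context is not cut off at a chunk boundary. A minimal sketch of that splitting, using the `CHUNK_SIZE` and `CHUNK_OVERLAP` values from the configuration section (the project's exact splitting logic may differ):

```python
# Values taken from config.py; the splitting strategy itself is an
# illustrative assumption, not the project's exact implementation.
CHUNK_SIZE = 1000
CHUNK_OVERLAP = 200

def chunk_text(text, size=CHUNK_SIZE, overlap=CHUNK_OVERLAP):
    """Split text into windows of `size` characters, each starting
    `size - overlap` characters after the previous one, so adjacent
    chunks share `overlap` characters of context."""
    if size <= overlap:
        raise ValueError("size must exceed overlap")
    step = size - overlap
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]
```

With the defaults, a 1,500-character file becomes two chunks whose last/first 200 characters coincide.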
- `POST /analyze`: Submit a GitHub repository for analysis
- `POST /chat`: Submit a query about the analyzed repository
- `GET /status`: Check processing status
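The endpoints above can be exercised from a small client. This sketch uses only the standard library; the JSON field names (`repo_url`, `question`) are assumptions, so check the Flask handlers in `app.py` for the exact payload shape.

```python
import json
import urllib.request

BASE = "http://localhost:5000"

def build_analyze(repo_url):
    """Build the POST /analyze request ("repo_url" field is assumed)."""
    data = json.dumps({"repo_url": repo_url}).encode()
    return urllib.request.Request(
        f"{BASE}/analyze", data=data,
        headers={"Content-Type": "application/json"}, method="POST")

def build_chat(question):
    """Build the POST /chat request ("question" field is assumed)."""
    data = json.dumps({"question": question}).encode()
    return urllib.request.Request(
        f"{BASE}/chat", data=data,
        headers={"Content-Type": "application/json"}, method="POST")

# To actually send a request once the server is running:
#     urllib.request.urlopen(build_analyze("https://github.com/user/repo"))
```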
Modify `config.py` for these settings:

```python
# Chunking parameters
CHUNK_SIZE = 1000
CHUNK_OVERLAP = 200

# Database settings
PERSIST_DIRECTORY = "db"
COLLECTION_NAME = "code_embeddings"

# Model settings
EMBEDDING_MODEL = "models/embedding-001"
LLM_MODEL = "gemini-2.5-flash"
```

To contribute to the project:
- Fork the repository
- Create a new branch (`git checkout -b feature-branch`)
- Commit your changes (`git commit -am 'Add new feature'`)
- Push to the branch (`git push origin feature-branch`)
- Create a new Pull Request
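Contributed features should come with tests. A self-contained sketch of the pytest shape used under `tests/` (the `parse_repo_name` helper is hypothetical and defined inline only so the example runs on its own; real tests would import from the application code):

```python
# Example file shape: tests/test_utils.py

def parse_repo_name(url):
    """Hypothetical helper: extract "owner/name" from a GitHub URL,
    dropping any trailing slash or .git suffix."""
    path = url.rstrip("/").split("github.com/")[-1]
    return path[:-4] if path.endswith(".git") else path

def test_parse_repo_name():
    assert parse_repo_name("https://github.com/pallets/flask") == "pallets/flask"
    assert parse_repo_name("https://github.com/pallets/flask.git") == "pallets/flask"
```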
Run the test suite with:
```bash
python -m pytest tests/
```

This project is licensed under the MIT License - see the LICENSE file for details.
- Google for the Gemini models and embeddings
- ChromaDB team for the vector database
- LangChain for the LLM integration framework
For issues or questions, please open an issue on GitHub or contact jasjeev99@gmail.com


