As my final-year project for my Bachelor of Science in Computer Science at the University of Central Asia, I developed BarsAI, a desktop application aimed at making human-computer interaction more natural and intuitive. BarsAI brings conversational AI, Retrieval-Augmented Generation (RAG), and gesture recognition together in a single, easy-to-use application, enabling intelligent conversations, document processing, and gesture control. The chatbot runs entirely offline using a GGUF-quantized build of the Llama 2 7B Chat model, while document-based queries use Google’s Gemini Pro model, which requires an internet connection to process documents and retrieve relevant information. This project reflects my passion for AI and my desire to create a tool that blends intelligence with convenience.
🤖 Conversational AI - Built with the Llama-2-7B-Chat-GGUF model, the assistant can engage in intelligent and context-aware conversations.
📄 Document Query Processing - Powered by Google AI’s Gemini Pro model, the assistant can process and respond to queries based on uploaded documents (PDF, DOCX, CSV, Excel).
✋ Gesture Control - Users can control their computers using hand gestures.
📡 Offline Functionality - The AI Assistant can operate partially offline, ensuring accessibility even without an internet connection.
💻 User-Friendly Interface - Developed using PyQt5, the interface is designed for simplicity and ease of use.
- Python 3.10 or above
- CPU (no dedicated GPU required; inference runs on the CPU)
- 16 GB RAM (with less RAM, download a smaller quantized version of Llama-2-7B)
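Since the project targets Python 3.10 or above, checking the running interpreter before installing dependencies can save a half-finished setup. The helper below is an illustrative sketch, not part of BarsAI:

```python
import sys

def check_python(min_version=(3, 10)):
    """Return True if the running interpreter meets the minimum version."""
    return sys.version_info[:2] >= min_version

# Example: warn early instead of failing halfway through `pip install`.
if not check_python():
    print(f"Warning: BarsAI expects Python 3.10+, found {sys.version.split()[0]}")
```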
To clone the repository, run the following commands:
```
git clone https://github.com/Shahrom-S/BarsAI.git
cd BarsAI
```
To install the necessary dependencies, run the following command:
```
pip install -r requirements.txt --default-timeout=100
```
Follow the steps below to configure your environment:
- Download Llama-2-7B-Chat-GGUF Model:
- Set Up API Keys for Document Processing:
```python
from huggingface_hub import hf_hub_download

# Note: this repo hosts GGML files; GGUF builds of the same model are
# published under TheBloke/Llama-2-7B-Chat-GGUF.
model_name_or_path = "TheBloke/Llama-2-7B-Chat-GGML"
model_basename = "llama-2-7b-chat.ggmlv3.q5_1.bin"

# Download the model once and get its local path. By default it is stored in
# the Hugging Face cache; pass cache_dir="path/to/project/models" to keep it
# inside the project directory instead.
model_path = hf_hub_download(repo_id=model_name_or_path, filename=model_basename)
```
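Once the quantized model is downloaded, it can be served locally with a llama-cpp-style runtime. The sketch below is a hypothetical illustration — it assumes the llama-cpp-python package and a model file compatible with it, neither of which this README pins down — and includes a small helper for the Llama 2 chat prompt template:

```python
def build_llama2_prompt(user_msg, system_msg="You are a helpful assistant."):
    """Format a single-turn prompt using the Llama 2 chat template."""
    return f"[INST] <<SYS>>\n{system_msg}\n<</SYS>>\n\n{user_msg} [/INST]"

def ask(model_path, question, max_tokens=128):
    """Run one completion against a local quantized Llama 2 model."""
    # Imported lazily so the prompt helper works without llama-cpp-python installed.
    from llama_cpp import Llama

    llm = Llama(model_path=model_path, n_ctx=2048)
    result = llm(build_llama2_prompt(question), max_tokens=max_tokens)
    return result["choices"][0]["text"]
```

With the model on disk, `ask(model_path, "What is RAG?")` would return the model's reply as plain text.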
To enable the document processing feature, obtain an API key from Google AI for the Gemini Pro model. Once you have the API key, add it to a `.env` file in the project directory:

```
GOOGLE_API_KEY="Your_API_key_here"
```
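BarsAI presumably reads this key at startup. For illustration, a stdlib-only loader (a sketch, not the project's actual code) can populate `os.environ` from simple `KEY="value"` lines without extra packages:

```python
import os

def load_env(path=".env"):
    """Load KEY=value pairs from a .env file into os.environ."""
    with open(path, encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            # Skip blanks, comments, and malformed lines.
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            os.environ[key.strip()] = value.strip().strip('"').strip("'")
```

After calling `load_env()`, the key is available as `os.environ["GOOGLE_API_KEY"]`.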
Once the environment and dependencies are set up, run the following command (inside your virtual environment, if you use one) to start the application:

```
python interface.py
```
- Full Offline Capability: Integrating more powerful open-source models such as Llama-3-8B, which could replace the Gemini Pro API for RAG and eliminate the dependency on cloud-based services.
- Adding Gestures: Adding more gestures to recognize and perform a wider range of actions.
- Enhanced UI/UX: Further improving the interface for better usability and aesthetic appeal.