This project uses LangChain, Ollama, and Chroma to answer questions about pizza restaurants based on reviews in `realistic_restaurant_reviews.csv`. It leverages CUDA, cuDNN, and PyTorch for GPU acceleration (tested on an RTX 3060 with 12 GB VRAM) and monitors GPU usage with `pynvml` in `Monitor_cuda.py`.
- Answers questions (e.g., “What’s the best pizza in town?”) using review data.
- GPU-accelerated with `llama3.2:latest` (2.0 GB) and `mxbai-embed-large:latest` (669 MB).
- Uses Chroma for vector search (top 5 reviews).
- Monitors VRAM with `pynvml` and `nvidia-smi`.
- Interactive CLI for questions.
- Hardware: NVIDIA GPU (e.g., RTX 3060, 12 GB VRAM), 16 GB RAM.
- Software: Windows 10/11 (tested), Python 3.10+, NVIDIA driver 566.36+, CUDA 12.6/12.7, Ollama.
- Dataset: `realistic_restaurant_reviews.csv` with `Title`, `Review`, `Rating`, and `Date` columns.
- Install NVIDIA Drivers and CUDA:
  - Get the NVIDIA driver from NVIDIA.
  - Verify with `nvidia-smi` (should report CUDA 12.6/12.7).
  - cuDNN is bundled with PyTorch.
- Install Python and a Virtual Environment:
  ```powershell
  python -m venv venv
  .\venv\Scripts\Activate.ps1   # Windows
  ```
- Install Dependencies:
  ```powershell
  pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu126
  pip install -r requirements.txt
  ```
  `requirements.txt`:
  ```
  langchain
  langchain-ollama
  langchain-chroma
  pandas
  pynvml
  ```
- Install Ollama:
  - Download from ollama.ai.
  - Pull the models:
    ```
    ollama pull llama3.2:latest
    ollama pull mxbai-embed-large:latest
    ```
  - Verify with `ollama list`, which should show:
    ```
    NAME                      ID            SIZE    MODIFIED
    llama3.2:latest           a----------5  2.0 GB  Recently
    mxbai-embed-large:latest  4----------7  669 MB  Recently
    ```
- Prepare the Dataset:
  - Place `realistic_restaurant_reviews.csv` in the project root.
  - Format:
    ```csv
    Title,Review,Rating,Date
    "Great Pizza","Crispy crust, fresh toppings!",5,"2023-10-01"
    ```
```
pizza-restaurant-review/
├── main.py                          # Runs Q&A system
├── vector.py                        # Handles embeddings and vector database
├── Monitor_cuda.py                  # Monitors GPU memory
├── requirements.txt                 # Python dependencies
├── realistic_restaurant_reviews.csv # Review dataset
├── chrome_langchain_db/             # Chroma database (auto-generated)
└── venv/                            # Virtual environment
```
- Data Loading (`vector.py`):
  - Reads the CSV, combines `Title` and `Review`, and creates `Document` objects with `rating` and `date` metadata.
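The loading step can be sketched with the standard library alone. The `Document` class below is a minimal stand-in for LangChain's `langchain_core.documents.Document`, so this snippet runs without LangChain installed; the project itself would import the real class.

```python
import csv
import io
from dataclasses import dataclass, field

# Minimal stand-in for langchain_core.documents.Document;
# vector.py would import the real class instead.
@dataclass
class Document:
    page_content: str
    metadata: dict = field(default_factory=dict)

def load_reviews(csv_file):
    """Combine Title and Review into page_content; keep Rating/Date as metadata."""
    return [
        Document(
            page_content=f"{row['Title']} {row['Review']}",
            metadata={"rating": row["Rating"], "date": row["Date"]},
        )
        for row in csv.DictReader(csv_file)
    ]

# In the project this would read realistic_restaurant_reviews.csv:
sample = io.StringIO(
    'Title,Review,Rating,Date\n'
    '"Great Pizza","Crispy crust, fresh toppings!",5,"2023-10-01"\n'
)
docs = load_reviews(sample)
print(docs[0].page_content)  # Great Pizza Crispy crust, fresh toppings!
```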
- Embedding (`vector.py`):
  - Uses `mxbai-embed-large:latest` (669 MB, 1024 dimensions) to embed each review.
  - Stores the embeddings in Chroma (`chrome_langchain_db`) and retrieves the top 5 reviews per query.
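Under the hood, top-5 retrieval ranks review embeddings by similarity to the question embedding. A toy standard-library sketch of top-k retrieval by cosine similarity (Chroma does this at scale; the 3-dimensional vectors here are illustrative, while real mxbai embeddings have 1024 dimensions):

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def top_k(query_vec, review_vecs, k=5):
    """Return indices of the k review embeddings most similar to the query."""
    ranked = sorted(
        range(len(review_vecs)),
        key=lambda i: cosine(query_vec, review_vecs[i]),
        reverse=True,
    )
    return ranked[:k]

# Toy 3-dimensional embeddings standing in for 1024-dim mxbai vectors:
reviews = [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0], [0.0, 1.0, 0.0]]
print(top_k([1.0, 0.0, 0.0], reviews, k=2))  # [0, 1]
```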
- Q&A (`main.py`):
  - Takes a user question, retrieves the relevant reviews, and uses `llama3.2:latest` (2.0 GB) to answer.
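The answering step amounts to filling the retrieved reviews and the question into a prompt for the LLM. A sketch of that assembly; the template wording and function names are illustrative, not necessarily those used in `main.py`:

```python
# Illustrative prompt template; the actual wording in main.py may differ.
TEMPLATE = """You are an expert on pizza restaurants.

Relevant reviews:
{reviews}

Question: {question}"""

def build_prompt(reviews, question):
    """Join retrieved review texts and fill in the template."""
    return TEMPLATE.format(reviews="\n\n".join(reviews), question=question)

prompt = build_prompt(
    ["Great Pizza: Crispy crust, fresh toppings!"],
    "What's the best pizza in town?",
)
# The prompt would then be sent to the model via langchain-ollama, e.g.
# OllamaLLM(model="llama3.2:latest").invoke(prompt)
print(prompt)
```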
- GPU Acceleration:
- Ollama uses PyTorch with CUDA/cuDNN, ~4-5 GB VRAM.
- Monitoring: `Monitor_cuda.py` uses `pynvml` for GPU stats; `nvidia-smi` shows `ollama.exe`/`python.exe` memory usage.
- Create Environment: `python -m venv venv`
- Activate Environment: `.\venv\Scripts\Activate.ps1`
- Run Q&A: `python main.py`
  - Ask questions (e.g., “What’s the best pizza in town?”); type `q` to quit.
  - Example:
    ```
    -------------------------------
    Ask your question (q to quit): whats the best pizza in town
    Based on reviews, [Pizza Place] has the best pizza for its crispy crust.
    ```
- Run GPU Monitor: `python Monitor_cuda.py`
  - Outputs:
    ```
    Total GPU memory: 12288.00 MB
    Free GPU memory: ~7292.00 MB
    Used GPU memory: ~4824.00 MB
    ```
- In-Script (`Monitor_cuda.py`):
  ```python
  import pynvml

  try:
      pynvml.nvmlInit()
      handle = pynvml.nvmlDeviceGetHandleByIndex(0)
      mem_info = pynvml.nvmlDeviceGetMemoryInfo(handle)
      print(f"Total GPU memory: {mem_info.total / 1024**2:.2f} MB")
      print(f"Free GPU memory: {mem_info.free / 1024**2:.2f} MB")
      print(f"Used GPU memory: {mem_info.used / 1024**2:.2f} MB")
  except pynvml.NVMLError as e:
      print(f"NVML Error: {e}")
  finally:
      # Guard the shutdown: if nvmlInit() failed above,
      # nvmlShutdown() itself raises an NVMLError.
      try:
          pynvml.nvmlShutdown()
      except pynvml.NVMLError:
          pass
  ```
- External (`nvidia-smi`):
  - Run `nvidia-smi` and look for `ollama.exe`/`python.exe` using ~4-5 GB VRAM.
  - Continuous monitoring:
    ```
    nvidia-smi --query --display=COMPUTE,MEMORY -l 2
    ```
- Python Packages (`requirements.txt`):
  - `langchain`
  - `langchain-ollama`
  - `langchain-chroma`
  - `pandas`
  - `pynvml`
- PyTorch with CUDA:
  ```
  pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu126
  ```
- Ollama Models:
  - `llama3.2:latest` (a----------5, 2.0 GB)
  - `mxbai-embed-large:latest` (4----------7, 669 MB)
- NVIDIA Stack:
- CUDA 12.6/12.7
- cuDNN (bundled with PyTorch)
- NVIDIA driver 566.36+
- File: `realistic_restaurant_reviews.csv`
- Format: CSV with columns:
  - `Title`: review title
  - `Review`: review text
  - `Rating`: 1-5
  - `Date`: e.g., “2023-10-01”
- Usage: loaded by `vector.py` for embeddings.