rooshikeshbhatt/Item-Inspector-AI

🕵️‍♂️ Item-Inspector AI

Python FastAPI PyTorch Transformers BLIP2 Ollama License AI Game Included

Visual AI for Product Condition Assessment & Human-like Reporting

Upload product images, let BLIP-2 understand the item, generate human-like condition reports with Phi-4, and enjoy the magic of zero-shot image-to-text reasoning. Also… there's a secret mini-game.

✨ Features

  • AI Product Recognition – Detects object type: Watch, Shoe, Phone, etc.
  • Material Identification – Metal, Leather, Glass, Suede? We got it.
  • Visual Condition Tags – Custom per-item labels (like “scratched glass” or “torn strap”).
  • Score Calculation – Evaluates product damage level and assigns a 4–10 score.
  • Natural Language Report – Uses Phi-4 LLM to describe condition in ~50 human-like words.
  • Frontend Upload UI – Drag, drop, analyze.
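
The README does not say how the score is derived. As a rough sketch only, one way to land in the 4–10 range is to start from a perfect score, subtract a penalty per condition tag, and clamp the result. The tag names, penalty weights, and the `condition_score` helper below are hypothetical and not taken from app.py.

```python
# Hypothetical penalties per condition tag; the values are illustrative only.
PENALTIES = {
    "scratched glass": 1.5,
    "torn strap": 2.0,
    "worn sole": 1.0,
}

MIN_SCORE, MAX_SCORE = 4.0, 10.0


def condition_score(tags: list[str]) -> float:
    """Start from a perfect score, subtract a penalty per tag, clamp to 4-10."""
    # Tags missing from the table get a small default penalty of 0.5.
    total_penalty = sum(PENALTIES.get(tag, 0.5) for tag in tags)
    return max(MIN_SCORE, min(MAX_SCORE, MAX_SCORE - total_penalty))


print(condition_score([]))                                 # no defects detected
print(condition_score(["scratched glass", "torn strap"]))  # two known defects
```

Clamping keeps heavily damaged items at the floor of the range instead of producing negative or out-of-range scores.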

📸 Banner

Banner


🗂 Project Structure

Item-Inspector AI/
├── backend/
│   ├── app.py               # FastAPI application
│   ├── requirements.txt
│   └── python_gpu_test.py   # Checks whether TensorFlow, PyTorch, and NumPy can use the GPU
├── frontend/
│   └── index.html           # Web UI for uploading images
├── sample_images/
│   └── example_watch.jpg    # Example test image
├── just_for_fun/
│   └── tic_tac_toe.py       # Tic-Tac-Toe AI game
└── README.md

🛠 Installation Guide

🔗 Prerequisites

  • Python 3.10+ (Python 3.10.11 recommended for GPU usage on Windows)
  • GitHub Desktop or Git CLI
  • Ollama installed, with the Phi-4 model (phi4:14b-q4_K_M) downloaded

📥 1. Clone the Repo

git clone https://github.com/rooshikeshbhatt/Item-Inspector-AI.git

cd Item-Inspector-AI/backend

📦 2. Create Virtual Environment

python -m venv venv
source venv/bin/activate  # Windows: venv\Scripts\activate

📦 3. Install Dependencies

pip install -r requirements.txt


🧠 4. Start Ollama with Phi-4

ollama run phi4:14b-q4_K_M
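
Once the model is available, the backend can reach it over Ollama's local HTTP API. The sketch below uses only the standard library and assumes Ollama is serving on its default port 11434. The prompt wording and the `build_request` / `generate_report` helpers are illustrative assumptions, not the code in app.py.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint
MODEL = "phi4:14b-q4_K_M"


def build_request(caption: str, tags: list[str]) -> urllib.request.Request:
    """Build a non-streaming generate request; the prompt wording is an assumption."""
    prompt = (
        "Write a condition report of about 50 words for this item.\n"
        f"Description: {caption}\n"
        f"Observed issues: {', '.join(tags) or 'none'}"
    )
    body = json.dumps({"model": MODEL, "prompt": prompt, "stream": False}).encode("utf-8")
    return urllib.request.Request(
        OLLAMA_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate_report(caption: str, tags: list[str]) -> str:
    """Send the request to a running Ollama server and return the generated text."""
    with urllib.request.urlopen(build_request(caption, tags), timeout=120) as resp:
        return json.loads(resp.read())["response"]
```

Setting "stream" to False makes Ollama return one JSON object, so the generated text can be read from its "response" field without handling chunked output.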

🚀 5. Launch FastAPI

uvicorn app:app --reload

Then open the interactive API docs at http://127.0.0.1:8000/docs


🌐 6. Use Web Interface (Optional)

Open frontend/index.html in your browser. Drag and drop product images.


⚡ Hardware & GPU Setup

If you're planning to run BLIP-2 on GPU for maximum performance, keep the following in mind:

✅ Hardware Requirements

  • NVIDIA GPU with at least 8–12 GB VRAM
    • Recommended: RTX 3060 or higher
  • CUDA-compatible drivers installed
    • Check GPU visibility with: nvidia-smi
  • Python 3.10+

✅ Python Environment for GPU

  • Install PyTorch with CUDA support:
    pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121
    
  • Our code already includes:
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    torch_dtype=torch.float16
    
    This ensures your models run on GPU if available.
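
One caveat with the two lines above: float16 is mainly a GPU optimization, and some CPU operators are slow or unsupported in half precision. A minimal sketch of pairing the dtype with the device follows. The `pick_device_and_dtype` helper is hypothetical and not part of app.py.

```python
def pick_device_and_dtype(cuda_available: bool) -> tuple[str, str]:
    """Return (device name, dtype name): half precision only when CUDA is present."""
    if cuda_available:
        return "cuda", "float16"
    # Some CPU operators are slow or unsupported in float16, so use float32 there.
    return "cpu", "float32"


try:
    import torch
except ImportError:  # PyTorch missing: the helper above is still usable on its own
    torch = None

if torch is not None:
    device_name, dtype_name = pick_device_and_dtype(torch.cuda.is_available())
    device = torch.device(device_name)
    torch_dtype = getattr(torch, dtype_name)  # e.g. torch.float16
    print(device, torch_dtype)
```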

✅ BLIP-2 Optimization Settings

  • Make sure BLIP-2 loads with:
    device_map="auto", torch_dtype=torch.float16
    
  • Images are correctly converted to RGB before inference:
    img = Image.open(file.file).convert("RGB")
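
Put together, loading BLIP-2 with these settings and captioning an image might look like the sketch below. It assumes the "Salesforce/blip2-opt-2.7b" checkpoint (the README does not name one), a CUDA GPU, and the accelerate package, which device_map="auto" relies on. The `load_blip2` and `caption_image` helpers are illustrative names.

```python
def load_blip2(checkpoint: str = "Salesforce/blip2-opt-2.7b"):
    """Load processor and model in half precision, letting Accelerate place the weights."""
    # Heavy imports are kept local so this module can be imported without them.
    import torch
    from transformers import Blip2ForConditionalGeneration, Blip2Processor

    processor = Blip2Processor.from_pretrained(checkpoint)
    model = Blip2ForConditionalGeneration.from_pretrained(
        checkpoint, device_map="auto", torch_dtype=torch.float16
    )
    return processor, model


def caption_image(path: str, processor, model) -> str:
    """Convert the image to RGB and generate a short caption."""
    import torch
    from PIL import Image

    img = Image.open(path).convert("RGB")  # drop alpha or palette modes
    inputs = processor(images=img, return_tensors="pt").to(model.device, torch.float16)
    output_ids = model.generate(**inputs, max_new_tokens=30)
    return processor.batch_decode(output_ids, skip_special_tokens=True)[0].strip()
```

A typical call would be `processor, model = load_blip2()` followed by `caption_image("sample_images/example_watch.jpg", processor, model)`. The first call downloads several gigabytes of weights.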
    

🧪 Verify GPU with Our Utility Script

Run the included python_gpu_test.py file to confirm whether TensorFlow, PyTorch, and NumPy are GPU-ready:

cd backend
python python_gpu_test.py

This script will print the detected GPUs, framework versions, and whether each is using the GPU or CPU.
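
For readers who want a quick check without the script, a comparable probe can be sketched as follows. This is not the contents of python_gpu_test.py, and the `gpu_report` helper is a hypothetical name. NumPy itself executes on the CPU, so the sketch only queries the two frameworks that expose GPU devices.

```python
def gpu_report() -> dict[str, str]:
    """Report, per framework, whether a GPU is visible or the import failed."""
    report = {}
    try:
        import torch
        report["torch"] = "GPU" if torch.cuda.is_available() else "CPU"
    except Exception as exc:  # missing package or broken CUDA install
        report["torch"] = f"unavailable ({type(exc).__name__})"
    try:
        import tensorflow as tf
        gpus = tf.config.list_physical_devices("GPU")
        report["tensorflow"] = "GPU" if gpus else "CPU"
    except Exception as exc:
        report["tensorflow"] = f"unavailable ({type(exc).__name__})"
    return report


for name, status in gpu_report().items():
    print(f"{name}: {status}")
```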


🤖 Bonus: Tic-Tac-Toe AI

When you need a break from debugging and BLIP-2 hallucinations:

cd just_for_fun
python tic_tac_toe.py

  • Supports easy, medium, and hard modes
  • Uses the Minimax algorithm in hard mode to destroy your confidence 🔥
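
For the curious, here is a generic minimax sketch for Tic-Tac-Toe. It illustrates the algorithm only and is not the code in tic_tac_toe.py. The board representation (a list of nine cells holding "X", "O", or " ") and the function names are assumptions.

```python
from typing import Optional

WIN_LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
             (0, 3, 6), (1, 4, 7), (2, 5, 8),
             (0, 4, 8), (2, 4, 6)]


def winner(board: list[str]) -> Optional[str]:
    """Return 'X' or 'O' if a line is complete, else None."""
    for a, b, c in WIN_LINES:
        if board[a] != " " and board[a] == board[b] == board[c]:
            return board[a]
    return None


def minimax(board: list[str], player: str) -> tuple[int, Optional[int]]:
    """Return (score, move) with `player` to move; +1 X wins, 0 draw, -1 O wins."""
    won = winner(board)
    if won:
        return (1 if won == "X" else -1), None
    moves = [i for i, cell in enumerate(board) if cell == " "]
    if not moves:
        return 0, None  # board full: draw
    # X maximizes the score, O minimizes it.
    best_score, best_move = (-2, None) if player == "X" else (2, None)
    for move in moves:
        board[move] = player  # try the move
        score, _ = minimax(board, "O" if player == "X" else "X")
        board[move] = " "     # undo it
        if (player == "X" and score > best_score) or (player == "O" and score < best_score):
            best_score, best_move = score, move
    return best_score, best_move
```

Because the search explores every continuation to the end of the game, an opponent using it never loses, which is why hard mode feels unbeatable.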

💡 Technology Stack

  • BLIP-2 (Salesforce) - Vision Language
  • Phi-4 (Ollama) - Language Generation
  • FastAPI - Backend Framework
  • HTML/JS - Minimal Frontend
  • Hugging Face Transformers
  • PyTorch

🏷️ GitHub Topics

ai, blip2, phi4, fastapi, transformers, computer-vision, image-classification, product-inspection, natural-language-generation, multimodal-ai, semantic-analysis, ecommerce-ai, repairtech, humanlike-ai, condition-scoring, pytorch, webapi, backend, frontend, python


📄 License

MIT: use it, share it, modify it. Just don’t forget to smile when it works.


✉️ Contact

Rooshikesh Bhatt rooshikeshbhatt@gmail.com